generated from amosproj/amos202Xss0Y-projname
Open
Labels
User Story
Description
User story
- As a data engineer
- I want/need to prepare an annotated dataset for NER training
- So that the NER model can be trained on accurately tagged data
Acceptance criteria
- Select a suitable number of Q&A pairs from HuggingFace
  - Start with 50-100 Q&A pairs
- Optional: Use a tool such as Doccano to tag entities according to the defined list
- Store the NER training dataset
  - Upload the NER-annotated data to a separate directory on HuggingFace
  - Ensure the annotated dataset can be used to train the NER model
- The list should contain objects such as:
  - Entity types: Project_Name, Technology_Name, (Organization_Name), ...
  - Entities: Kubernetes, Docker, gRPC, ...
  - Relationships: Depends_On, Complements, (Conflicts_with), ...
- Here is an example:
  - "Example Text"
  - Project_Name: Kubernetes, ...
  - Technology_Name: Docker, gRPC, ...
  - (Organization_Name: Google, Red Hat, ...)
  - Relationship: ...
  - "Example Text"
- Store the list in a format that can be used for NER model training
- This part of the work does not have to be automated, but it can be
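The storage criterion above could be met with a JSON Lines file in which each record holds one text and its tagged character spans. The following is a minimal sketch, not the project's actual format: the `annotate` and `to_jsonl` helpers, the `text`/`label` field names, and the sample sentence are all assumptions made for illustration. The record shape is similar to what sequence-labeling tools such as Doccano can export, but the exact field names should be checked against the tool version in use.

```python
import json

# Entity types taken from the list in this story; extend as the list grows.
ENTITY_TYPES = {"Project_Name", "Technology_Name", "Organization_Name"}


def annotate(text, entities):
    """Build one record with [start, end, label] character spans.

    `entities` maps an entity type to the surface strings to tag,
    e.g. {"Technology_Name": ["Docker", "gRPC"]}.
    """
    spans = []
    for label, names in entities.items():
        if label not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {label}")
        for name in names:
            # Record every occurrence of the surface string in the text.
            start = text.find(name)
            while start != -1:
                spans.append([start, start + len(name), label])
                start = text.find(name, start + len(name))
    spans.sort()  # order spans by start offset
    return {"text": text, "label": spans}


def to_jsonl(records):
    """Serialize records as JSON Lines, one annotated text per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Hypothetical sample sentence, not taken from the Q&A dataset.
record = annotate(
    "Kubernetes schedules Docker containers and exposes gRPC services.",
    {"Project_Name": ["Kubernetes"], "Technology_Name": ["Docker", "gRPC"]},
)
print(to_jsonl([record]))
```

The plain substring search also matches entity names that occur inside longer words, so automatically generated spans should still be reviewed by hand. Relationships such as Depends_On are not covered by this sketch and would need an additional field.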
Definition of done (DoD)
- Bill of Materials in the planning document has been updated
- All feature branches have been merged and closed
- New feature code has been documented
- Potential new licenses have been checked
- All GitHub Actions are passing
- The requirements.txt is updated
DoD general criteria
- Feature has been fully implemented
- Feature has been merged into the mainline
- All acceptance criteria were met
- Product owner approved features
- All tests are passing
- Developers agreed to release
Metadata
Projects
Status
Product Backlog