DAML User Guide
Data Annotator for Machine Learning (DAML) is designed to enable an end-to-end data annotation process for common data types. This high-level user guide covers the key features of DAML:
- From the Projects tab, click Create New Annotation Project and choose the project type.
- Supported project types are:
- text classification
- tabular
- named entity recognition (NER)
- log classification
- image classification
- Depending on the annotation project type, you will be asked different project setup questions. In general, you need to provide a project name, upload data, define label values, configure active learning, and assign the project to annotators by email. Here, we show the project setup for an NER project:
- INSERT_NER_SETUP_IMAGE
- Click Create to complete the project setup.
- You will receive an email notification confirming the project creation, and the project will appear in the Projects tab.
- Annotators will receive an email link to join the project and start annotating.
- From the Annotate tab, click START on the project of your choice. This example will use an NER project:
- INSERT_NER_PICTURE
- The left-hand side menu (which can be toggled to hide) contains the following:
- Projects selector to switch between projects
- Project info including annotation instructions from the Project Owner
- Your Progress on the current annotation project
- A history of your labels in this session
- On the right-hand side, you are presented with the Original Ticket, which is one entry from the overall project
- The flag icon next to this entry allows the annotator to send the entry to the Project Owner to review for fit (e.g., the entry might not fit the current set of labels or might be bad data)
- In an NER project, the annotator selects the entity (one of the buttons) and then clicks text in the entry to highlight it
- Note: a single click annotates the clicked word; you can also select a span of text to annotate it as this entity
- INSERT_IMAGE_NER_ANNOTATION
- At any time, you may skip the current entry or return to a previous entry
- Click EXIT at any time to stop annotating. Your progress is saved automatically, so you can resume later
In the Projects tab, click the name of a project to view its overall progress:
- On the top you will see overall project details in addition to two charts:
- # Annotations Per User
- # Annotations Per Category
- Underneath the charts, you will see two tabs:
- Annotations tab which presents all currently annotated examples in a table format for your review
- Flag tab which presents all examples flagged by users for review. For a flagged ticket, you have two options:
- Delete the example from the project. This permanently removes the example from the dataset.
- Silence the flag. This returns the example to the pool, where it will be shown to annotators again.
- For projects with Active Learning support, you will see an additional Active Learning tab showing the accuracy over time of the models used to query annotators
Data Management: Input and export formats follow best practices to enable seamless integration with ML frameworks. Data can be shared with annotators, project owners, and service users. Data is uploaded in its original format, extracted for a particular project, and retained for N days. At any time, you can append new datasets to your annotation projects.
Active Learning: Active learning continuously trains an ML model on the most recently annotated data and uses that model to query annotators for labels on the entries that matter most. This reduces the amount of labeled data needed to reach a similar accuracy.
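The query loop described above can be illustrated with a minimal pool-based sketch. It is not DAML's implementation: the toy centroid model, the entropy-based uncertainty score, and every function name here (`fit_centroids`, `select_queries`, `active_learning_loop`) are assumptions chosen only to show the idea of retraining on new labels and asking annotators about the most uncertain entries.

```python
import math

def fit_centroids(labeled):
    """Toy model (an assumption): mean feature value per label.
    `labeled` is a list of (feature, label) pairs."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_proba(centroids, x):
    """Turn distances to each centroid into a probability per label."""
    scores = {label: math.exp(-abs(x - c)) for label, c in centroids.items()}
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

def entropy(probs):
    """Shannon entropy; higher means the model is less certain."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)

def select_queries(centroids, pool, batch_size):
    """Pick the unlabeled entries the current model is least sure about."""
    ranked = sorted(pool,
                    key=lambda x: entropy(predict_proba(centroids, x)),
                    reverse=True)
    return ranked[:batch_size]

def active_learning_loop(labeled, pool, oracle, rounds, batch_size):
    """Retrain, query the annotator (`oracle`) on uncertain entries, repeat."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        centroids = fit_centroids(labeled)
        for x in select_queries(centroids, pool, batch_size):
            labeled.append((x, oracle(x)))  # the annotator supplies the label
            pool.remove(x)
    return fit_centroids(labeled), labeled
```

In this sketch the `oracle` callable stands in for a human annotator. Entries near the decision boundary are queried first, which is why fewer labels may be needed than with random sampling.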
Using the DAML API: DAML provides a set of common APIs to manage your data annotation projects. A Swagger UI is available at /api-docs/ for interactive exploration. You can plug in your own ML models as annotators using the APIs.
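As a rough illustration of using a model as an annotator, the sketch below wraps a prediction function and produces one label record per entry. This guide does not document the API endpoints or payload schema, so the record fields (`entry_id`, `label`, `annotator`) and the helper names are hypothetical; consult the Swagger UI at /api-docs/ on your own deployment for the actual request shapes before sending anything.

```python
from dataclasses import dataclass, asdict
from urllib.parse import urljoin

@dataclass
class ModelAnnotation:
    """Hypothetical record of one model-produced label (not DAML's schema)."""
    entry_id: str
    label: str
    annotator: str

def swagger_ui_url(base_url):
    """Build the Swagger UI address from the base URL of a DAML deployment."""
    return urljoin(base_url.rstrip("/") + "/", "api-docs/")

def annotate_with_model(entries, predict, annotator_name):
    """Run `predict` over each entry's text and collect label records.
    `entries` is assumed to be a list of dicts with "id" and "text" keys."""
    return [
        asdict(ModelAnnotation(entry_id=e["id"],
                               label=predict(e["text"]),
                               annotator=annotator_name))
        for e in entries
    ]
```

The resulting records would still need to be mapped onto whatever request format the real API expects; that step is deliberately left out here.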