ANNIE is a lightweight, Python-based desktop application designed for annotating text files with named entities and directed relations. Built with Tkinter, it offers a user-friendly interface for researchers, linguists, and NLP practitioners to create high-quality annotated datasets for named entity recognition (NER) and relation extraction tasks.
- Entity Annotation: Tag text spans with customizable labels (e.g., Person, Organization, Location).
- Relation Annotation: Define directed relations between entities (e.g., "spouse_of", "works_at").
- Batch Processing: Load and annotate multiple `.txt` files from a directory.
- Entity Propagation: Automatically annotate matching text spans across all files, with optional dictionary-based propagation.
- AI Pre-annotation: Use a pre-trained NER model (requires `transformers` and `torch`) for automated entity tagging.
- Entity Merging/Demerging: Merge multiple mentions of the same entity or separate them via right-click.
- Relation Flipping: Reverse the direction of relations with a single click.
- Multi-label & Overlapping Annotations: Optionally allow multiple tags or overlapping annotations.
- Session Management: Save and load annotation sessions to resume work.
- Export/Import: Support for CoNLL-2003 and spaCy JSONL formats for training data.
- Color-coded Visualization: Highlight entities with tag-specific colors; propagated entities are underlined.
- Read-only Text Area: Prevents accidental modifications.
- Hotkey Support: Use keys 0-9 for quick tag selection/relabeling and `a` for AI pre-annotation.
- Flexible Schema: Customize entity tags and relation types, with save/load functionality.
- Python: 3.6 or higher.
- Required Libraries: `tkinter` (included with Python), `json`, `os`, `shutil`, `pathlib`, `uuid`, `itertools`, `re`, `time`, `threading`.
- Optional Libraries: `transformers` and `torch` for AI pre-annotation (`pip install transformers torch`).
- Clone this repository or download the source code.
- Navigate to the project directory.
- Run the application: `python annie.py`
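Before launching, it can help to check whether the optional AI dependencies are importable. The snippet below is a minimal sketch and not part of ANNIE itself; the helper name `missing_optional_packages` is hypothetical.

```python
import importlib.util


def missing_optional_packages(packages=("transformers", "torch")):
    """Return the names of the given top-level packages that cannot be found."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_optional_packages()
    if missing:
        # Manual annotation does not need these; only AI pre-annotation does.
        print("AI pre-annotation unavailable; missing:", ", ".join(missing))
    else:
        print("AI pre-annotation dependencies found.")
```

If anything is reported missing, `pip install transformers torch` installs both packages.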
- Go to File → Open Directory and select a folder with `.txt` files.
- Files load in alphabetical order; the first file displays automatically.
- Use Previous/Next buttons or click a file in the listbox to navigate.
- Add files to the session via File → Add File(s) to Session....
- Drag to select text or double-click a word in the text area.
- Choose a tag from the Entity Tag dropdown or press 0-9 to select a tag.
- Click Annotate Sel to tag the selection.
- Entities appear in the Entities list and are highlighted with tag-specific colors.
- Single-click an annotated span to remove it, or select entities and click Remove Sel.
- Enable Extend to word to snap selections to word boundaries.
- Relabel entities by selecting them and pressing 0-9.
- Select exactly two entities in the Entities list (first = head, second = tail).
- Choose a relation type from the Relation Type dropdown.
- Click Add Relation (Head→Tail) to create the relation.
- Relations appear in the Relations list.
- Select a relation and click Flip H/T to reverse it or Remove Relation to delete it.
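As a rough illustration of what adding and flipping a relation means for the stored data, the sketch below works on plain dictionaries that use the field names from the annotation format (`id`, `type`, `head_id`, `tail_id`). The functions `add_relation` and `flip_relation` are hypothetical helpers, and generating IDs with `uuid` is an assumption about how ANNIE does it.

```python
import uuid


def add_relation(relations, head_id, tail_id, rel_type):
    """Append a directed relation (head -> tail) to the list and return it."""
    relation = {
        "id": uuid.uuid4().hex,  # assumption: any unique string works as an ID
        "type": rel_type,
        "head_id": head_id,
        "tail_id": tail_id,
    }
    relations.append(relation)
    return relation


def flip_relation(relation):
    """Reverse the direction of a relation in place by swapping head and tail."""
    relation["head_id"], relation["tail_id"] = relation["tail_id"], relation["head_id"]
    return relation
```

Flipping changes only the direction; the relation keeps its type and ID.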
- Go to File → Save Annotations to export annotations as a JSON file.
- Use File → Save Session to save the entire session (files, annotations, tags).
- Load sessions via File → Load Session....
- Use Settings → Manage Entity Tags... or Manage Relation Types... to add, remove, or edit tags/types.
- Save/load schemas via Settings → Save Tag/Relation Schema... or Load Tag/Relation Schema....
- Click Propagate Entities to copy entities from the current file to all files.
- Use Settings → Load Dictionary & Propagate Entities... to annotate from a dictionary file (format: `text tag`, e.g., `John Person`).
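The dictionary format can be read with a few lines of Python. This is a sketch under two assumptions that the README does not confirm: the tag is the last whitespace-separated token on each line (so multi-word entities such as `New York` stay intact), and matching is exact and case-sensitive. The helper names are hypothetical.

```python
import re


def parse_dictionary_line(line):
    """Split 'New York Location' into ('New York', 'Location').

    Returns None for blank lines or lines without both a text and a tag.
    """
    parts = line.strip().rsplit(None, 1)  # split once, from the right, on whitespace
    if len(parts) != 2:
        return None
    return parts[0], parts[1]


def load_dictionary(lines):
    """Build a {text: tag} mapping from an iterable of dictionary lines."""
    entries = {}
    for line in lines:
        parsed = parse_dictionary_line(line)
        if parsed is not None:
            text, tag = parsed
            entries[text] = tag
    return entries


def find_matches(document, entries):
    """Return sorted (start, end, text, tag) tuples for every exact match."""
    hits = []
    for text, tag in entries.items():
        for match in re.finditer(re.escape(text), document):
            hits.append((match.start(), match.end(), text, tag))
    return sorted(hits)
```

For example, `load_dictionary(["John Person", "New York Location"])` yields a mapping that `find_matches` can apply to the text of each file.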
- Select multiple entities and click Merge Sel. to assign them the same ID.
- Right-click an annotated span and select Demerge This Instance to assign a new ID.
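In terms of the stored data, merging and demerging only change the `id` field of the affected mentions. The sketch below illustrates the idea on plain dictionaries; keeping the first mention's ID when merging and using `uuid` for the new ID are assumptions, and both function names are hypothetical.

```python
import uuid


def merge_entities(entities):
    """Give every mention in the list the ID of the first one; return that ID."""
    if not entities:
        return None
    shared_id = entities[0]["id"]  # assumption: the first selected mention wins
    for entity in entities:
        entity["id"] = shared_id
    return shared_id


def demerge_entity(entity):
    """Detach one mention from its group by assigning it a fresh ID."""
    entity["id"] = uuid.uuid4().hex
    return entity["id"]
```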
- Press `a` or go to Settings → Pre-annotate with AI... to tag entities using a pre-trained NER model.
- Requires `transformers` and `torch`. Annotations are marked as propagated (underlined).
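For readers who want to see what such a pre-annotation step might look like outside the GUI, here is a minimal sketch using the Hugging Face `transformers` token-classification pipeline with the model named in the changelog. It is not ANNIE's actual implementation: the function names and the flat `start`/`end` character offsets are assumptions, and the model's own labels (e.g., `PER`) are passed through unmapped.

```python
def predictions_to_entities(text, predictions):
    """Convert aggregated pipeline output into simple entity dictionaries.

    Each prediction is expected to carry 'start', 'end' and 'entity_group',
    as produced with aggregation_strategy="simple".
    """
    return [
        {
            "start": int(p["start"]),
            "end": int(p["end"]),
            "text": text[int(p["start"]):int(p["end"])],
            "tag": p["entity_group"],
            "propagated": True,  # mirrors how AI annotations are flagged
        }
        for p in predictions
    ]


def pre_annotate(text, model_name="Babelscape/wikineural-multilingual-ner"):
    """Run the NER model on text; downloads the model on first use."""
    # Imported lazily so the rest of the module works without transformers/torch.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_name,
        aggregation_strategy="simple",
    )
    return predictions_to_entities(text, ner(text))
```

Calling `pre_annotate("John Smith works at Acme.")` requires both optional packages and network access for the initial model download.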
- Export annotations via File → Export for Training... in CoNLL or JSONL format.
- Import annotations via File → Import Annotations... from CoNLL or JSONL files, creating new `.txt` files.
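To give a feel for what a CoNLL-style export contains, the sketch below turns character-offset spans into token-level BIO labels. It is a simplified illustration, not ANNIE's exporter: it assumes whitespace tokenization, end-exclusive offsets, and a two-column `token label` layout rather than the full four-column CoNLL-2003 format. The function names are hypothetical.

```python
import re


def to_bio(text, spans):
    """Label whitespace-separated tokens with BIO tags.

    spans is a list of (start, end, tag) character offsets, end exclusive.
    Returns a list of (token, label) pairs.
    """
    rows = []
    for match in re.finditer(r"\S+", text):
        label = "O"
        for start, end, tag in spans:
            # A token belongs to a span if the two character ranges overlap.
            if match.start() < end and match.end() > start:
                prefix = "B-" if match.start() <= start else "I-"
                label = prefix + tag
                break
        rows.append((match.group(), label))
    return rows


def to_conll_lines(text, spans):
    """Render the BIO rows as 'token label' lines."""
    return [f"{token} {label}" for token, label in to_bio(text, spans)]
```

Because tokens are split on whitespace only, trailing punctuation stays attached to the preceding token (for example `Acme.`); a real exporter may tokenize differently.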
- Enable Settings → Allow Multi-label & Overlapping Annotations to permit overlapping tags.
Annotations are stored in JSON format:
{
  "file1.txt": {
    "entities": [
      {
        "id": "a1b2c3...",
        "start_line": 1,
        "start_char": 10,
        "end_line": 1,
        "end_char": 20,
        "text": "John Smith",
        "tag": "Person",
        "propagated": false
      }
    ],
    "relations": [
      {
        "id": "d4e5f6...",
        "type": "works_at",
        "head_id": "a1b2c3...",
        "tail_id": "g7h8i9..."
      }
    ]
  }
}
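A saved annotation file in this shape can be consumed with the standard `json` module. The sketch below resolves each relation's `head_id` and `tail_id` to the entity text; the sample data, its short IDs, and the function name `describe_relations` are illustrative and not taken from ANNIE.

```python
import json

# Illustrative data in the documented shape; the IDs are made up for the example.
SAMPLE = """
{
  "file1.txt": {
    "entities": [
      {"id": "e1", "start_line": 1, "start_char": 10, "end_line": 1,
       "end_char": 20, "text": "John Smith", "tag": "Person", "propagated": false},
      {"id": "e2", "start_line": 1, "start_char": 30, "end_line": 1,
       "end_char": 34, "text": "Acme", "tag": "Organization", "propagated": false}
    ],
    "relations": [
      {"id": "r1", "type": "works_at", "head_id": "e1", "tail_id": "e2"}
    ]
  }
}
"""


def describe_relations(annotations):
    """Return one readable line per relation whose head and tail both resolve."""
    lines = []
    for filename, data in annotations.items():
        # Merged mentions share an ID, so this keeps the last mention per ID.
        by_id = {entity["id"]: entity for entity in data.get("entities", [])}
        for relation in data.get("relations", []):
            head = by_id.get(relation["head_id"])
            tail = by_id.get(relation["tail_id"])
            if head is None or tail is None:
                continue  # skip relations that point at unknown entities
            lines.append(
                f'{filename}: {head["text"]} --{relation["type"]}--> {tail["text"]}'
            )
    return lines


if __name__ == "__main__":
    for line in describe_relations(json.loads(SAMPLE)):
        print(line)
```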
Session files include additional metadata:
{
  "version": "1.12",
  "files_list": ["file1.txt", "file2.txt"],
  "current_file_index": 0,
  "entity_tags": ["Person", "Organization", ...],
  "relation_types": ["spouse_of", "works_at", ...],
  "tag_colors": {"Person": "#ffcccc", ...},
  "annotations": {...},
  "extend_to_word": true,
  "allow_multilabel_overlap": true
}
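Since a session stores file names in `files_list`, a script can check which of them still exist before the session is opened. This is a minimal sketch; whether ANNIE stores absolute or relative paths is not stated here, so the optional `base_dir` parameter and the function name are assumptions.

```python
import json
from pathlib import Path


def split_session_files(session, base_dir=None):
    """Split the session's files_list into (present, missing) lists."""
    present, missing = [], []
    for name in session.get("files_list", []):
        # Relative names are resolved against base_dir when one is given.
        path = Path(base_dir, name) if base_dir is not None else Path(name)
        (present if path.is_file() else missing).append(name)
    return present, missing


def load_session(session_path):
    """Read a session file and report which listed files are missing."""
    with open(session_path, encoding="utf-8") as handle:
        session = json.load(handle)
    present, missing = split_session_files(session, Path(session_path).parent)
    return session, present, missing
```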
- Hotkeys: Use 0-9 to select/relabel tags, `a` for AI pre-annotation, and `Delete` to remove entities/relations.
- Navigation: Click column headers to sort Entities/Relations lists; type a letter to jump to matching items.
- Workflow: Annotate entities first, then relations; propagate entities early to save time.
- Dictionary Format: Use one entity per line (e.g., `New York Location`).
- Double-click: Selects a word for quick annotation.
- Read-only Text: Ensures no accidental edits; use mouse or hotkeys for actions.
- AI Pre-annotation Fails: Install `transformers` and `torch`; ensure a file is loaded.
- Missing Files: Session loading warns about missing files; continue with available ones.
- Overlap Issues: Enable multi-label support in Settings for overlapping annotations.
- Highlighting Issues: Switch files to refresh the display.
- Export Errors: Check write permissions and use `.conll` or `.jsonl` extensions.
- 1.12 (2025):
  - Added AI pre-annotation with `Babelscape/wikineural-multilingual-ner`.
  - Implemented multi-label and overlapping annotation support.
  - Added demerge functionality via right-click.
  - Made text area read-only to prevent accidental edits.
  - Improved propagation with whitespace handling and underlining for propagated entities.
  - Enhanced double-click/highlight annotation and single-click removal.
  - Added import/export for CoNLL and spaCy JSONL formats.
- 0.75: Double-click and highlight annotation, immutable text area.
- 0.70: Propagated entities flagged and underlined.
- 0.65: Entity search and sorting.
- 0.60: Session save/load for continuous work.
Kovács, T. (2025). ANNIE: Annotation Interface for Named-entity & Information Extraction (Version 1.12) [Computer software]. GitHub. https://github.com/kreeedit/ANNIE
@software{Kovacs_ANNIE_2025,
  author = {Kovács, Tamás},
  title = {{ANNIE: Annotation Interface for Named-entity & Information Extraction}},
  version = {1.12},
  publisher = {Zenodo},
  year = {2025},
  doi = {10.5281/zenodo.15805548},
  url = {https://github.com/kreeedit/ANNIE}
}
Apache 2.0