sys-bio/Antotate

 
 

Antotate

Antotate is a Python-based tool that automatically appends standardized metabolite annotations to Antimony-formatted biochemical models. By linking each species to curated identifiers from biochemical databases, Antotate improves model clarity, interoperability, and reuse in systems and synthetic biology workflows.

Purpose

Metabolic models often include non-standardized species names that can cause ambiguity or hinder integration with visualization and simulation tools. Antotate resolves this by:

  • Mapping species to display names and unique identifiers (e.g., KEGG compound IDs),
  • Appending this information directly to the Antimony model file,
  • Generating a confidence score for each annotation,
  • Supporting multiple biochemical databases.

This tool is ideal for researchers preparing models for publication, exchange, or downstream analysis with tools like SBMLNetwork or Tellurium.


Requirements

Antotate is written in Python (≥3.9) and uses standard libraries. To run it, you need:

  1. Python version 3.9 or later
  2. Jupyter Notebook (if running interactively)

To install required Python packages:

pip install -r requirements.txt

How to Run

You can run Antotate either within Python or directly from the command line.

Option 1: Within Python

from antotate import Annotate

annotator = Annotate()
annotator.annotate('cell_free.txt', databases='kegg')
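
When several models need the same treatment, the call above can be wrapped in a small loop. This is a minimal sketch: `annotate_many` is a hypothetical helper, not part of Antotate, and it relies only on the `annotate(path, databases=...)` call shown above.

```python
from typing import Iterable, List


def annotate_many(annotator, model_paths: Iterable[str], databases: str = "kegg") -> List[str]:
    """Call annotator.annotate() on each model file and return the paths that failed."""
    failed = []
    for path in model_paths:
        try:
            # Same call as in the example above, repeated once per file.
            annotator.annotate(path, databases=databases)
        except Exception as exc:
            # Keep going so that one problematic model does not stop the batch.
            print(f"Annotation failed for {path}: {exc}")
            failed.append(path)
    return failed


# Usage (assumes antotate.py is importable from the working directory):
# from antotate import Annotate
# failed = annotate_many(Annotate(), ["cell_free.txt"], databases="kegg")
```

Passing the annotator in as an argument keeps the helper independent of how `Annotate` is constructed.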

Option 2: From Terminal

python antotate.py 'cell_free.txt' --databases kegg

You can specify one or more databases:

  • kegg
  • bigg.metabolite
  • chebi
  • hmdb
  • metacyc.compound
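
Checking database names before a long run can catch typos early. The sketch below is illustrative only: `SUPPORTED_DATABASES` and `check_databases` are hypothetical names, the set simply mirrors the list above, and Antotate's own validation (if any) may behave differently.

```python
# Hypothetical constant: mirrors the database list in this README.
SUPPORTED_DATABASES = {"kegg", "bigg.metabolite", "chebi", "hmdb", "metacyc.compound"}


def check_databases(names):
    """Return the names as a list if all are supported; otherwise raise ValueError."""
    names = list(names)
    unknown = [name for name in names if name not in SUPPORTED_DATABASES]
    if unknown:
        raise ValueError(f"Unsupported database(s): {', '.join(unknown)}")
    return names
```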

Outputs

Antotate produces two outputs:

  1. Annotated Antimony file (e.g., cell_free_kegg.txt)

    • Appends display names and identifiers beneath the original reaction network.
    • Format example:
      Serine is "SER";
      Serine identity "http://identifiers.org/kegg/C00716";
      
  2. Confidence metrics CSV (e.g., confidence_metrics.csv)

    • Summarizes the mapping for each species.
    • Includes the original name, assigned display name, matched identifier, and a confidence score from 0 (low) to 1 (high).
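
To make the appended format from the first output concrete, the two statements for one species can be assembled as below. This is a sketch rather than Antotate's actual code: `annotation_lines` is a hypothetical function, and it assumes the identifier URL follows the `http://identifiers.org/<database>/<id>` pattern seen in the KEGG example. Other databases may use a different pattern.

```python
def annotation_lines(species: str, display_name: str, database: str, identifier: str) -> str:
    """Format the display-name and identity statements for one species."""
    # Assumption: the URL layout matches the KEGG example in this README.
    url = f"http://identifiers.org/{database}/{identifier}"
    return f'{species} is "{display_name}";\n{species} identity "{url}";'


print(annotation_lines("Serine", "SER", "kegg", "C00716"))
```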

Repository Contents

  • antotate.py – Main script for annotation.
  • requirements.txt – Required dependencies.
  • Example files:
    • cell_free.txt – Example Antimony model.
    • cell_free_kegg.txt – Output with appended annotations.
    • confidence_metrics.xlsx – Annotation summary. Antotate originally created this file as a CSV; it was later saved as a spreadsheet with an additional tab showing the manual changes we made to correct the automated annotations.

Example

Input Antimony line:

R406 : 1 Serine + hEC43117 -> hEC43117 + 1 NH3 + 1 Pyr;

Appended by Antotate:

Serine is "SER";
Serine identity "http://identifiers.org/kegg/C00716";
NH3 is "AMMONIA";
NH3 identity "http://identifiers.org/kegg/C01342";
Pyr is "PYRUVATE";
Pyr identity "http://identifiers.org/kegg/C00022";
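
For downstream checks, the appended statements can be read back into a dictionary. The following is a minimal sketch: `read_annotations` and both regular expressions are hypothetical, and they cover only the two statement forms shown above, not the full Antimony grammar.

```python
import re

# Matches: <species> is "<display name>";
IS_RE = re.compile(r'^\s*(\w+)\s+is\s+"([^"]*)"\s*;')
# Matches: <species> identity "<url>";
IDENTITY_RE = re.compile(r'^\s*(\w+)\s+identity\s+"([^"]*)"\s*;')


def read_annotations(antimony_text):
    """Return {species: {"display_name": ..., "identity": ...}} from annotation lines."""
    result = {}
    for line in antimony_text.splitlines():
        match = IS_RE.match(line)
        if match:
            result.setdefault(match.group(1), {})["display_name"] = match.group(2)
            continue
        match = IDENTITY_RE.match(line)
        if match:
            result.setdefault(match.group(1), {})["identity"] = match.group(2)
    return result
```

Lines that match neither pattern, such as the reaction line itself, are ignored.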

Tips & Best Practices

  • Start with meaningful species names to improve annotation accuracy.
  • Review the confidence_metrics.csv file to verify mappings.
  • Use multiple databases for broader coverage.
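
Reviewing the metrics file can start with the lowest-scoring rows. The sketch below uses only Python's `csv` module. `low_confidence_rows` is a hypothetical helper, the 0.5 threshold is arbitrary, and the score column name must be supplied by the caller because the CSV header names are not documented here.

```python
import csv


def low_confidence_rows(handle, score_column, threshold=0.5):
    """Return the CSV rows whose score in `score_column` is below `threshold`.

    `handle` is an open text file (or any iterable of CSV lines).
    Assumes every row has a numeric value in the score column.
    """
    reader = csv.DictReader(handle)
    return [row for row in reader if float(row[score_column]) < threshold]


# Usage (replace "score" with the actual header of the confidence column):
# with open("confidence_metrics.csv", newline="") as handle:
#     for row in low_confidence_rows(handle, "score"):
#         print(row)
```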
