viral_usher

viral_usher is a command-line tool to set up and run a pipeline to build an UShER tree for a new viral species (or type, subtype, etc.) using genomes downloaded from NCBI.


🔧 Features

  • Subcommands:
    • init: Generate a config file (interactive or via command line options)
    • build: Download sequences and build a tree, guided by the config file
  • Uses Docker for portability to laptops, servers, or cloud platforms

📦 Installation

  1. Install prerequisites (if not already installed)
  2. Install with pip (we highly recommend using an environment manager):
    pip install viral_usher
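
Because viral_usher relies on Docker, it can be useful to confirm that the required commands are on your PATH before running a build. The following is a minimal sketch, not part of viral_usher; the helper name `require_cmd` is hypothetical.

```shell
# Hypothetical helper (not part of viral_usher): succeed only if the
# named command can be found on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 found"
    else
        echo "missing: $1 is not on PATH" >&2
        return 1
    fi
}
```

For example, `require_cmd docker && require_cmd viral_usher` checks both tools in one line. Finding the `docker` command does not guarantee that the Docker daemon is running; `docker info` is one way to check that as well.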

🚀 Quickstart

Create a config file with viral_usher init

If you want to start by just naming a virus, and let viral_usher interactively help you identify the right reference sequence, Taxonomy ID etc., then simply run

viral_usher init

and reply to the prompts.

Alternatively, if you already know your parameters, you can skip the interactive prompts by passing command line options. Run viral_usher --help to get a listing of options. The following example sets up a tree for the Chikungunya virus using RefSeq NC_004162.2, all genomes available from GenBank for the Taxonomy ID associated with NC_004162.2 (Taxonomy ID 37124), plus additional sequences from example/hypothetical_chikungunya.fasta (in this repository):

git clone https://github.com/AngieHinrichs/viral_usher.git
cd viral_usher
viral_usher init \
    --refseq NC_004162.2 \
    --workdir chikungunya \
    --fasta example/hypothetical_chikungunya.fasta \
    --config chikungunya/config.toml

Build a tree from the config file with viral_usher build

Continuing the Chikungunya virus example:

viral_usher build --config chikungunya/config.toml

That's all! viral_usher will create the following files in workdir (chikungunya in our example):

  • a tree in UShER protobuf format (optimized.pb.gz)
  • a metadata file in TSV format (metadata.tsv.gz)
  • a Taxonium tree file that you can view using https://taxonium.org/ (tree.jsonl.gz)
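
After a build, a quick check that the three files listed above exist and are non-empty can catch a failed or interrupted run. This is a minimal sketch, assuming the files sit directly in workdir under the names given above; the helper name `check_outputs` is hypothetical and not part of viral_usher.

```shell
# Hypothetical helper (not part of viral_usher): report which of the three
# documented output files exist and are non-empty in the given workdir.
# Returns 0 only if all three are present.
check_outputs() {
    workdir="$1"
    status=0
    for f in optimized.pb.gz metadata.tsv.gz tree.jsonl.gz; do
        if [ -s "$workdir/$f" ]; then
            echo "found   $workdir/$f"
        else
            echo "missing $workdir/$f"
            status=1
        fi
    done
    return "$status"
}
```

In the Chikungunya example this would be called as `check_outputs chikungunya`. To peek at the first few metadata rows, `gzip -dc chikungunya/metadata.tsv.gz | head -n 5` should work on most systems.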

To view the example Chikungunya virus tree in Taxonium, click here. Type or copy-paste "hypothetical" into Taxonium's Name search input to find the sequences from example/hypothetical_chikungunya.fasta.


🧪 Development

# Clone the repo
git clone https://github.com/AngieHinrichs/viral_usher.git
cd viral_usher

# Install dev dependencies
# (quoted so that shells such as zsh do not expand the brackets)
pip install -e ".[dev]"

# Run tests
pytest
