Enterovirus A71 Nextstrain Analysis

This repository provides a comprehensive Nextstrain analysis of Enterovirus A71. You can choose to perform either a VP1 run (>=600 base pairs) or a whole genome run (>=6400 base pairs).
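As an illustration of the two length thresholds above, here is a minimal Python sketch. The function name `meets_min_length` and the choice to count the raw sequence length are assumptions for illustration only; the pipeline itself may count bases differently (for example, by ignoring ambiguous characters).

```python
# Minimum lengths (in base pairs) as stated in this README.
MIN_LENGTH = {"vp1": 600, "whole_genome": 6400}


def meets_min_length(sequence: str, run: str) -> bool:
    """Return True if the sequence is long enough for the given run.

    Assumption: every character of `sequence` counts towards the length.
    """
    return len(sequence) >= MIN_LENGTH[run]


if __name__ == "__main__":
    # A made-up 650-base sequence, used only to show the two thresholds.
    example = "ACGT" * 162 + "AC"
    for run in MIN_LENGTH:
        print(run, meets_min_length(example, run))
```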

For those unfamiliar with Nextstrain or needing installation guidance, please refer to the Nextstrain documentation.

Enhancing the Analysis

This analysis would benefit from additional metadata, such as patient age, spatial data, and clinical outcomes. If you have relevant data and are willing to share, please contact us.

The data for this analysis is available from NCBI Virus. Instructions for downloading sequences are provided under Sequences.

Repository Organization

This repository includes the following directories and files:

scripts: Custom Python scripts called by the snakefile.
snakefile: The entire computational pipeline, managed using Snakemake. See the Snakemake documentation for details.
ingest: Contains Python scripts and the snakefile for automatic downloading of EV-A71 sequences and metadata.
vp1: Sequences and configuration files for the VP1 run.
whole_genome: Sequences and configuration files for the whole genome run.

Configuration Files

The config, vp1/config, and whole_genome/config directories contain the necessary configuration files:

colors.tsv: Color scheme
geo_regions.tsv: Geographical locations
lat_longs.tsv: Latitude and longitude data
dropped_strains.txt: Dropped strains
clades_genome.tsv: Virus clade assignments
reference_sequence.gb: Reference sequence
auspice_config.json: Auspice configuration file

The reference sequence used is BrCr, accession number U22521, sampled in 1970.
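Before starting a build, it can help to confirm that the configuration files listed above are in place. The sketch below is a hypothetical helper, not part of this repository. This README does not say which files belong in which of the three directories, so the expected list is a parameter you can adjust.

```python
from pathlib import Path

# File names taken from the list in this README.
EXPECTED_CONFIG_FILES = [
    "colors.tsv",
    "geo_regions.tsv",
    "lat_longs.tsv",
    "dropped_strains.txt",
    "clades_genome.tsv",
    "reference_sequence.gb",
    "auspice_config.json",
]


def missing_config_files(config_dir, expected=None):
    """Return the expected file names that are not present in config_dir."""
    names = EXPECTED_CONFIG_FILES if expected is None else expected
    directory = Path(config_dir)
    return [name for name in names if not (directory / name).is_file()]


if __name__ == "__main__":
    # Directory names as given in this README; run from the repository root.
    for config_dir in ("config", "vp1/config", "whole_genome/config"):
        print(config_dir, "missing:", missing_config_files(config_dir))
```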

Quickstart

Setup

Nextstrain Environment

Install the Nextstrain environment by following the installation instructions in the Nextstrain documentation.

Running a Build

Activate the Nextstrain environment:

micromamba activate nextstrain

To perform a build, run:

snakemake --cores 9 all

For specific builds:

VP1 build:

snakemake auspice/ev_a71_vp1.json --cores 9

Whole genome build:

snakemake auspice/ev_a71_whole-genome.json --cores 9

For tanglegrams, the build can be run on sub-alignments of the whole genome alignment, either for the individual genes or for the proteins P1, P2, and P3.

Gene build:

snakemake all_genes --cores 9

Protein build:

snakemake all_proteins --cores 9

Note

Version of augur: augur 27.0.0
Version of auspice: auspice 2.59.1

Ingest

For more information on how to run the ingest, please refer to the README in the ingest folder.

Visualizing the Build

To visualize the build, use Auspice:

auspice view --datasetDir auspice

To run two visualizations simultaneously, you may need to set the port:

export PORT=4001

Sequences

Sequences can be downloaded manually or automatically.

Manual Download: Visit NCBI Virus, search for EV-A71 or Taxid 39054, and download the sequences.
Automated Download: The ingest functionality, included in the main snakefile, handles automatic downloading.

The ingest pipeline is based on the Nextstrain RSV ingest workflow. Running the ingest pipeline produces data/metadata.tsv and data/sequences.fasta.
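After an ingest run, one quick sanity check is whether every sequence in data/sequences.fasta has a matching row in data/metadata.tsv. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the metadata ID column is assumed to be called `accession`, which this README does not confirm. Pass a different `id_column` if your metadata uses another name.

```python
import csv
import io


def fasta_ids(fasta_text: str) -> set[str]:
    """Collect the first whitespace-delimited token of every FASTA header."""
    ids = set()
    for line in fasta_text.splitlines():
        header = line[1:].strip() if line.startswith(">") else ""
        if header:
            ids.add(header.split()[0])
    return ids


def metadata_ids(metadata_text: str, id_column: str = "accession") -> set[str]:
    """Collect the values of id_column from tab-separated metadata."""
    reader = csv.DictReader(io.StringIO(metadata_text), delimiter="\t")
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise ValueError(f"column {id_column!r} not found in metadata")
    return {row[id_column] for row in reader if row[id_column]}


def compare_ids(fasta_text: str, metadata_text: str, id_column: str = "accession"):
    """Return (IDs only in the FASTA, IDs only in the metadata)."""
    seq = fasta_ids(fasta_text)
    meta = metadata_ids(metadata_text, id_column)
    return seq - meta, meta - seq


if __name__ == "__main__":
    # Paths as stated in this README; run from the repository root.
    with open("data/sequences.fasta") as fasta, open("data/metadata.tsv") as meta:
        only_fasta, only_meta = compare_ids(fasta.read(), meta.read())
    print("Sequences without metadata:", sorted(only_fasta))
    print("Metadata rows without a sequence:", sorted(only_meta))
```

Reading both files fully into memory keeps the sketch short; for very large downloads, a streaming approach would be more appropriate.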

Feedback

For questions or comments, contact me via GitHub or by email at nadia.neuner-jehle@swisstph.ch.

To Do:

Overwrite NCBI Virus metadata with "corrected" collection dates
Replace parse_date with augur curate
Provide a way to create and use "local" accession numbers for sequences not yet on GenBank
Update symptom list
Get update_strain.sh to work
