This is the code repository for bjorn_utils - a suite of miscellaneous tools that can be used to:
- prepare results and data files from SARS-CoV-2 sequencing analysis for release to public databases such as GISAID, Google Cloud, and GitHub
- Install Anaconda: instructions can be found here
- Create the `bjorn` environment
  `conda env create -f environment.yml -n bjorn_utils`
- Activate environment
  `conda activate bjorn_utils`
- Install datafunk (inside the activated environment): instructions (ensure environment is activated during installation)
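
If you expect to repeat the setup, the environment creation step can be wrapped so that it is safe to re-run. The following is a minimal sketch and not part of this repository: `env_exists` and `setup_env` are hypothetical helper names, and the check assumes that `conda env list` prints each environment name in the first column of its output.

```shell
#!/usr/bin/env bash
# Sketch only: create the bjorn_utils environment if it is not already listed.
# env_exists and setup_env are illustrative helpers, not repository code.
ENV_NAME="bjorn_utils"

# Succeeds if conda lists an environment with exactly this name.
env_exists() {
    conda env list | awk '{print $1}' | grep -qx "$1"
}

# Creates the environment from environment.yml unless it already exists.
setup_env() {
    if env_exists "$ENV_NAME"; then
        echo "Environment $ENV_NAME already exists"
    else
        conda env create -f environment.yml -n "$ENV_NAME"
    fi
}
```

After sourcing this file, call `setup_env` from the repository root and then activate the environment with `conda activate bjorn_utils` as described above.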
The current stable branch is "main"; follow all of the instructions below on that branch.
- Activate the `bjorn` environment
  `conda activate bjorn_utils`
- Open `run_alab_release.sh` to specify your parameters such as
  - filepath to sample sheet containing sample metadata (input)
  - filepath to updated metadata of samples that have already been uploaded
  - output directory where results are saved
  - number of CPU cores available for use
  - minimum coverage required for each sample (QC filter)
  - minimum average depth required for each sample (QC filter)
  - sequencing technology used
  - DEFAULT: test parameters
- Open `config.json` to specify your parameters such as
  - list of SARS-CoV-2 genes that are considered non-concerning
    - i.e. the occurrence of open reading frame (ORF) altering mutations can be accepted
    - e.g. ['ORF8', 'ORF10']
  - list of SARS-CoV-2 mutations that are considered non-concerning
    - i.e. the occurrence of `ORF8:Q27_` can be accepted (it occurs in the B.1.1.7 lineage)
    - e.g. ['ORF8:Q27_']
- Run the `run_alab_release.sh` script to initiate the data release pipeline
  `bash run_alab_release.sh`
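
Before starting a release, it can help to sanity-check the numeric parameters and confirm that the right conda environment is active. The sketch below is illustrative only: the variable names (`CPUS`, `MIN_COVERAGE`, `MIN_DEPTH`) and their values are placeholders chosen for this example and are not necessarily the names or defaults used in `run_alab_release.sh`. It relies on `CONDA_DEFAULT_ENV`, which conda sets when an environment is activated.

```shell
#!/usr/bin/env bash
# Sketch only: pre-flight checks before running the release pipeline.
# Variable names and values are placeholders, not the script's real settings.
CPUS=4             # number of CPU cores available for use
MIN_COVERAGE=95    # minimum coverage per sample (placeholder value)
MIN_DEPTH=200      # minimum average depth per sample (placeholder value)

# Succeeds only if the argument is a non-negative integer.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Prints one message per problem and returns the number of problems found.
check_params() {
    problems=0
    for pair in "CPUS=$CPUS" "MIN_COVERAGE=$MIN_COVERAGE" "MIN_DEPTH=$MIN_DEPTH"; do
        name=${pair%%=*}
        value=${pair#*=}
        if ! is_uint "$value"; then
            echo "$name must be a non-negative integer, got '$value'" >&2
            problems=$((problems + 1))
        fi
    done
    if [ "${CONDA_DEFAULT_ENV:-}" != "bjorn_utils" ]; then
        echo "conda environment bjorn_utils is not active" >&2
        problems=$((problems + 1))
    fi
    return "$problems"
}
```

With these helpers sourced, `check_params && bash run_alab_release.sh` would start the pipeline only when every check passes.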