Description
We have a subset of 60 samples from Karthikeyan et al. that show a transition between delta and omicron variants.
To run the analysis live in a workshop, the pipeline needs to finish in under 1 h.
If that is not possible, we may instead need to provide a preprocessed folder with results from 60 samples and have participants process only 5-10 samples during the workshop.
- Test how long it takes to run 30 samples. Locally, this ran for over 2 h without finishing (we killed the process).
- Test running only 5 samples with default options (see comment below).
- Test using `--freyja_repeats 0` with viralrecon on a small number of samples, to see if it's possible to skip this step and save time running the pipeline. If this throws an error, use `1` and see if that works.
- Find primer locations for the kit used in the publication: Swift Normalase Amplicon Panels (SNAP) kit (PN: SN-5X296 (core) COVG1V2-96 (amplicon primers), Integrated DNA Technologies).
- Run 3 samples using the SWIFT BED file directly via the `--primer_bed` option. Also download the reference FASTA and GFF files and pass them directly with `--fasta` and `--gff`.
- Prepare participant data directories with the files needed for the workshop. There is a folder on the HPC under `sars-wastewater/participants` for this. It includes:
  - `data/reads` - FASTQ files for the 5 samples to be processed
  - `resources` - reference genome FASTA and GFF annotation (may be useful for some analyses)
  - `preprocessed` - results from 30 samples
  - `scripts` - shell scripts that they will fix in the exercises
  - `utilities` - Python scripts we provide, e.g. to prepare the samplesheet or tidy Freyja output files
  - `sample_info.csv` - metadata table with columns "sample,date,country,location,latitude,longitude"
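
As a rough illustration of the tasks above, here is a minimal Python sketch of what a samplesheet helper in `utilities` might look like, together with a function that assembles a viralrecon command using the options named in this issue. Several things are assumptions and not taken from the issue: the FASTQ naming convention (`<sample>_1.fastq.gz` / `<sample>_R1.fastq.gz`), the function names, the `sample,fastq_1,fastq_2` samplesheet header (viralrecon's Illumina layout, as far as we know), and the `--platform illumina --protocol amplicon` settings. Check them against the viralrecon version in use before relying on this.

```python
import csv
import re
from pathlib import Path

# Assumed naming convention (not stated in the issue):
#   <sample>_1.fastq.gz / <sample>_2.fastq.gz, or <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R?(?P<mate>[12])\.(fastq|fq)\.gz$")


def collect_pairs(reads_dir):
    """Group FASTQ files found in reads_dir by sample name and mate number."""
    pairs = {}
    for path in sorted(Path(reads_dir).iterdir()):
        match = READ_PATTERN.match(path.name)
        if match is None:
            continue  # ignore files that do not look like paired-end reads
        pairs.setdefault(match["sample"], {})[match["mate"]] = str(path)
    return pairs


def write_samplesheet(reads_dir, out_csv, max_samples=None):
    """Write a sample,fastq_1,fastq_2 CSV and return the sample names written.

    Samples missing either mate are skipped. max_samples limits the sheet to
    the first N samples (sorted by name), e.g. 5 for the workshop subset.
    """
    pairs = collect_pairs(reads_dir)
    complete = [(name, mates) for name, mates in sorted(pairs.items())
                if {"1", "2"} <= mates.keys()]
    if max_samples is not None:
        complete = complete[:max_samples]
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sample", "fastq_1", "fastq_2"])
        for name, mates in complete:
            writer.writerow([name, mates["1"], mates["2"]])
    return [name for name, _ in complete]


def viralrecon_command(samplesheet, outdir, primer_bed, fasta, gff, freyja_repeats=0):
    """Build (but do not run) a viralrecon command as an argument list.

    --primer_bed, --fasta, --gff and --freyja_repeats come from the issue;
    --platform/--protocol values are assumptions for Illumina amplicon data.
    """
    return [
        "nextflow", "run", "nf-core/viralrecon",
        "--input", str(samplesheet),
        "--outdir", str(outdir),
        "--platform", "illumina",
        "--protocol", "amplicon",
        "--primer_bed", str(primer_bed),
        "--fasta", str(fasta),
        "--gff", str(gff),
        "--freyja_repeats", str(freyja_repeats),
    ]
```

For example, `write_samplesheet("data/reads", "samplesheet.csv", max_samples=5)` would cover the 5-sample test, and the list returned by `viralrecon_command(...)` can be passed to `subprocess.run` or joined with spaces for one of the shell scripts. Returning the command as a list without executing it keeps the timing experiments (different sample counts, `--freyja_repeats` set to 0 or 1) easy to script.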