Prepare data for wastewater materials #43

@tavareshugo

We have a subset of 60 samples from Karthikeyan et al. that show the transition from the Delta to the Omicron variant.

To run it live in a workshop, the pipeline needs to finish in under 1h.
If that is not possible, we may instead need to provide a preprocessed folder with results from 60 samples, and have participants process only 5-10 samples during the workshop.

  • Test how long it takes to run 30 samples. Locally, this ran for over 2h without finishing (we killed the process).
  • Test running only 5 samples with default options (see comment below).
  • Test using --freyja_repeats 0 with viralrecon on a small number of samples, to see whether this step can be skipped to shorten the pipeline run time. If this throws an error, use 1 and see if that works.
  • Find primer locations for the kit used in the publication: Swift Normalase Amplicon Panels (SNAP) kit (PN: SN-5X296 (core) COVG1V2-96 (amplicon primers), Integrated DNA Technologies)
    • Hugo contacted IDT (idtdna), who now commercialise this product.
    • Bajuna will open an issue on the C-VIEW repo to ask which primer locations were used in the publication.
  • Run 3 samples, passing the SWIFT BED file directly with the --primer_bed option. Also download the reference FASTA and GFF files and pass them directly with --fasta and --gff.
  • Prepare participant data directories with the files needed for the workshop. There is a folder on the HPC under sars-wastewater/participants for this. It includes:
    • data/reads - FASTQ files for the 5 samples to be processed
    • resources - reference genome FASTA and GFF annotation (may be useful for some analyses)
    • preprocessed - results from 30 samples
    • scripts - shell scripts that participants will fix in the exercises
    • utilities - Python scripts we provide, e.g. to prepare the samplesheet or tidy the freyja output files
    • sample_info.csv - metadata table with the columns "sample,date,country,location,latitude,longitude"
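
The participant directory layout listed above could be scaffolded with a short Python script along these lines. This is only a sketch: the function names and the root path are hypothetical, and the only details taken from this issue are the subdirectory names and the sample_info.csv column names. It creates empty folders plus a header-only metadata file, and does not copy any FASTQ, reference or result files.

```python
import csv
from pathlib import Path

# Subdirectories and metadata columns as listed in this issue.
SUBDIRS = ["data/reads", "resources", "preprocessed", "scripts", "utilities"]
METADATA_COLUMNS = ["sample", "date", "country", "location", "latitude", "longitude"]


def scaffold_participant_dir(root):
    """Create the empty directory layout and a header-only sample_info.csv.

    `root` is a placeholder path chosen by the caller (hypothetical helper).
    """
    root = Path(root)
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    info = root / "sample_info.csv"
    # Do not overwrite a metadata table that has already been filled in.
    if not info.exists():
        with info.open("w", newline="") as handle:
            csv.writer(handle).writerow(METADATA_COLUMNS)
    return root


def check_metadata_header(path):
    """Return True if the CSV header matches the expected columns exactly."""
    with Path(path).open(newline="") as handle:
        header = next(csv.reader(handle), [])
    return header == METADATA_COLUMNS


if __name__ == "__main__":
    import tempfile

    # Demonstrate on a temporary directory rather than the real HPC folder.
    with tempfile.TemporaryDirectory() as tmp:
        root = scaffold_participant_dir(Path(tmp) / "participants")
        for path in sorted(root.rglob("*")):
            print(path.relative_to(root).as_posix())
        print("header ok:", check_metadata_header(root / "sample_info.csv"))
```

Running the header check again after the real metadata has been added would catch a renamed or reordered column before the workshop.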
