NicoleTreichel/Benchmarking-shallow-Metagenomics

Benchmarking Shallow Metagenomics - UNDER CONSTRUCTION

This GitHub repository describes the workflow used for benchmarking shallow metagenomic sequencing of Mock communities (DNA mixtures), as described in Treichel et al. (bioRxiv).

Background

In this study we aimed to systematically determine the sequencing depth required for three read-outs: taxonomic analysis, analysis of functional genes and pathways, and MAG construction. We used two complex mixtures of DNA from cultured gut bacteria: an evenly distributed Mock community containing DNA of 70 strains, and a Mock community with staggered distribution containing DNA of 24 strains. Analysis was done at nine sequencing depths (0.1, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 5.0, and 10.0 Gb). Additionally, library preparation was performed in two facilities and the effect of background DNA was tested.
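
Because sub-sampling works on read counts rather than gigabases, each target depth has to be converted into a number of reads first. The following is a minimal sketch of that arithmetic. The read length (150 bp) and the paired-end layout are assumptions made for illustration; they are not stated in this repository and should be replaced with the values of the actual sequencing run.

```python
# Sequencing depths analysed in the study, in gigabases (Gb).
DEPTHS_GB = [0.1, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 5.0, 10.0]


def reads_for_depth(depth_gb, read_length=150, paired=True):
    """Return the number of reads (or read pairs) needed to reach depth_gb.

    read_length and paired are illustrative assumptions, not values
    taken from the study.
    """
    if depth_gb <= 0 or read_length <= 0:
        raise ValueError("depth_gb and read_length must be positive")
    total_bases = round(depth_gb * 1e9)          # Gb -> bases, as an integer
    bases_per_fragment = read_length * (2 if paired else 1)
    # Ceiling division, so the target depth is reached rather than undershot.
    return -(-total_bases // bases_per_fragment)


# Read pairs per target depth under the assumptions above.
pairs_per_depth = {depth: reads_for_depth(depth) for depth in DEPTHS_GB}
```

The resulting counts can then be passed to the sub-sampling step.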

Description

Pre-processing

  1. Sub-sampling of shotgun metagenomic data to an exact number of reads (seqtk)
  2. Quality filtering and phiX removal (trimmomatic, bbmap, bbduk)
  3. Assembly into Contigs (MegaHit)
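
The sub-sampling step can be sketched as a small helper that builds `seqtk sample` commands for a pair of FASTQ files. This is an illustration, not the exact invocation used in the study: the file names, the seed, and the helper names are hypothetical. Using the same seed for both mates keeps the read pairs in sync, as recommended in the seqtk documentation.

```python
import subprocess


def seqtk_sample_cmd(fastq_in, n_reads, seed=42):
    """Build a `seqtk sample` command that draws n_reads reads.

    seqtk writes the sub-sampled reads to standard output.
    The default seed is an arbitrary, illustrative choice.
    """
    return ["seqtk", "sample", f"-s{seed}", fastq_in, str(n_reads)]


def paired_sample_cmds(fastq_r1, fastq_r2, n_reads, seed=42):
    """Build one command per mate file, sharing the seed to keep pairs matched."""
    return [
        seqtk_sample_cmd(fastq_r1, n_reads, seed),
        seqtk_sample_cmd(fastq_r2, n_reads, seed),
    ]


def run_to_file(cmd, out_path):
    """Run a command and redirect its standard output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


# Hypothetical file names, for illustration only.
cmds = paired_sample_cmds("sample_R1.fastq", "sample_R2.fastq", 3333334)
```

Each command in `cmds` can be executed with `run_to_file(cmd, "subsampled_R1.fastq")` once seqtk is installed.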

Taxonomic Analysis

  1. Coverage of reference genomes by mapped reads (coverM)
  2. Read count per reference genome / Relative abundance (coverM)
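
To illustrate how per-genome read counts relate to relative abundance, here is a minimal sketch that normalises read counts to percentages. It is a plain read-count normalisation and not necessarily the formula coverM applies internally; the strain names and counts are invented for the example.

```python
def relative_abundance(read_counts):
    """Convert per-genome read counts into percentages of all mapped reads."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no mapped reads to normalise")
    return {genome: 100.0 * count / total for genome, count in read_counts.items()}


# Invented counts for three hypothetical reference genomes.
example_counts = {"strain_A": 50, "strain_B": 30, "strain_C": 20}
example_abundance = relative_abundance(example_counts)
```

Comparing such values against the known composition of a Mock community at each sequencing depth is one way to judge how well the taxonomic profile is recovered.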

Functional Analysis

  1. Protein coding gene prediction (prodigal)
  2. Alignment to predicted protein sequences of reference genomes (Diamond)
  3. Completeness of functional pathways (kofamscan, KEGGdecoder)
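
The alignment step can be summarised as the fraction of reference proteins that are recovered at a given sequencing depth. The sketch below parses alignments in the standard 12-column BLAST tabular format, which DIAMOND can also write. The identity and e-value cut-offs are assumptions chosen for illustration, not the thresholds used in the study, and the example records are invented.

```python
import csv
import io


def recovered_reference_proteins(handle, min_identity=90.0, max_evalue=1e-5):
    """Return the reference protein IDs with at least one passing hit.

    Expects the tabular columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    Both thresholds are illustrative assumptions.
    """
    hits = set()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        subject_id = row[1]
        identity = float(row[2])
        evalue = float(row[10])
        if identity >= min_identity and evalue <= max_evalue:
            hits.add(subject_id)
    return hits


def recovery_fraction(hits, reference_ids):
    """Fraction of reference proteins that were recovered."""
    reference_ids = set(reference_ids)
    if not reference_ids:
        raise ValueError("reference_ids is empty")
    return len(hits & reference_ids) / len(reference_ids)


# Invented alignment records: two pass the thresholds, one fails on identity.
example_tsv = io.StringIO(
    "gene1\trefP1\t99.0\t200\t2\t0\t1\t200\t1\t200\t1e-50\t400\n"
    "gene2\trefP2\t95.5\t150\t7\t0\t1\t150\t1\t150\t1e-30\t300\n"
    "gene3\trefP3\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-10\t90\n"
)
example_hits = recovered_reference_proteins(example_tsv)
example_fraction = recovery_fraction(example_hits, ["refP1", "refP2", "refP3", "refP4"])
```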

Construction of metagenome-assembled genomes (MAGs)

  1. Removal of contigs < 1000 bp
  2. MAG construction (bowtie2, metabat2)
  3. Evaluation of completeness and contamination (checkM)
  4. Taxonomic assignment (GTDB-tk)
  5. MAG composition with respect to reference genomes (blastn)
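
The first step of MAG construction, the removal of contigs shorter than 1000 bp, can be sketched in plain Python as follows. This is an illustrative stand-in rather than the tool used in the study; the function names and the example contigs are hypothetical.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_contigs(handle, min_length=1000):
    """Yield only contigs with at least min_length bases."""
    for name, seq in read_fasta(handle):
        if len(seq) >= min_length:
            yield name, seq


# Invented contigs: one below and one exactly at the 1000 bp cut-off.
example_fasta = io.StringIO(
    ">contig_short\n" + "A" * 999 + "\n"
    ">contig_long\n" + "ACGT" * 125 + "\n" + "ACGT" * 125 + "\n"
)
kept = list(filter_contigs(example_fasta))
```

Here only `contig_long` (1000 bp, split over two lines) is kept, since contigs below the cut-off are removed.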

Graphical overview

[Figure: Workflow overview]

Installation / Requirements

For installation of the required tools, please visit their original websites linked above.

Data availability

Metagenomic data have been deposited at the European Nucleotide Archive/NCBI and are accessible under Project no. PRJEB83573.

Publication

Treichel et al. bioRxiv
