About deepTools

Why we built deepTools
How we use deepTools
deepTools overview

Why we built deepTools

no programs were available
QC, normalization, visualization
highly customizable images (change colours, size, labels, file format etc.)
enable highly individual downstream analyses - prerequisite: access to every data set that is produced
modular approach - compatibility, flexibility, scalability

How we use deepTools

The majority of samples that we handle within our facility come from ChIP-seq experiments, so you will find many examples from ChIP-seq analyses. This does not mean that deepTools is restricted to ChIP-seq data analysis, but some tools, such as bamFingerprint, specifically address ChIP-seq issues. (That being said, we also process quite a bit of RNA-seq, other -seq and genomic sequencing data using deepTools.)

Here are slides that we used for teaching at the University of Freiburg.

As depicted in the figure below, our work usually begins with one or more FASTQ file(s) of deeply-sequenced samples. After a first quality control using FASTQC, we align the reads to the reference genome, e.g. using bowtie2. We then use deepTools to assess the quality of the aligned reads:

Correlation between BAM files (bamCorrelate). This is a very basic test to see whether the sequenced and aligned reads meet your expectations. We use this check to assess reproducibility, either between replicates or between different experiments that might have used the same antibody, the same cell type, etc. For instance, replicates should correlate better than differently treated samples.
GC bias check (computeGCBias). Many sequencing protocols require several rounds of PCR-based amplification of the DNA to be sequenced. Unfortunately, most DNA polymerases used for PCR introduce significant GC biases as they prefer to amplify GC-rich templates. Depending on the sample (preparation), the GC bias can vary significantly, and we routinely check its extent. When we need to compare files with different GC biases, we use the correctGCBias module to match the GC bias. See the paper by [Benjamini and Speed][] for many insights into this problem.
Assessing the ChIP strength (bamFingerprint). We run this QC step to get a feeling for the signal-to-noise ratio in samples from ChIP-seq experiments. It is based on the insights published by [Diaz et al.][].
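
To make the first check in the list above more concrete, here is a minimal sketch of a Pearson and a Spearman correlation between per-bin read counts of several samples. It only illustrates the statistics; it is not the bamCorrelate implementation, and the function names and the read counts are hypothetical values chosen for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical read counts per genomic bin (not real data).
replicate_1 = [12, 30, 7, 55, 21, 3]
replicate_2 = [10, 33, 9, 50, 25, 4]
treated = [40, 8, 22, 6, 15, 31]

print("replicates, Pearson: ", pearson(replicate_1, replicate_2))
print("replicates, Spearman:", spearman(replicate_1, replicate_2))
print("rep. vs treated, Pearson:", pearson(replicate_1, treated))
```

With these made-up counts, the two replicates correlate far better with each other than with the differently treated sample, which is the pattern the check is meant to confirm.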

Once we're satisfied with the basic quality checks, we normally convert the large BAM files into a leaner data format, typically bigWig. bigWig files have several advantages over BAM files that mainly stem from their significantly decreased size:

useful for data sharing & storage
intuitive visualization in Genome Browsers (e.g. UCSC Genome Browser, IGV)
more efficient downstream analyses are possible

The deepTools modules bamCompare and bamCoverage do more than simply convert BAM to bigWig (or bedGraph, for that matter). The main reason we developed those tools was that we wanted to normalize the read coverages so that we could compare different samples despite differences in sequencing depth, GC biases and so on.
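
The idea behind normalizing for sequencing depth, and then comparing two samples, can be sketched in a few lines. This is an illustration under simple assumptions (scaling to a common number of reads, then a log2 ratio with a pseudocount); it is not a description of how bamCoverage or bamCompare work internally. All function names, the pseudocount and the numbers are hypothetical.

```python
from math import log2


def scale_to_depth(bin_counts, total_reads, target=1_000_000):
    """Rescale per-bin read counts as if `target` reads had been sequenced."""
    factor = target / total_reads
    return [count * factor for count in bin_counts]


def log2_ratio(treatment, control, pseudocount=1.0):
    """Per-bin log2 ratio; the pseudocount avoids division by zero."""
    return [
        log2((t + pseudocount) / (c + pseudocount))
        for t, c in zip(treatment, control)
    ]


# Hypothetical per-bin counts: the ChIP sample was sequenced twice as deeply.
chip_counts = [20, 80, 10, 10]    # assumed total: 4 million reads
input_counts = [10, 10, 5, 5]     # assumed total: 2 million reads

chip_norm = scale_to_depth(chip_counts, 4_000_000)
input_norm = scale_to_depth(input_counts, 2_000_000)
ratios = log2_ratio(chip_norm, input_norm)

print(chip_norm)
print(input_norm)
print(ratios)
```

Before scaling, every ChIP bin looks at least twice as high as the input simply because of the deeper sequencing. After scaling, only the second bin stands out.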

Finally, once all the files have passed our visual inspections, the fun of downstream analyses with heatmapper and profiler can begin!

Here's a visual summary of our average workflow - deepTools modules are indicated in bold letters, alternative software such as FASTQC and bowtie are noted in regular font. Everything written in red is related to quality control (QC) of the samples.

[Figure: flowChartI]

deepTools overview

deepTools consists of a set of modules that can be used independently to work with mapped reads. We have grouped the modules by task: quality control, normalization and visualization.

Here's a concise summary of the tools - if you would like more detailed information about the individual tools and example figures, follow the links in the table.

| tool | type | input files | main output file(s) | application |
|------|------|-------------|---------------------|-------------|
| [bamCorrelate][] | QC | 2 or more BAM | clustered heatmap | Pearson or Spearman correlation between read distributions |
| [bamFingerprint][] | QC | 2 BAM | 1 diagnostic plot | assess enrichment strength of a ChIP sample |
| [computeGCBias][] | QC | 1 BAM | 2 diagnostic plots | calculate the exp. and obs. GC distribution of reads |
| [bamCoverage][] | normalization | BAM | bedGraph or bigWig | obtain the normalized read coverage of a single BAM file |
| [bamCompare][] | normalization | 2 BAM | bedGraph or bigWig | normalize 2 BAM files to each other using a mathematical operation of your choice (e.g. log2ratio, difference) |
| [computeMatrix][] | visualization | 1 bigWig, 1 BED | zipped file, to be used with heatmapper or profiler | compute the values needed for heatmaps and summary plots |
| [heatmapper][] | visualization | computeMatrix output | heatmap of read coverages | visualize the read coverages for genomic regions |
| [profiler][] | visualization | computeMatrix output | summary plot ("meta-profile") | visualize the average read coverages over a group of genomic regions |
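
To illustrate the relationship between the last three rows of the table, the following sketch treats the intermediate result as a plain matrix with one row per genomic region and one column per position. The column means then give a summary plot ("meta-profile"), and the sorted rows give the input for a heatmap. This is an assumed, simplified picture for illustration; the matrix layout, the function names and the values are hypothetical and do not describe the actual computeMatrix output format.

```python
def summary_profile(matrix):
    """Column-wise mean: average coverage at each position across regions."""
    n_regions = len(matrix)
    return [sum(column) / n_regions for column in zip(*matrix)]


def sort_regions_by_mean(matrix):
    """Order rows from highest to lowest mean coverage, e.g. for a heatmap."""
    return sorted(matrix, key=lambda row: sum(row) / len(row), reverse=True)


# Hypothetical coverage values: 3 regions (rows) x 3 positions (columns).
coverage_matrix = [
    [1.0, 4.0, 1.0],
    [2.0, 8.0, 2.0],
    [0.0, 3.0, 0.0],
]

profile = summary_profile(coverage_matrix)
heatmap_rows = sort_regions_by_mean(coverage_matrix)

print(profile)
print(heatmap_rows)
```

The heatmap keeps the per-region detail, while the profile collapses all regions into a single curve.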

deepTools is developed by the Bioinformatics Facility at the Max Planck Institute for Immunobiology and Epigenetics, Freiburg. For troubleshooting, see our FAQ and get in touch: deeptools@googlegroups.com
