Gene-level differential expression assessment from bulk RNA-seq data with DESeq2, edgeR, and limma-voom.
Step 1: Configure the workflow. Set the input read file directory, the reference sequence, and other settings in config.yaml.
Step 2: Provide sample annotation. Specify which sample corresponds to which treatment in samples.tsv.
Step 3: Set up your shell environment. Optionally add shell configuration to config.sh (e.g. "module load <...>" or "export PATH=<...>" or "source activate ").
Step 4: Test the setup. Check the configuration with a dry run: snakemake -n, or snakemake --use-conda -n when using conda.
Step 5: Execute the workflow. Run it locally on $ncpu cores via
snakemake --use-conda --cores $ncpu
or on a cluster via runSnakemake.sh (with resource requirements configured in cluster.json).
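Steps 4 and 5 can be chained so that the real run only starts when the dry run succeeds. The following is a minimal sketch, not part of the workflow itself: the run_workflow function name, the use of getconf to pick $ncpu, and the check for a Snakefile in the current directory are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Assumption: use all online cores if they can be detected, otherwise 1.
ncpu=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Hypothetical helper: dry run first, real run only if the dry run succeeds.
run_workflow() {
    snakemake --use-conda -n || return 1
    snakemake --use-conda --cores "$ncpu"
}

# Assumption: the Snakefile sits in the current working directory.
if command -v snakemake >/dev/null 2>&1 && [ -f Snakefile ]; then
    run_workflow
else
    echo "snakemake or Snakefile not found; run this from the workflow directory" >&2
fi
```

Setting ncpu by hand (for example ncpu=4) instead of detecting it keeps the run from occupying every core on a shared machine.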
- Read data QC with FastQC
- Trimming with Trimmomatic
- Alignment to reference with STAR
- Mapping QC with Qualimap and Picard
- Differential expression assessment with DESeq2, edgeR, and limma-voom
- QC report generation with MultiQC
- genome: ftp.ensembl.org/pub/release-90/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.chromosome.21.fa.gz
- annotation: ftp.ensembl.org/pub/release-90/gtf/homo_sapiens/Homo_sapiens.GRCh38.90.gtf.gz
- reads: sampled from https://www.ebi.ac.uk/arrayexpress/files/E-GEUV-1/HG00096.1.M_111124_6.bam
To install:
- star
- multiqc
- fastqc
- trimmomatic
- qualimap
- picard
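Before running the workflow, it can help to confirm that the tools listed above are reachable from the shell. This is a small sketch under stated assumptions: the missing_tools function is hypothetical, and the executable names (STAR, multiqc, fastqc, trimmomatic, qualimap, picard) are the names typically installed by conda packages, which may differ on your system.

```shell
# Hypothetical helper: print every name in "$@" that is not an executable on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumption: these are the executable names; adjust them to your installation.
missing=$(missing_tools STAR multiqc fastqc trimmomatic qualimap picard)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing >&2
fi
```

If a tool is reported as missing although it is installed, the config.sh from Step 3 is the place to load the corresponding module or extend PATH.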
Installed via Anaconda when running with --use-conda:
- deseq2
- biocparallel
- edger
- limma