Commit fa107a0

Typo and formatting changes
1 parent c793bc1 commit fa107a0

1 file changed: +30 -19 lines changed

Methyl-Seq/Pipeline_GL-DPPD-7113_Versions/GL-DPPD-7113.md

@@ -120,7 +120,11 @@ fastqc -o raw_fastqc_output/ *raw.fastq.gz
 ### 1b. Compile Raw Data QC
 
 ```bash
-multiqc --interactive -o raw_multiqc_GLmethylSeq_data/ -n raw_multiqc_GLmethylSeq -z raw_fastqc_output/
+multiqc --interactive \
+-o raw_multiqc_GLmethylSeq_data/ \
+-n raw_multiqc_GLmethylSeq \
+-z \
+raw_fastqc_output/
 ```
 
 **Parameter Definitions:**
@@ -348,7 +352,11 @@ fastqc -o trimmed_fastqc_output/ *trimmed.fastq.gz
 ### 3b. Compile Trimmed Data QC
 
 ```bash
-multiqc --interactive -o trimmed_multiqc_GLmethylSeq_data/ -n trimmed_multiqc_GLmethylSeq -z trimmed_fastqc_output/ trimgalore_output/
+multiqc --interactive \
+-o trimmed_multiqc_GLmethylSeq_data/ \
+-n trimmed_multiqc_GLmethylSeq \
+-z \
+trimmed_fastqc_output/ trimgalore_output/
 ```
 
 **Parameter Definitions:**
@@ -358,7 +366,7 @@ multiqc --interactive -o trimmed_multiqc_GLmethylSeq_data/ -n trimmed_multiqc_GL
 * `-n` – the filename prefix for output files
 * `-z` – specifies to zip the output data directory
 * `trimmed_fastqc_output/` – the directory holding the output data from the fastqc run, provided as a positional argument
-* `trimgalore_output/` - the directory holding the trimgalore trimming reports.
+* `trimgalore_output/` - the directory holding the trimgalore trimming reports, provided as a positional argument
 
 **Input data:**
 
@@ -389,7 +397,8 @@ bismark_genome_preparation --bowtie2 \
 --parallel NumberOfThreads \
 bismark_reference_genome/
 
-bam2nuc --genome_folder bismark_reference_genome/ --genomic_composition_only
+bam2nuc --genomic_composition_only \
+--genome_folder bismark_reference_genome/
 ```
 
 **Parameter Definitions:**
@@ -400,8 +409,10 @@ bam2nuc --genome_folder bismark_reference_genome/ --genomic_composition_only
 * positional argument specifing the directory holding the reference genome (should end in ".fa" or ".fasta", can be gzipped and including ".gz")
 
 *bam2nuc*
-* --genome_folder - species the directory holding the reference genome (should end in ".fa" or ".fasta", can be gzipped and including ".gz")
-* --genomic_composition_only - species creation of the genomic_nucleotide_frequencies.txt report, which is genome rathe than sample specific.
+* --genomic_composition_only - specifies creation of the (genome-specific) genomic_nucleotide_frequencies.txt report
+* --genome_folder - specifies the directory holding the reference genome (should end in ".fa" or ".fasta", can be gzipped and including ".gz")
+
+
 **Input data:**
 
 * a directory holding the reference genome in fasta format (this pipeline version uses the Ensembl fasta file indicated in the `fasta` column of the [GL-DPPD-7110_annotations.csv](../../GeneLab_Reference_Annotations/Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110/GL-DPPD-7110_annotations.csv) GeneLab Annotations file))
@@ -427,7 +438,7 @@ bam2nuc --genome_folder bismark_reference_genome/ --genomic_composition_only
 * BS_GA.rev.2.bt2
 * genome_mfa.GA_conversion.fa
 * \*.txt (captured standard output from the command)
-* bismark_reference_genome/genomic_nucleotide_frequencies.txt
+* bismark_reference_genome/genomic_nucleotide_frequencies.txt (tab-delimited table of mono- and di-nucleotide frequencies in reference genome)
 
 
 
@@ -501,7 +512,7 @@ mv sample-1_R1_trimmed_bismark_bt2_pe.bam sample-1_bismark_bt2_pe.bam
 
 * sample-1_bismark_{bt2,bt2_pe,hisat2,hisat2_pe}.bam (mapping file)
 * **\*_[SP]E_report.txt** (bismark mapping report file)
-* **\*.nucleotide_stats.txt** (tab-delimited table with sample-specific mono- and di-nucleotide sequence compositions and coverage values compared to genomic compositions
+* **\*.nucleotide_stats.txt** (tab-delimited table with sample-specific mono- and di-nucleotide sequence compositions and coverage values compared to genomic compositions)
 
 
 > **NOTE**
@@ -598,7 +609,7 @@ deduplicate_bismark sample-1_bismark_{bt2,bt2_pe}.bam
 
 **Output data:**
 
-* \*.deduplicated.bam (unsorted bismark bowtie2 alignment bam file, with duplicates removed)
+* **\*.deduplicated.bam** (unsorted bismark bowtie2 alignment bam file, with duplicates removed)
 * **\*.deduplication_report.txt** (report file containing deduplication information)
 
 
@@ -608,7 +619,7 @@ deduplicate_bismark sample-1_bismark_{bt2,bt2_pe}.bam
 
 ```bash
 samtools sort -@ NumberOfThreads \
--o sample-1_bismark_bt2_sorted.deduplicated.bam \
+-o sample-1_bismark_{bt2,bt2_pe}_sorted.deduplicated.bam \
 sample-1_bismark_{bt2,bt2_pe}.deduplicated.bam
 ```
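
The `{bt2,bt2_pe}` in these file names appears to be shorthand for "either `bt2` or `bt2_pe`" rather than something to type literally: unquoted, bash brace expansion turns it into two separate words. A minimal sketch of one way to resolve the placeholder first, assuming a hypothetical `layout` variable and `bismark_suffix` helper (neither is part of the pipeline document); the `samtools` command is only printed, not run:

```bash
# Hypothetical helper (not part of the pipeline document): map a library
# layout to the suffix that the {bt2,bt2_pe} shorthand stands for.
bismark_suffix() {
    if [ "$1" = "paired" ]; then
        echo "bt2_pe"
    else
        echo "bt2"
    fi
}

layout="paired"   # assumed value for illustration; use "single" for single-end data
suffix="$(bismark_suffix "$layout")"
in_bam="sample-1_bismark_${suffix}.deduplicated.bam"
out_bam="sample-1_bismark_${suffix}_sorted.deduplicated.bam"

# Typed literally and unquoted, bash expands the braces into two words,
# so -o would receive only the first of them.
literal=( sample-1_bismark_{bt2,bt2_pe}_sorted.deduplicated.bam )
echo "literal braces expand to ${#literal[@]} words"

# Print the resolved command instead of running it; NumberOfThreads is the
# document's own placeholder.
echo samtools sort -@ NumberOfThreads -o "$out_bam" "$in_bam"
```

Resolving the suffix once and reusing it keeps the input and output names in step with each other.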

@@ -627,7 +638,7 @@ samtools sort -@ NumberOfThreads \
 
 **Output data:**
 
-* **sample-1_bismark_bt2_sorted.deduplicated.bam** (bismark bowtie2 alignment bam file sorted by chromosomal coordinates, with duplicated removed)
+* **sample-1_bismark_{bt2,bt2_pe}_sorted.deduplicated.bam** (bismark bowtie2 alignment bam file sorted by chromosomal coordinates, with duplicates removed)
 
 <br>
 
@@ -649,7 +660,7 @@ bismark_methylation_extractor --parallel NumberOfThreads \
 sample-1_bismark_bt2.bam
 # note, if *not working with RRBS data, input should be the deduplicated
 # version (sample-1_bismark_bt2*.deduplicated.bam) produced in
-# step 6 above
+# step 6a above
 ```
 
 **Paired-end example**
@@ -667,7 +678,7 @@ bismark_methylation_extractor --parallel NumberOfThreads \
 sample-1_bismark_bt2_pe.bam
 # note, if *not working with RRBS data, input should be the deduplicated
 # version (sample-1_bismark_bt2*.deduplicated.bam) produced in
-# step 6 above
+# [Step 6a.](#6a-deduplicate) above
 ```
 
 
@@ -733,7 +744,7 @@ bismark2report --dir sample-1_bismark_report_out_dir/ \
 > If using RNA, files will include "bismark_hisat2" instead of "bismark_bt2" in the name.
 
 > **NOTE**
-> If data are **not** RRBS, the deduplication report from [step 6](#6-deduplicate-skip-if-data-are-rrbs) above should also be provided to the above command to the `--dedup_report` parameter
+> If data are **not** RRBS, the deduplication report from [Step 6a.](#6a-deduplicate) above should also be provided to the above command to the `--dedup_report` parameter
 
 **Output data:**
 
@@ -758,7 +769,7 @@ bismark2summary sample-1_bismark_{bt2,bt2_pe}.bam
 * the autodetected files cannot be explicitly provided, but it looks for those named like these listed here and includes them if they exist for each individual starting bam file it is given or finds
 * sample-1_bismark_bt2_[SP]E_report.txt generated from [Step 4b.](#4b-align) above
 * sample-1_bismark_{bt2,bt2_pe}_splitting_report.txt from [Step 7](#7-extract-methylation-calls) above
-* sample-1_bismark_{bt2,bt2_pe}.deduplication_report.txt if deduplication was performed in [Step 6](#6-deduplicate-skip-if-data-are-rrbs)
+* sample-1_bismark_{bt2,bt2_pe}.deduplication_report.txt if deduplication was performed in [Step 6a.](#6a-deduplicate)
 > **NOTE**
 > If using RNA, files will include "bismark_hisat2" instead of "bismark_bt2" in the name.
 
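
The `[SP]E` in the report names above reads as a shell bracket expression: it matches either `SE` or `PE`, so one pattern covers single-end and paired-end mapping reports. A small sketch using `case`, with file names modeled on the list above (the `{bt2,bt2_pe}` placeholder resolved to `bt2` for illustration); `is_mapping_report` is a hypothetical helper and nothing is read from disk:

```bash
# Hypothetical helper: succeed if a file name looks like a bismark mapping report.
is_mapping_report() {
    case "$1" in
        *_[SP]E_report.txt) return 0 ;;
        *) return 1 ;;
    esac
}

# Classify a few example names; the last one is a deduplication report and
# should not match the mapping-report pattern.
for f in \
    sample-1_bismark_bt2_SE_report.txt \
    sample-1_bismark_bt2_PE_report.txt \
    sample-1_bismark_bt2.deduplication_report.txt
do
    if is_mapping_report "$f"; then
        echo "mapping report: $f"
    else
        echo "other file:     $f"
    fi
done
```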
@@ -774,7 +785,7 @@ bismark2summary sample-1_bismark_{bt2,bt2_pe}.bam
 ### 9b. Compile Alignment and Bismark QC
 
 ```bash
-multiqc --interactive -o align_multiqc_data/ -n align_multiqc -z \
+multiqc --interactive -o align_and_bismark_multiqc_data/ -n align_and_bismark_multiqc -z \
 qualimap_out_dir/ mapping_files_out_dir/ methylation_calls_out_dir/ deduplication_out_dir/
 ```
 
@@ -786,7 +797,7 @@ multiqc --interactive -o align_multiqc_data/ -n align_multiqc -z \
 * `-z` – specifies to zip the output data directory
 * `qualimap_out_dir/` – the directory holding the output data from the qualimap run, provided as a positional argument
 * `methylation_calls_out_dir/` – the directory holding the output data from the methylation extraction run, provided as a positional argument
-* `mapping_files_out_dir/` – the directory holding the output data from the fastqc run, provided as a positional argument
+* `mapping_files_out_dir/` – the directory holding the output data from the alignment run, provided as a positional argument
 * `deduplication_out_dir/` – the directory holding the output data from the deduplication run, provided as a positional argument (omitted if RRBS data)
 
 **Input data:**
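
Because `deduplication_out_dir/` is omitted for RRBS data (per the parameter definitions in this hunk), the list of positional directories passed to `multiqc` depends on the library type. One way to sketch that, assuming a hypothetical `is_rrbs` flag and `multiqc_inputs` helper (neither is defined in the pipeline document), and printing the command rather than running it:

```bash
# Hypothetical flag: "true" for RRBS data (no deduplication step), else "false".
is_rrbs="false"

# Hypothetical helper: print the positional directories for a given RRBS setting.
multiqc_inputs() {
    local dirs=( qualimap_out_dir/ mapping_files_out_dir/ methylation_calls_out_dir/ )
    if [ "$1" != "true" ]; then
        dirs+=( deduplication_out_dir/ )
    fi
    echo "${dirs[@]}"
}

# Word splitting of the helper's output is intentional here: the directory
# names contain no whitespace.
# shellcheck disable=SC2046
echo multiqc --interactive \
    -o align_and_bismark_multiqc_data/ \
    -n align_and_bismark_multiqc \
    -z \
    $(multiqc_inputs "$is_rrbs")
```

Setting `is_rrbs="true"` drops the deduplication directory from the printed command; the other options are copied from the updated step 9b command.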
@@ -800,8 +811,8 @@ multiqc --interactive -o align_multiqc_data/ -n align_multiqc -z \
 
 **Output data:**
 
-* **align_multiqc_GLmethylSeq.html** (multiqc output html summary)
-* **align_multiqc_GLmethylSeq_data** (directory containing multiqc output data)
+* **align_and_bismark_multiqc_GLmethylSeq.html** (multiqc output html summary)
+* **align_and_bismark_multiqc_GLmethylSeq_data** (directory containing multiqc output data)
 
 <br>
 