Commit 3355181

RNASeq workflow 2.0.1 patch update
- Fixed qc_metrics generation
- Added qc_metrics documentation
2 parents f3da4eb + bc388dd commit 3355181

9 files changed: +469 -26 lines changed

RNAseq/Workflow_Documentation/NF_RCP/CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -5,6 +5,14 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [2.0.1](https://github.com/nasa/GeneLab_Data_Processing/tree/NF_RCP_2.0.1/RNAseq/Workflow_Documentation/NF_RCP) - 2025-07-02
+
+### Fixed
+
+- Fixed fastqc metrics extraction in `parse_multiqc.py` script
+- Added qc file validation output listing missing entries
+- Updated multiqc parsing for fastqc metrics
+
 ## [2.0.0](https://github.com/nasa/GeneLab_Data_Processing/tree/NF_RCP_2.0.0/RNAseq/Workflow_Documentation/NF_RCP) - 2025-04-10

 ### Added
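
The 2.0.1 entry above mentions a validation output that lists missing QC entries. This page does not show how `parse_multiqc.py` implements that, so the following is only a minimal sketch of the general idea and not the workflow's code. The column names and the `find_missing_entries` helper are hypothetical; the real field definitions live in `QC_metrics_README.md`.

```python
import csv
import io

# Hypothetical column names, used for illustration only.
REQUIRED_FIELDS = ["sample_id", "raw_total_sequences", "raw_avg_sequence_length"]


def find_missing_entries(csv_text, required_fields=REQUIRED_FIELDS):
    """Return (sample, field) pairs whose value is absent or empty."""
    missing = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        # Fall back to the row number when the sample identifier itself is blank.
        sample = row.get("sample_id") or f"row {row_number}"
        for field in required_fields:
            value = row.get(field)
            if value is None or value.strip() == "":
                missing.append((sample, field))
    return missing


# Small in-memory table standing in for a real QC metrics file.
EXAMPLE_CSV = """sample_id,raw_total_sequences,raw_avg_sequence_length
sample_A,1000000,150
sample_B,,150
"""

if __name__ == "__main__":
    for sample, field in find_missing_entries(EXAMPLE_CSV):
        print(f"{sample}: missing {field}")
```

Returning the gaps as a list, rather than raising on the first one, matches the changelog's description of an output that lists missing entries.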

RNAseq/Workflow_Documentation/NF_RCP/QC_metrics_README.md

Lines changed: 213 additions & 0 deletions
Large diffs are not rendered by default.

RNAseq/Workflow_Documentation/NF_RCP/README.md

Lines changed: 15 additions & 11 deletions
@@ -128,9 +128,9 @@ All files required for utilizing the NF_RCP GeneLab workflow for processing RNAs
 copy of latest NF_RCP version on to your system, the code can be downloaded as a zip file from the release page then unzipped after downloading by running the following commands:

 ```bash
-wget https://github.com/nasa/GeneLab_Data_Processing/releases/download/NF_RCP_2.0.0/NF_RCP_2.0.0.zip
+wget https://github.com/nasa/GeneLab_Data_Processing/releases/download/NF_RCP_2.0.1/NF_RCP_2.0.1.zip

-unzip NF_RCP_2.0.0.zip
+unzip NF_RCP_2.0.1.zip
 ```

 <br>
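
Every change in this README diff swaps one version string for another. As a rough illustration of how the download URL, archive name, and directory name all follow from that single value, here is a small Python sketch. The `release_artifacts` helper is hypothetical and not part of the workflow; the URL pattern is copied from the `wget` line above.

```python
BASE_URL = "https://github.com/nasa/GeneLab_Data_Processing/releases/download"


def release_artifacts(version):
    """Derive the release names used in the README from a version string."""
    name = f"NF_RCP_{version}"
    return {
        "zip_url": f"{BASE_URL}/{name}/{name}.zip",  # target of the wget command
        "zip_file": f"{name}.zip",                   # argument to unzip
        "directory": name,                           # directory created by unzip
        "main_script": f"{name}/main.nf",            # script passed to nextflow run
    }


if __name__ == "__main__":
    for key, value in release_artifacts("2.0.1").items():
        print(f"{key}: {value}")
```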
@@ -142,10 +142,10 @@ unzip NF_RCP_2.0.0.zip
 Although Nextflow can fetch Singularity images from a url, doing so may cause issues as detailed [here](https://github.com/nextflow-io/nextflow/issues/1210).

 To avoid this issue, run the following command to fetch the Singularity images prior to running the NF_RCP workflow:
-> Note: This command should be run in the location containing the `NF_RCP_2.0.0` directory that was downloaded in [step 2](#2-download-the-workflow-files) above. Depending on your network speed, fetching the images will take ~20 minutes. Approximately 8GB of RAM is needed to download and build the Singularity images.
+> Note: This command should be run in the location containing the `NF_RCP_2.0.1` directory that was downloaded in [step 2](#2-download-the-workflow-files) above. Depending on your network speed, fetching the images will take ~20 minutes. Approximately 8GB of RAM is needed to download and build the Singularity images.

 ```bash
-bash NF_RCP_2.0.0/bin/prepull_singularity.sh NF_RCP_2.0.0/config/software/by_docker_image.config
+bash NF_RCP_2.0.1/bin/prepull_singularity.sh NF_RCP_2.0.1/config/software/by_docker_image.config
 ```

@@ -161,7 +161,7 @@ export NXF_SINGULARITY_CACHEDIR=$(pwd)/singularity

 ### 4. Run the Workflow

-While in the location containing the `NF_RCP_2.0.0` directory that was downloaded in [step 2](#2-download-the-workflow-files), you are now able to run the workflow.
+While in the location containing the `NF_RCP_2.0.1` directory that was downloaded in [step 2](#2-download-the-workflow-files), you are now able to run the workflow.

 Both workflows automatically load reference files and organism-specific gene annotation files from the [GeneLab annotations table](https://github.com/nasa/GeneLab_Data_Processing/blob/master/GeneLab_Reference_Annotations/Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv). For organisms not listed in the table or to use alternative reference files, additional workflow parameters can be specified.

@@ -175,7 +175,7 @@ Both workflows automatically load reference files and organism-specific gene ann
 #### 4a. Approach 1: Run the workflow on a GeneLab RNAseq dataset with automatic retrieval of reference fasta and gtf files

 ```bash
-nextflow run NF_RCP_2.0.0/main.nf \
+nextflow run NF_RCP_2.0.1/main.nf \
 -profile singularity,local \
 --accession OSD-194
 ```
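
For readers who launch the workflow from a script, the Approach 1 command in the hunk above can be assembled as an argument list. This is a sketch under assumptions: `approach_1_command` is a hypothetical helper, and only the flags visible in the hunk (`-profile`, `--accession`) are used.

```python
import shlex


def approach_1_command(version, accession, profiles=("singularity", "local")):
    """Assemble the Approach 1 invocation as a list of arguments."""
    return [
        "nextflow", "run", f"NF_RCP_{version}/main.nf",
        "-profile", ",".join(profiles),  # e.g. singularity,local
        "--accession", accession,        # e.g. OSD-194
    ]


if __name__ == "__main__":
    # Print the command as a single shell-safe line.
    print(shlex.join(approach_1_command("2.0.1", "OSD-194")))
```

Keeping the command as a list means it can be passed to `subprocess.run` without shell quoting, and the version appears in exactly one place.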
@@ -187,7 +187,7 @@ nextflow run NF_RCP_2.0.0/main.nf \
 #### 4b. Approach 2: Run the workflow on a GeneLab RNAseq dataset with custom reference fasta and gtf files

 ```bash
-nextflow run NF_RCP_2.0.0/main.nf \
+nextflow run NF_RCP_2.0.1/main.nf \
 -profile singularity,local \
 --accession OSD-194 \
 --reference_version 112 \
@@ -205,7 +205,7 @@ nextflow run NF_RCP_2.0.0/main.nf \
 #### 4c. Approach 3: Run the workflow on a non-GeneLab dataset using a user-created runsheet with automatic retrieval of reference fasta and gtf files

 ```bash
-nextflow run NF_RCP_2.0.0/main.nf \
+nextflow run NF_RCP_2.0.1/main.nf \
 -profile singularity,local \
 --runsheet_path </path/to/runsheet>
 ```
@@ -217,7 +217,7 @@ nextflow run NF_RCP_2.0.0/main.nf \
 #### 4d. Approach 4: Run the workflow on a non-GeneLab dataset using a user-created runsheet with custom reference fasta and gtf files

 ```bash
-nextflow run NF_RCP_2.0.0/main.nf \
+nextflow run NF_RCP_2.0.1/main.nf \
 -profile singularity \
 --accession OSD-194 \
 --reference_version 112 \
@@ -235,7 +235,7 @@ nextflow run NF_RCP_2.0.0/main.nf \

 #### Required Parameters For All Approaches:

-* `NF_RCP_2.0.0/main.nf` - Instructs Nextflow to run the NF_RCP workflow
+* `NF_RCP_2.0.1/main.nf` - Instructs Nextflow to run the NF_RCP workflow

 * `-profile` - Specifies the configuration profile(s) to load, `singularity` instructs Nextflow to setup and use singularity for all software called in the workflow; use `local` for local execution ([local.config](workflow_code/conf/local.config)) or `slurm` for SLURM cluster execution ([slurm.config](workflow_code/conf/slurm.config))
 > Note: The output directory will be named `GLDS-#` when using a OSD or GLDS accession as input, or `results` when running the workflow with only a runsheet as input.
@@ -313,7 +313,7 @@ nextflow run NF_RCP_2.0.0/main.nf \
 All parameters listed above and additional optional arguments for the RCP workflow, including debug related options that may not be immediately useful for most users, can be viewed by running the following command:

 ```bash
-nextflow run NF_RCP_2.0.0/main.nf --help
+nextflow run NF_RCP_2.0.1/main.nf --help
 ```

 See `nextflow run -h` and [Nextflow's CLI run command documentation](https://nextflow.io/docs/latest/cli.html#run) for more options and details common to all nextflow workflows.
@@ -354,6 +354,10 @@ The outputs from the Analysis Staging and V&V Pipeline Subworkflows are describe
 - processing_info/nextflow_log_GLbulkRNAseq.txt (Nextflow execution logs captured via `nextflow log`)
 - processing_info/nextflow_run_command_GLbulkRNAseq.txt (Exact command line used to initiate the workflow)

+**QC metrics summary**
+
+- Output:
+- GeneLab/qc_metrics_GLbulkRNAseq.csv (comma-separated text file containing a summary of qc metrics and metadata for the dataset, see the [QC metrics README](./QC_metrics_README.md) for a complete list of field definitions)
 <br>

 Standard Nextflow resource usage logs are also produced as follows:
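
Returning to the QC metrics summary added in the hunk above: because `GeneLab/qc_metrics_GLbulkRNAseq.csv` is described as a comma-separated text file, it can presumably be read with Python's standard `csv` module. The sketch below assumes a header row and uses made-up column names (`sample_id`, `total_reads`), since the field definitions in `QC_metrics_README.md` are not rendered on this page. `summarize_numeric_column` is a hypothetical helper.

```python
import csv
import io
import statistics


def summarize_numeric_column(rows, column):
    """Return (min, mean, max) for a column, skipping blank or non-numeric cells."""
    values = []
    for row in rows:
        cell = (row.get(column) or "").strip()
        try:
            values.append(float(cell))
        except ValueError:
            continue  # blank or non-numeric entry
    if not values:
        return None
    return min(values), statistics.mean(values), max(values)


# Stand-in for the real file; the columns here are hypothetical.
EXAMPLE_CSV = """sample_id,total_reads
sample_A,100
sample_B,
sample_C,300
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
print(summarize_numeric_column(rows, "total_reads"))
```

To run this against a real output, replace the `io.StringIO(EXAMPLE_CSV)` argument with a file handle opened via `open(path, newline="")`, where `path` points at the CSV in your output directory.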
