RNAseq/Workflow_Documentation/NF_RCP/README.md
Lines changed: 25 additions & 28 deletions
@@ -144,23 +144,23 @@ While in the location containing the `NF_RCP-G_2.0.0` directory that was downloa
 ```bash
 nextflow run NF_RCP-G_2.0.0/main.nf \
 -profile singularity \
---gldsAccession OSD-194
+--accession OSD-194
 ```

 <br>

 #### 4b. Approach 2: Run the workflow on a GeneLab RNAseq dataset using local reference fasta and gtf files

-> Note: The `--ref_source` and `--ensemblVersion` parameters should match the reference source and version number of the local reference fasta and gtf files used
+> Note: The `--reference_source` and `--reference_version` parameters should match the reference source and version number of the local reference fasta and gtf files used

 ```bash
 nextflow run NF_RCP-G_2.0.0/main.nf \
 -profile singularity \
---gldsAccession OSD-194 \
---ensemblVersion 107 \
---ref_source ensembl \
---ref_fasta </path/to/fasta> \
---ref_gtf </path/to/gtf>
+--accession OSD-194 \
+--reference_version 107 \
+--reference_source ensembl \
+--reference_fasta </path/to/fasta> \
+--reference_gtf </path/to/gtf>
 ```

 <br>
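Approach 2 depends on two local files, so a small pre-flight check can catch a mistyped path before the workflow is launched. The function below is a minimal sketch; `check_refs` is a hypothetical helper, not part of NF_RCP, and it only confirms that each file exists and is non-empty. It cannot verify that `--reference_source` and `--reference_version` actually match the files.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of NF_RCP): succeed only if every path given
# as an argument is an existing, non-empty file.
check_refs() {
  local f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "reference file missing or empty: $f" >&2
      return 1
    fi
  done
}

# Intended use (paths are placeholders, as in the README example):
#   check_refs </path/to/fasta> </path/to/gtf> && nextflow run NF_RCP-G_2.0.0/main.nf ...
```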
@@ -172,8 +172,8 @@ nextflow run NF_RCP-G_2.0.0/main.nf \
 ```bash
 nextflow run NF_RCP-G_2.0.0/main.nf \
 -profile singularity \
---gldsAccession output_directory \
---runsheetPath </path/to/runsheet>
+--accession output_directory \
+--runsheet_path </path/to/runsheet>
 ```

 <br>
@@ -184,43 +184,39 @@ nextflow run NF_RCP-G_2.0.0/main.nf \

 * `-profile` - Specifies the configuration profile(s) to load; `singularity` instructs Nextflow to set up and use singularity for all software called in the workflow

-* `--gldsAccession OSD-###` – specifies the OSD dataset to process through the RCP workflow (replace ### with the OSD number)
-> Note: The primary output directory will be titled "OSD-###"
-
-* `--gldsAccession output_directory` – specifies the output directory name to use when processing a non-OSD dataset, as indicated in [Approach 3 above](#4c-approach-3-run-the-workflow-on-a-non-glds-dataset-using-a-user-created-runsheet)
+* `--accession [OSD-###|GLDS-###]` – specifies the OSDR dataset to process through the RCP workflow (replace ### with the OSD or GLDS number)
+> Note: The primary output directory will be named after the accession input, e.g. "OSD-194" or "GLDS-194"


 <br>

 **Additional Required Parameters For [Approach 2](#4b-approach-2-run-the-workflow-on-a-genelab-rnaseq-dataset-using-local-ensembl-reference-fasta-and-gtf-files):**

-* `--ensemblVersion` - specifies the Ensembl version to use for the reference genome (Ensembl release `107` is used in this example)
+* `--reference_version` - specifies the Ensembl version to use for the reference genome (Ensembl release `107` is used in this example)

-* `--ref_source` - specifies the source of the reference files used (the source indicated in the Approach 2 example is `ensembl`)
+* `--reference_source` - specifies the source of the reference files used (the source indicated in the Approach 2 example is `ensembl`)

-* `--ref_fasta` - specifies the path to a local fasta file
+* `--reference_fasta` - specifies the path to a local fasta file

-* `--ref_gtf` - specifies the path to a local gtf file
+* `--reference_gtf` - specifies the path to a local gtf file

-> Note: If the local reference files specified are different than the Ensembl reference files used to create the [GeneLab annotations table](https://github.com/nasa/GeneLab_Data_Processing/blob/master/GeneLab_Reference_Annotations/Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110/GL-DPPD-7110_annotations.csv), additional gene annotations associated with any Ensembl/TAIR IDs from the specified files that are not shared in the GeneLab annotations will not be added to the DGE output table(s).
+> Note: If the local reference files specified are different than the reference files used to create the [GeneLab annotations table](https://github.com/nasa/GeneLab_Data_Processing/blob/master/GeneLab_Reference_Annotations/Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110/GL-DPPD-7110_annotations.csv), additional gene annotations associated with any gene IDs from the specified files that are not shared in the GeneLab annotations will not be added to the DGE output table(s).

 <br>

 **Optional Parameters:**

-* `--skipVV` - skip the automated V&V processes (Default: the automated V&V processes are active)
+* `--skip_vv` - skip the automated V&V processes (Default: the automated V&V processes are active)

-* `--outputDir` - specifies the directory to save the raw and processed data files (Default: files are saved in the launch directory)
+* `--outdir` - specifies the directory to save the raw and processed data files (Default: files are saved in a folder named `results` created in the launch directory)

 * `--force_single_end` - forces the analysis to use single end processing; for paired end datasets, this means only R1 is used; for single end datasets, this should have no effect

-* `--stageLocal TRUE|FALSE` - TRUE = download the raw reads files for the OSD dataset indicated, FALSE = disable raw reads download and processing (Default: TRUE)
-
-* `--referenceStorePath` - specifies the directory to store the Ensembl fasta and gtf files (Default: within the directory structure created by default in the launch directory)
+* `--reference_store_path` - specifies the directory to store the Ensembl fasta and gtf files (Default: within the directory structure created by default in the launch directory)

-* `--derivedStorePath` - specifies the directory to store the tool-specific indices created during processing (Default: within the directory structure created by default in the launch directory)
+* `--derived_store_path` - specifies the directory to store the tool-specific indices created during processing (Default: within the directory structure created by default in the launch directory)

-* `--runsheetPath` - specifies the path to a local runsheet (Default: a runsheet is automatically generated using the metadata on the GeneLab Repository for the OSD dataset being processed)
+* `--runsheet_path` - specifies the path to a local runsheet (Default: a runsheet is automatically generated using the metadata on the GeneLab Repository for the OSD dataset being processed)
 > This is required when processing a non-OSD dataset as indicated in [Approach 3 above](#4c-approach-3-run-the-workflow-on-a-non-glds-dataset-using-a-user-created-runsheet)

 <br>
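The renames in this hunk are mechanical, so existing launch scripts could be updated with a small filter. The following is a sketch only: `migrate_flags` is a hypothetical helper, not something shipped with NF_RCP, and its mapping is taken solely from the removed and added lines above. `--stageLocal` is removed in this diff with no replacement shown, so the sketch leaves it untouched for manual review.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of NF_RCP): rewrite the pre-rename flags to
# the names introduced in this diff. Reads text on stdin, writes to stdout.
migrate_flags() {
  # Each rule matches a whole flag: the name must be followed by whitespace,
  # "=", or end of line, which is captured and put back via \1.
  sed -E \
    -e 's/--gldsAccession([[:space:]=]|$)/--accession\1/g' \
    -e 's/--ensemblVersion([[:space:]=]|$)/--reference_version\1/g' \
    -e 's/--ref_source([[:space:]=]|$)/--reference_source\1/g' \
    -e 's/--ref_fasta([[:space:]=]|$)/--reference_fasta\1/g' \
    -e 's/--ref_gtf([[:space:]=]|$)/--reference_gtf\1/g' \
    -e 's/--skipVV([[:space:]=]|$)/--skip_vv\1/g' \
    -e 's/--outputDir([[:space:]=]|$)/--outdir\1/g' \
    -e 's/--referenceStorePath([[:space:]=]|$)/--reference_store_path\1/g' \
    -e 's/--derivedStorePath([[:space:]=]|$)/--derived_store_path\1/g' \
    -e 's/--runsheetPath([[:space:]=]|$)/--runsheet_path\1/g'
}

# Example: migrate a one-line command and print the result.
printf '%s\n' 'nextflow run NF_RCP-G_2.0.0/main.nf -profile singularity --gldsAccession OSD-194 --skipVV' | migrate_flags
```

A rename alone does not preserve behavior everywhere: per the hunk above, the default for `--outdir` is now a `results` folder in the launch directory rather than the launch directory itself.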
@@ -272,8 +268,9 @@ Standard Nextflow resource usage logs are also produced as follows:
 **Nextflow Resource Usage Logs**

 - Output:
-- Resource_Usage/execution_report_{timestamp}.html (an html report that includes metrics about the workflow execution including computational resources and exact workflow process commands)
-- Resource_Usage/execution_timeline_{timestamp}.html (an html timeline for all processes executed in the workflow)
-- Resource_Usage/execution_trace_{timestamp}.txt (an execution tracing file that contains information about each process executed in the workflow, including: submission time, start time, completion time, cpu and memory used, machine-readable output)
+- nextflow_logs/execution_report_{timestamp}.html (an html report that includes metrics about the workflow execution including computational resources and exact workflow process commands)
+- nextflow_logs/execution_timeline_{timestamp}.html (an html timeline for all processes executed in the workflow)
+- nextflow_logs/execution_trace_{timestamp}.txt (an execution tracing file that contains information about each process executed in the workflow, including: submission time, start time, completion time, cpu and memory used, machine-readable output)
+- nextflow_info/pipeline_dag_{timestamp}.html (a visualization of the workflow process DAG)
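Because each log name embeds a `{timestamp}`, repeated runs leave several files with the same prefix. The sketch below picks the most recently modified one; `latest_log` is a hypothetical helper, and the location of `nextflow_logs` relative to the output directory is an assumption, since the diff does not state it. Selection is by modification time, so no particular timestamp format is assumed.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of NF_RCP): print the most recently modified
# file in directory $1 whose name starts with prefix $2 (empty if none).
latest_log() {
  local dir=$1 prefix=$2
  # ls -t sorts newest first by modification time; keep only the first entry.
  ls -t "$dir"/"$prefix"* 2>/dev/null | head -n 1
}

# Intended use, assuming nextflow_logs/ sits in the current directory:
#   latest_log nextflow_logs execution_trace_
```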