Merge pull request #68 from torres-alexis/Amp-B-config-typo
SW_AmpIllumina-B Fixes:
Add enable_visualizations to default config.yaml example
Fix run_workflow.py issue with single assay amplicon OSDR datasets
Account for different sizes of characters in dendrogram sample label scaling
Account for volcano plot x-axis labels that are too long to fit on plot
Stand-alone visualization script execution changes:
Add new visualizations/README.md for manual execution
Reference this README in Workflow Documentation's --visualizations description details
Add RColorBrewer palette variable to R script, include it in visualizations/README.md
Move envs/R_visualizations.yaml to visualizations/R_visualizations.yaml
Rename visualizations script variable "final_outputs_dir" to "plots_dir"
Amplicon/Illumina/Workflow_Documentation/SW_AmpIllumina-B/README.md (7 additions, 6 deletions)
@@ -15,8 +15,8 @@ The current GeneLab Illumina amplicon sequencing data processing pipeline (AmpIl
- [3. Run the workflow using `run_workflow.py`](#3-run-the-workflow-using-run_workflowpy)
  - [3a. Approach 1: Run the workflow on a GeneLab Amplicon (Illumina) sequencing dataset with automatic retrieval of raw read files and metadata](#3a-approach-1-run-the-workflow-on-a-genelab-amplicon-illumina-sequencing-dataset-with-automatic-retrieval-of-raw-read-files-and-metadata)
  - [3b. Approach 2: Run the workflow on a non-OSD dataset using a user-created runsheet](#3b-approach-2-run-the-workflow-on-a-non-osd-dataset-using-a-user-created-runsheet)
* `--OSD OSD-###` - specifies the OSD dataset to process through the SW_AmpIllumina workflow (replace ### with the OSD number)
>*Used for Approach 1 only.*
@@ -168,10 +168,11 @@ ___
>*Optional parameter used in Approach 1 for datasets that have multiple assays for the same amplicon target (e.g. [OSD-249](https://osdr.nasa.gov/bio/repo/data/studies/OSD-249)).*
* `--visualizations TRUE/FALSE` - if set to TRUE, the [visualizations script](workflow_code/visualizations/Illumina-R-visualizations.R) will be run. Default: FALSE
+>*Note: For instructions on manually executing the visualizations script, refer to the [stand-alone execution documentation](./workflow_code/visualizations/README.md).*
<br>
-**Parameter Definitions for `snakemake`**
+**Parameter definitions for `snakemake`**
* `--use-conda` – specifies to use the conda environments included in the workflow (these are specified in the [envs](workflow_code/envs) directory)
* `--conda-prefix` – indicates where the needed conda environments will be stored. Adding this option will also allow the same conda environments to be re-used when processing additional datasets, rather than making new environments each time you run the workflow. The value listed for this option, `${CONDA_PREFIX}/envs`, points to the default location for conda environments (note: the variable `${CONDA_PREFIX}` will be expanded to the appropriate location on whichever system it is run on).
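
As a rough illustration of how the two options above fit together, the sketch below prints a `snakemake` command that reuses environments stored under `${CONDA_PREFIX}/envs`. The `-j 2` job count is an assumption added for illustration and does not come from the parameter descriptions. The command is printed for review and is not executed.

```shell
# Sketch only: print a snakemake command using the two options described above.
# The -j value (number of parallel jobs) is an illustrative assumption.
# CONDA_PREFIX is set by conda when an environment is active.
snakemake_cmd() {
  printf 'snakemake --use-conda --conda-prefix %s/envs -j 2\n' "${CONDA_PREFIX:-}"
}

snakemake_cmd
```

Because `${CONDA_PREFIX}` is only set inside an activated conda environment, the printed path is only meaningful when the function is called from one.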
@@ -186,7 +187,7 @@ See `snakemake -h` and [Snakemake's documentation](https://snakemake.readthedocs
___
-### 5. Additional Output Files
+### 5. Additional output files
The outputs from the `run_workflow.py` and differential abundance analysis (DAA) / visualizations scripts are described below:
> Note: Outputs from the Amplicon Seq - Illumina pipeline are documented in the [GL-DPPD-7104-B.md](../../Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md) processing protocol.
Amplicon/Illumina/Workflow_Documentation/SW_AmpIllumina-B/workflow_code/visualizations/Illumina-R-visualizations.R
# SW_AmpIllumina-B Visualization Script Information and Usage Instructions <!-- omit in toc -->

## General info <!-- omit in toc -->
The documentation for this script and its outputs can be found in [sections 6-10 of the GL-DPPD-7104-B.md processing protocol](/Amplicon/Illumina/Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md#6-amplicon-seq-data-analysis-set-up). This script is automatically executed as an optional step of the SW_AmpIllumina-B Snakemake workflow when the `run_workflow.py` argument `--visualizations TRUE` is used. Alternatively, the script can be executed manually as detailed below.
<br>
---
## Utilizing the script <!-- omit in toc -->

- [1. Set up the execution environment](#1-set-up-the-execution-environment)
- [2. Run the visualization script manually](#2-run-the-visualization-script-manually)
The script should be executed from a Conda environment created using the [R_visualizations.yaml](/Amplicon/Illumina/Workflow_Documentation/SW_AmpIllumina-B/workflow_code/visualizations/R_visualizations.yaml) environment file.
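
A minimal sketch of creating that environment with `conda env create -f`, assuming the command is run from the directory that contains `R_visualizations.yaml`. The environment's name comes from the `name:` field of the YAML file and is not stated here, so the activation step is left as a placeholder. The command is printed for review and is not executed.

```shell
# Sketch only: print the conda command that builds the environment
# from the environment file named above.
env_file="R_visualizations.yaml"   # assumed to be in the current directory

create_env_cmd() {
  printf 'conda env create -f %s\n' "$env_file"
}

create_env_cmd
# Afterwards: conda activate <name taken from the "name:" field of the YAML file>
```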
<br>
___
### 2. Run the visualization script manually

To run the script, the variables `runsheet_file`, `sample_info`, `counts`, `taxonomy`, `assay_suffix`, `plots_dir`, and `output_prefix` must be specified. The script can be run manually from the command line, with these values supplied as positional arguments.

Additionally, the `RColorBrewer_Palette` variable can be modified in the script. This variable determines which color palette from the RColorBrewer package is applied to the plots.

**Parameter Definitions for Illumina-R-visualizations.R:**

* `runsheet_file` – specifies the runsheet containing sample metadata required for processing (output from [GL-DPPD-7104-B step 6a](/Amplicon/Illumina/Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md#6a-create-sample-runsheet))
* `sample_info` – specifies the text file containing the IDs of each sample used, required for running the SW_AmpIllumina workflow (output from [run_workflow.py](/Amplicon/Illumina/Workflow_Documentation/SW_AmpIllumina-B/README.md#5-additional-output-files))
* `counts` – specifies the ASV counts table (output from [GL-DPPD-7104-B step 5g](/Amplicon/Illumina/Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md#5g-generating-and-writing-standard-outputs))
* `taxonomy` – specifies the taxonomy table (output from [GL-DPPD-7104-B step 5g](/Amplicon/Illumina/Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md#5g-generating-and-writing-standard-outputs))
* `assay_suffix` – specifies a string that is appended to the end of the output file names. Default: "_GLAmpSeq"
* `plots_dir` – specifies the path where output files will be saved
* `output_prefix` – specifies a string that is prepended to the start of the output file names. Default: ""
* `RColorBrewer_Palette` – specifies the RColorBrewer palette that will be used for coloring in the plots. Options include "Set1", "Accent", "Dark2", "Paired", "Pastel1", "Pastel2", "Set2", and "Set3". Default: "Set1"
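
The positional-argument call can be sketched as follows. This assumes the arguments are passed in the same order in which the variables are listed in this section, which should be verified against the script before use. All file and directory names are hypothetical placeholders. The command is printed for review and is not executed.

```shell
# Sketch only: hypothetical input names; replace with real paths.
runsheet_file="runsheet.csv"
sample_info="unique-sample-IDs.txt"
counts="counts.tsv"
taxonomy="taxonomy.tsv"
assay_suffix="_GLAmpSeq"
plots_dir="plots/"
output_prefix=""

# Assumed argument order: the order in which the variables are listed above.
vis_cmd() {
  printf 'Rscript Illumina-R-visualizations.R "%s" "%s" "%s" "%s" "%s" "%s" "%s"\n' \
    "$runsheet_file" "$sample_info" "$counts" "$taxonomy" \
    "$assay_suffix" "$plots_dir" "$output_prefix"
}

# Print the assembled command for review.
vis_cmd
```

Quoting every argument keeps an empty value such as `output_prefix=""` in its position, so later arguments do not shift.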