Commit 499538d

Formatting updates
1 parent bd917b4 commit 499538d

GeneLab_Reference_Annotations/Workflow_Documentation/GL_RefAnnotTable-A/README.md

Lines changed: 34 additions & 20 deletions
@@ -10,11 +10,11 @@
 - [Step 1: Install Singularity](#step-1-install-singularity)
 - [Step 2: Fetch the Singularity Image](#step-2-fetch-the-singularity-image)
 - [Step 3: Run the Workflow](#step-3-run-the-workflow)
-- [Step 4: Run the Annotations Database Creation Function as a Stand-Alone Script](#step-4-run-the-annotations-database-creation-function-as-a-stand-alone-script)
+- [Optional: Run the Annotations Database Creation Function as a Stand-Alone Script](#optional-run-the-annotations-database-creation-function-as-a-stand-alone-script)
 - [Approach 2: Using a Local R Environment](#approach-2-using-a-local-r-environment)
 - [Step 1: Install R and Required R Packages](#step-1-install-r-and-required-r-packages)
 - [Step 2: Run the Workflow](#step-2-run-the-workflow)
-- [Step 3: Run the Annotations Database Creation Function as a Stand-Alone Script](#step-3-run-the-annotations-database-creation-function-as-a-stand-alone-script)
+- [Optional: Run the Annotations Database Creation Function as a Stand-Alone Script](#optional-run-the-annotations-database-creation-function-as-a-stand-alone-script)

 <br>

@@ -52,35 +52,42 @@ The GL_RefAnnotTable-A workflow can be run using one of two approaches:

 Please follow the instructions for the approach that best matches your setup and preferences. Each method is explained in detail below.

+<br>
+
 ---

 ### Approach 1: Using Singularity

 This approach allows you to run the workflow within a containerized environment, ensuring consistency and reproducibility.

+<br>
+
 #### Step 1: Install Singularity

 Singularity is a containerization platform for running applications portably and reproducibly. We use container images hosted on Quay.io to encapsulate all the necessary software and dependencies required by the GL_RefAnnotTable-A workflow. This setup allows you to run the workflow without installing any software directly on your system.
+
 > ***Note**: Other containerization tools like Docker or Apptainer can also be used to pull and run these images.*
+

-We recommend installing Singularity system-wide as per the official [Singularity installation documentation](https://docs.sylabs.io/guides/3.10/admin-guide/admin_quickstart.html).
+We recommend installing Singularity system-wide as per the official [Singularity installation documentation](https://docs.sylabs.io/guides/3.10/admin-guide/admin_quickstart.html).
+

 > ***Note**: While Singularity is also available through [Anaconda](https://anaconda.org/conda-forge/singularity), we recommend installing Singularity system-wide following the official installation documentation.*

 <br>

 #### Step 2: Fetch the Singularity Image

-To pull the Singularity image needed for the workflow, you can use the provided script as directed below or pull the image directly.
+To pull the Singularity image needed for the workflow, you can use the provided script as directed below or pull the image directly.

-> ***Note**: This command should be run in the location containing the `GL_RefAnnotTable-A_1.1.0` directory that was downloaded in [step 1](#1-download-the-workflow-files). Depending on your network speed, fetching the images will take approximately 20 minutes.*
+> ***Note**: This command should be run in the location containing the `GL_RefAnnotTable-A_1.1.0` directory that was downloaded in [step 1](#1-download-the-workflow-files). Depending on your network speed, fetching the images will take approximately 20 minutes.*


 ```bash
 bash GL_RefAnnotTable-A_1.1.0/bin/prepull_singularity.sh GL_RefAnnotTable-A_1.1.0/config/software/by_docker_image.config
 ```

-Once complete, a `singularity` folder containing the Singularity images will be created. Run the following command to export this folder as an environment variable:
+Once complete, a `singularity` folder containing the Singularity images will be created. Run the following command to export this folder as an environment variable:


 ```bash
@@ -98,13 +105,14 @@ singularity exec -B $(pwd)/GL_RefAnnotTable-A_1.1.0:/work \
 $SINGULARITY_CACHEDIR/quay.io-nasa_genelab-gl-refannottable-a-1.1.0.img \
 Rscript /work/GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus musculus'
 ```
-
+<br>
+
 **Input data:**

 - No input files are required. Specify the species name of the target organism using a positional command line argument. `Mus musculus` is used in the example above.
-> **Notes**:
-> To see a list of all available organisms, run `Rscript GL-DPPD-7110-A_build-genome-annots-tab.R` without positional arguments.
-> The correct argument for each organism can also be found in the 'species' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv)
+> **Notes**:
+> - To see a list of all available organisms, run `Rscript GL-DPPD-7110-A_build-genome-annots-tab.R` without positional arguments.
+> - The correct argument for each organism can also be found in the 'species' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv)
 - *Optional*: A local reference table CSV file can be supplied as a second positional argument. If not provided, the script will download the current version of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv) table by default.


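The note in this hunk says the species argument must match an entry in the 'species' column of the reference table. A minimal sketch of checking that against a local copy of the CSV before launching the workflow is shown here. The helper name `species_listed`, the naive comma splitting, and the two-column sample file are assumptions for illustration; they are not part of the README or of the real `GL-DPPD-7110-A_annotations.csv`.

```bash
#!/usr/bin/env bash
# Sketch: confirm a species name appears in the 'species' column of a local
# reference table before passing it to the workflow.
# Assumption: fields are split on plain commas, so quoted fields that
# contain commas are not handled.

species_listed() {
  # $1 = path to the CSV, $2 = species name (exact match required)
  awk -F',' -v want="$2" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "species") col = i; next }
    col && $col == want { found = 1 }
    END { exit found ? 0 : 1 }
  ' "$1"
}

# Made-up stand-in for the reference table (placeholder values only).
tmp_csv="$(mktemp)"
printf '%s\n' \
  'species,annotations' \
  'Mus musculus,org.Mm.eg.db' \
  'Bacillus subtilis,' > "$tmp_csv"

if species_listed "$tmp_csv" 'Mus musculus'; then
  echo "species found in reference table"
else
  echo "species not listed; check the 'species' column" >&2
fi
```

The match is exact and case-sensitive, which mirrors the README's advice to copy the argument from the 'species' column rather than typing a common name.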
@@ -117,7 +125,7 @@ Rscript /work/GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus musculus'

 #### *Optional*: Run the Annotations Database Creation Function as a Stand-Alone Script

-If the reference table does not specify an annotations database for the target organism in the 'annotations' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv) file, the `install_annotations` function (defined in `install-org-db.R`) will be executed by default. This function can also be run as a stand-alone script:
+If the reference table does not specify an annotations database for the target organism in the 'annotations' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv) file, the `install_annotations` function (defined in `install-org-db.R`) will be executed by default. This function can also be run as a stand-alone script:


 ```bash
@@ -126,11 +134,12 @@ $SINGULARITY_CACHEDIR/quay.io-nasa_genelab-gl-refannottable-a-1.1.0.img \
 Rscript /work/install-org-db.R 'Bacillus subtilis'
 ```

+<br>

 **Input data:**

-- The species name of the target organism must be specified as the first positional command line argument. `Bacillus subtilis` is used in the example above.
-> **Note**: The correct argument for each organism can also be found in the 'species' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv)
+- The species name of the target organism must be specified as the first positional command line argument. `Bacillus subtilis` is used in the example above.
+> **Note**: The correct argument for each organism can also be found in the 'species' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv)
 - *Optional*: A local reference table CSV file can be supplied as a second positional argument. If not provided, the script will download the current version of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv) table by default.


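The README text in these hunks ties the stand-alone script to organisms whose row has no entry in the 'annotations' column. The sketch below reads that column for a given species from a local CSV so the condition can be inspected by hand. The helper name `annotations_db_for` and the sample rows are hypothetical, and it assumes a missing database is recorded as an empty cell; the real table may use a different marker.

```bash
#!/usr/bin/env bash
# Sketch: print the 'annotations' value recorded for a species in a local
# reference table. Empty output means either the cell is empty or the
# species is not listed at all.
# Assumption: plain comma-separated fields without embedded commas.

annotations_db_for() {
  # $1 = path to the CSV, $2 = species name (exact match required)
  awk -F',' -v want="$2" '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "species") s = i
        if ($i == "annotations") a = i
      }
      next
    }
    s && a && $s == want { print $a; exit }
  ' "$1"
}

# Made-up stand-in for the reference table (placeholder values only).
tmp_csv="$(mktemp)"
printf '%s\n' \
  'species,annotations' \
  'Mus musculus,org.Mm.eg.db' \
  'Bacillus subtilis,' > "$tmp_csv"

db="$(annotations_db_for "$tmp_csv" 'Bacillus subtilis')"
if [ -z "$db" ]; then
  echo "no annotations database listed for this species"
else
  echo "annotations database listed: $db"
fi
```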
@@ -146,6 +155,8 @@ Rscript /work/install-org-db.R 'Bacillus subtilis'

 This approach allows you to run the workflow directly in your local R environment without using containers.

+<br>
+
 #### Step 1: Install R and Required R Packages

 We recommend installing R via the [Comprehensive R Archive Network (CRAN)](https://cran.r-project.org/):
@@ -184,20 +195,21 @@ BiocManager::install("GO.db")

 #### Step 2: Run the Workflow

-While in the directory containing the `GL_RefAnnotTable-A_1.1.0` folder, you can now run the workflow. Below is an example of how to run the workflow to build an annotation table for *Mus musculus* (mouse):
+While in the directory containing the `GL_RefAnnotTable-A_1.1.0` folder, you can now run the workflow. Below is an example of how to run the workflow to build an annotation table for *Mus musculus* (mouse):


 ```bash
 Rscript GL_RefAnnotTable-A_1.1.0/GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus musculus'
 ```
-
+
+<br>

 **Input data:**

 - No input files are required. Specify the species name of the target organism using a positional command line argument. `Mus musculus` is used in the example above.
-> **Notes**:
-> To see a list of all available organisms, run `Rscript GL-DPPD-7110-A_build-genome-annots-tab.R` without positional arguments.
-> The correct argument for each organism can also be found in the 'species' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv)
+> **Notes**:
+> - To see a list of all available organisms, run `Rscript GL-DPPD-7110-A_build-genome-annots-tab.R` without positional arguments.
+> - The correct argument for each organism can also be found in the 'species' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv)
 - *Optional*: A local reference table CSV file can be supplied as a second positional argument. If not provided, the script will download the current version of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv) table by default.


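The input description in this hunk has one required positional argument (the species) and one optional one (a local reference table CSV). A small wrapper can make that explicit by appending the second argument only when it is supplied. This is a sketch under stated assumptions: the function name `run_annots`, the `DRY_RUN` switch, and the file name `my_table.csv` are hypothetical; only the `Rscript` command line itself comes from the README.

```bash
#!/usr/bin/env bash
# Sketch: build the local-R command line, adding the optional reference
# table CSV as a second positional argument only when one is given.

run_annots() {
  # $1 = species name, $2 = optional path to a local reference table CSV
  species="$1"
  ref_table="${2:-}"

  set -- Rscript GL_RefAnnotTable-A_1.1.0/GL-DPPD-7110-A_build-genome-annots-tab.R "$species"
  if [ -n "$ref_table" ]; then
    set -- "$@" "$ref_table"
  fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Hypothetical dry-run mode: print one argument per line, do not execute.
    printf '%s\n' "$@"
  else
    "$@"
  fi
}

# Dry run, so nothing is executed here; unset DRY_RUN to actually run Rscript.
DRY_RUN=1
run_annots 'Mus musculus'
run_annots 'Mus musculus' my_table.csv
```

Printing one argument per line keeps the species name visibly intact as a single argument, which is the reason the README quotes `'Mus musculus'`.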
@@ -210,17 +222,19 @@ Rscript GL_RefAnnotTable-A_1.1.0/GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus m

 #### *Optional*: Run the Annotations Database Creation Function as a Stand-Alone Script

-If the reference table does not specify an annotations database for the target organism in the 'annotations' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv) file, the `install_annotations` function (defined in `install-org-db.R`) will be executed by default. This function can also be run as a stand-alone script:
+If the reference table does not specify an annotations database for the target organism in the 'annotations' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv) file, the `install_annotations` function (defined in `install-org-db.R`) will be executed by default. This function can also be run as a stand-alone script:


 ```bash
 Rscript GL_RefAnnotTable-A_1.1.0/install-org-db.R 'Bacillus subtilis'
 ```

+<br>
+
 **Input data:**

 - The species name of the target organism must be specified as the first positional command line argument. `Bacillus subtilis` is used in the example above.
-> **Note**: The correct argument for each organism can also be found in the 'species' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv)
+> **Note**: The correct argument for each organism can also be found in the 'species' column of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv)
 - *Optional*: A local reference table CSV file can be supplied as a second positional argument. If not provided, the script will download the current version of the [GL-DPPD-7110-A_annotations.csv](../../Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv) table by default.

