
Commit 86814bd

[GL_RefAnnotTable] switch from apptainer to singularity
1 parent 12b0587 commit 86814bd


2 files changed

+71
-131
lines changed

Lines changed: 69 additions & 129 deletions
@@ -1,112 +1,96 @@
11
# GL_RefAnnotTable-A Workflow Information and Usage Instructions <!-- omit in toc -->
22

33
## Table of Contents <!-- omit in toc -->
4-
- [General Workflow Info](#general-workflow-info)
4+
5+
- [General Workflow Information](#general-workflow-information)
56
- [Utilizing the Workflow](#utilizing-the-workflow)
6-
- [Approach 1: Using Apptainer](#approach-1-using-apptainer)
7-
- [1. Install Apptainer](#1-install-apptainer)
8-
- [2. Download the Workflow Files](#2-download-the-workflow-files)
9-
- [3. Fetch Apptainer Image](#3-fetch-apptainer-image)
10-
- [4. Run the Workflow](#4-run-the-workflow)
11-
- [5. Run the Annotations Database Creation Function as a Stand-Alone Script](#5-run-the-annotations-database-creation-function-as-a-stand-alone-script)
12-
- [Approach 2: Using a Local R Environment](#approach-2-using-a-local-r-environment)
13-
- [1. Install R and Required R Packages](#1-install-r-and-required-r-packages)
14-
- [2. Download the Workflow Files](#2-download-the-workflow-files-1)
15-
- [3. Set Execution Permissions for Workflow Scripts](#3-set-execution-permissions-for-workflow-scripts)
16-
- [4. Run the Workflow](#4-run-the-workflow-1)
17-
- [5. Run the Annotations Database Creation Function as a Stand-Alone Script](#5-run-the-annotations-database-creation-function-as-a-stand-alone-script-1)
18-
19-
<br>
7+
- [1. Download the Workflow Files](#1-download-the-workflow-files)
8+
- [2. Run the Workflow](#2-run-the-workflow)
9+
- [Approach 1: Using Singularity](#approach-1-using-singularity)
10+
- [Step 1: Install Singularity](#step-1-install-singularity)
11+
- [Step 2: Fetch the Singularity Image](#step-2-fetch-the-singularity-image)
12+
- [Step 3: Run the Workflow](#step-3-run-the-workflow)
13+
- [Step 4: Run the Annotations Database Creation Function as a Stand-Alone Script](#step-4-run-the-annotations-database-creation-function-as-a-stand-alone-script)
14+
- [Approach 2: Using a Local R Environment](#approach-2-using-a-local-r-environment)
15+
- [Step 1: Install R and Required R Packages](#step-1-install-r-and-required-r-packages)
16+
- [Step 2: Run the Workflow](#step-2-run-the-workflow)
17+
- [Step 3: Run the Annotations Database Creation Function as a Stand-Alone Script](#step-3-run-the-annotations-database-creation-function-as-a-stand-alone-script)
2018

2119
---
2220

23-
## General Workflow Info
24-
25-
The current GeneLab Reference Annotation Table (GL_RefAnnotTable-A) pipeline is implemented as an R workflow that can be run from a command line interface (CLI) using bash. The workflow can be executed using either a Apptainer (formerly Singularity) container or a local R environment. The workflow can be used even if you are unfamiliar with R, but if you want to learn more about R, visit the [R-project about page here](https://www.r-project.org/about.html). Additionally, an introduction to R along with installation help and information about using R for bioinformatics can be found [here at Happy Belly Bioinformatics](https://astrobiomike.github.io/R/basics).
21+
## General Workflow Information
2622

27-
<br>
23+
The current GeneLab Reference Annotation Table (GL_RefAnnotTable-A) pipeline is implemented as an R workflow that can be run from a command line interface (CLI) using bash. The workflow can be executed using either a Singularity container or a local R environment. The workflow can be used even if you are unfamiliar with R, but if you want to learn more about R, visit the [R-project about page here](https://www.r-project.org/about.html). Additionally, an introduction to R along with installation help and information about using R for bioinformatics can be found [here at Happy Belly Bioinformatics](https://astrobiomike.github.io/R/basics).
2824

2925
---
3026

3127
## Utilizing the Workflow
3228

33-
The GL_RefAnnotTable-A workflow can be run using two approaches:
34-
35-
1. **[Using Apptainer](#approach-1-using-apptainer)**.
29+
To utilize the GL_RefAnnotTable-A workflow, follow the instructions below to download the necessary workflow files. Once downloaded, the workflow can be executed using two approaches:
3630

37-
2. **[Using a local R environment](#approach-2-using-a-local-r-environment)**.
31+
1. **[Using Singularity](#approach-1-using-singularity)**
32+
2. **[Using a Local R Environment](#approach-2-using-a-local-r-environment)**
3833

39-
Please follow the instructions for the approach that best matches your setup and preferences. Each method is explained in the sections below.
40-
41-
<br>
34+
Please follow the instructions for the approach that best matches your setup and preferences. Each method is explained in detail below.
4235

4336
---
4437

45-
### Approach 1: Using Apptainer
38+
### 1. Download the Workflow Files
4639

47-
This approach allows you to run the workflow within a containerized environment, ensuring consistency and reproducibility.
40+
Download the latest version of the GL_RefAnnotTable-A workflow:
4841

49-
<br>
42+
```bash
43+
curl -LO https://github.com/nasa/GeneLab_Data_Processing/releases/download/GL_RefAnnotTable-A_1.1.0/GL_RefAnnotTable-A_1.1.0.zip
44+
unzip GL_RefAnnotTable-A_1.1.0.zip
45+
```
5046

5147
---
5248

53-
#### 1. Install Apptainer
54-
55-
Apptainer can be installed either through [Anaconda](https://anaconda.org/conda-forge/singularity) or as documented on the [Apptainer documentation page](https://apptainer.org/docs/admin/main/installation.html).
49+
### 2. Run the Workflow
5650

57-
> **Note**: If you prefer to use Anaconda, we recommend installing Miniconda for your system, as instructed by [Happy Belly Bioinformatics](https://astrobiomike.github.io/unix/conda-intro#getting-and-installing-conda).
58-
>
59-
> Once conda is installed on your system, you can install Apptainer by running:
60-
>
61-
> ```bash
62-
> conda install -c conda-forge apptainer
63-
> ```
51+
The GL_RefAnnotTable-A workflow can be run using two approaches:
6452

65-
<br>
53+
- **[Approach 1: Using Singularity](#approach-1-using-singularity)**
54+
- **[Approach 2: Using a Local R Environment](#approach-2-using-a-local-r-environment)**
6655

6756
---
6857

69-
#### 2. Download the Workflow Files
58+
#### Approach 1: Using Singularity
7059

71-
Download the latest version of the GL_RefAnnotTable-A workflow:
60+
This approach allows you to run the workflow within a containerized environment, ensuring consistency and reproducibility.
7261

73-
```bash
74-
curl -LO https://github.com/nasa/GeneLab_Data_Processing/releases/download/GL_RefAnnotTable-A_1.1.0/GL_RefAnnotTable-A_1.1.0.zip
75-
unzip GL_RefAnnotTable-A_1.1.0.zip
76-
cd GL_RefAnnotTable-A_1.1.0
77-
```
62+
##### Step 1: Install Singularity
7863

79-
<br>
64+
Singularity is a containerization platform for running applications portably and reproducibly. We use container images hosted on Quay.io to encapsulate all the necessary software and dependencies required by the GL_RefAnnotTable-A workflow. This setup allows you to run the workflow without installing any software directly on your system. Other containerization tools like Docker or Apptainer can also be used to pull and run these images.
8065

81-
---
66+
We recommend installing Singularity system-wide as per the official [Singularity installation documentation](https://docs.sylabs.io/guides/3.10/admin-guide/admin_quickstart.html).
8267

83-
#### 3. Fetch Apptainer Image
68+
> **Note**: While Singularity is also available through [Anaconda](https://anaconda.org/conda-forge/singularity), we recommend installing Singularity system-wide following the official installation documentation.
8469
85-
To fetch the Apptainer image needed for the workflow, run:
70+
##### Step 2: Fetch the Singularity Image
8671

87-
```bash
88-
bash bin/prepull_apptainer.sh config/software/by_docker_image.config
89-
```
90-
> Note: This command should be run in the directory containing the GL_RefAnnotTable-A_1.1.0 folder downloaded in [step 2](#2-download-the-workflow-files). Depending on your network speed, this may take approximately 20 minutes.
72+
To pull the Singularity image needed for the workflow, you can use the provided script as directed below or pull the image directly.
9173

92-
Once complete, an apptainer folder containing the Apptainer image will be created. Export this folder as an Apptainer configuration environment variable:
74+
> **Note**: This command should be run in the location containing the `GL_RefAnnotTable-A_1.1.0` directory that was downloaded in [step 1](#1-download-the-workflow-files). Depending on your network speed, fetching the images will take approximately 20 minutes.
9375
9476
```bash
95-
export APPTAINER_CACHEDIR=$(pwd)/apptainer
77+
bash GL_RefAnnotTable-A_1.1.0/bin/prepull_singularity.sh GL_RefAnnotTable-A_1.1.0/config/software/by_docker_image.config
9678
```
9779

98-
<br>
80+
Once complete, a `singularity` folder containing the Singularity images will be created. Run the following command to export this folder as an environment variable:
9981

100-
---
82+
```bash
83+
export SINGULARITY_CACHEDIR=$(pwd)/singularity
84+
```
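Note that the README commands in this commit reference an image ending in `.sif`, while the pre-pull script shown later in the same commit writes files ending in `.img`. The following is a minimal sketch of a pre-flight check that looks for either extension before running the workflow. The function name `find_image` is hypothetical and is not part of the workflow; the image file stem is taken from the commands in this README.

```bash
# Hypothetical helper (not part of GL_RefAnnotTable-A): print the path of the
# first image file found for a given stem, trying .sif and then .img.
find_image() {
    fi_dir=$1
    fi_stem=$2
    for fi_ext in sif img; do
        if [ -f "$fi_dir/$fi_stem.$fi_ext" ]; then
            printf '%s\n' "$fi_dir/$fi_stem.$fi_ext"
            return 0
        fi
    done
    return 1
}

# Usage: fall back to ./singularity if SINGULARITY_CACHEDIR is not set.
if image=$(find_image "${SINGULARITY_CACHEDIR:-./singularity}" \
        'quay.io-nasa_genelab-gl-refannottable-a-1.1.0'); then
    echo "Found image: $image"
else
    echo "No image found; run the pre-pull step first." >&2
fi
```

If the check reports an `.img` file, the `singularity exec` commands below would need to point at that file name instead of the `.sif` one.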
10185

102-
#### 4. Run the Workflow
86+
##### Step 3: Run the Workflow
10387

104-
While in the `GL_RefAnnotTable-A_1.1.0` directory, you can now run the workflow. Below is an example for generating an annotation table for Mus musculus (mouse):
88+
While in the directory containing the `GL_RefAnnotTable-A_1.1.0` folder, you can now run the workflow. Below is an example for generating the annotation table for *Mus musculus* (mouse):
10589

10690
```bash
107-
apptainer exec -B $(pwd):/work \
108-
$APPTAINER_CACHEDIR/quay.io-nasa_genelab-gl-refannottable-a-1.1.0.img \
109-
bash -c "cd /work && Rscript GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus musculus'"
91+
singularity exec -B $(pwd)/GL_RefAnnotTable-A_1.1.0:/work \
92+
$SINGULARITY_CACHEDIR/quay.io-nasa_genelab-gl-refannottable-a-1.1.0.sif \
93+
Rscript /work/GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus musculus'
11094
```
11195

11296
**Input data:**
@@ -119,18 +103,14 @@ bash -c "cd /work && Rscript GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus muscu
119103
- *-GL-annotations.tsv (Tab-delimited table of gene annotations)
120104
- *-GL-build-info.txt (Text file containing information used to create the annotation table, including tool and tool versions and date of creation)
121105

122-
<br>
123-
124-
---
106+
##### Step 4: Run the Annotations Database Creation Function as a Stand-Alone Script
125107

126-
#### 5. Run the Annotations Database Creation Function as a Stand-Alone Script
127-
128-
If the reference table does not specify an annotations database for the target organism in the annotations column, the `install_annotations` function (defined in `install-org-db.R`) will be executed. This function can also be run as a stand-alone script:
108+
If the reference table does not specify an annotations database for the target organism in the 'annotations' column, the `install_annotations` function (defined in `install-org-db.R`) will be executed. This function can also be run as a stand-alone script:
129109

130110
```bash
131-
apptainer exec -B $(pwd):/work \
132-
$APPTAINER_CACHEDIR/quay.io-nasa_genelab-gl-refannottable-a-1.1.0.img \
133-
bash -c "cd /work && Rscript install-org-db.R 'Bacillus subtilis'"
111+
singularity exec -B $(pwd)/GL_RefAnnotTable-A_1.1.0:/work \
112+
$SINGULARITY_CACHEDIR/quay.io-nasa_genelab-gl-refannottable-a-1.1.0.sif \
113+
Rscript /work/install-org-db.R 'Bacillus subtilis'
134114
```
135115

136116
**Input data:**
@@ -142,39 +122,33 @@ apptainer exec -B $(pwd):/work \
142122

143123
- org.*.eg.db/ (Species-specific annotation database, as a local R package)
144124

145-
<br>
146-
147125
---
148126

149-
### Approach 2: Using a Local R Environment
127+
#### Approach 2: Using a Local R Environment
150128

151-
This approach allows you to run the workflow directly in your local R environment without using Apptainer containers.
129+
This approach allows you to run the workflow directly in your local R environment without using containers.
152130

153-
<br>
154-
155-
---
156-
157-
#### 1. Install R and Required R Packages
131+
##### Step 1: Install R and Required R Packages
158132

159133
We recommend installing R via the [Comprehensive R Archive Network (CRAN)](https://cran.r-project.org/):
160134

161-
1. Select the [CRAN Mirror](https://cran.r-project.org/mirrors.html) closest to your location.
162-
2. Navigate to the download page for your operating system.
163-
3. Download and install R (e.g., R-4.4.0).
135+
1. Select the [CRAN Mirror](https://cran.r-project.org/mirrors.html) closest to your location.
136+
2. Navigate to the download page for your operating system.
137+
3. Download and install R (e.g., R-4.4.0).
138+
139+
Once R is installed, you need to install the required R packages.
164140

165-
Once R is installed, open a terminal and start R:
141+
Open a terminal and start R:
166142

167143
```bash
168144
R
169145
```
170146

171-
Within an active R environment, run the following commands to install the required R packages:
147+
Within the R environment, run the following commands to install the required packages:
172148

173149
```R
174150
install.packages("tidyverse")
175-
176151
install.packages("BiocManager")
177-
178152
BiocManager::install("STRINGdb")
179153
BiocManager::install("PANTHER.db")
180154
BiocManager::install("rtracklayer")
@@ -183,42 +157,12 @@ BiocManager::install("biomaRt")
183157
BiocManager::install("GO.db")
184158
```
185159

186-
<br>
187-
188-
---
160+
##### Step 2: Run the Workflow
189161

190-
#### 2. Download the Workflow Files
191-
192-
All files required for utilizing the GL_RefAnnotTable-A workflow for generating reference annotation tables are in the [workflow_code](workflow_code) directory. To get a copy of latest GL_RefAnnotTable version on to your system, run the following command:
162+
While in the directory containing the `GL_RefAnnotTable-A_1.1.0` folder, you can now run the workflow. Below is an example of how to run the workflow to build an annotation table for *Mus musculus* (mouse):
193163

194164
```bash
195-
curl -LO https://github.com/nasa/GeneLab_Data_Processing/releases/download/GL_RefAnnotTable-A_1.1.0/GL_RefAnnotTable-A_1.1.0.zip
196-
```
197-
198-
<br>
199-
200-
---
201-
202-
#### 3. Set Execution Permissions for Workflow Scripts
203-
204-
Once you've downloaded the GL_RefAnnotTable-A workflow directory as a zip file, unzip the workflow then `cd` into the GL_RefAnnotTable-A_1.1.0 directory on the CLI. Next, run the following command to set the execution permissions for the R script:
205-
206-
```bash
207-
unzip GL_RefAnnotTable-A_1.1.0.zip
208-
cd GL_RefAnnotTable-A_1.1.0
209-
chmod -R u+x *R
210-
```
211-
212-
<br>
213-
214-
---
215-
216-
#### 4. Run the Workflow
217-
218-
While in the GL_RefAnnotTable workflow directory, you are now able to run the workflow. Below is an example of how to run the workflow to build an annotation table for Mus musculus (mouse):
219-
220-
```bash
221-
Rscript GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus musculus'
165+
Rscript GL_RefAnnotTable-A_1.1.0/GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus musculus'
222166
```
223167

224168
**Input data:**
@@ -231,16 +175,12 @@ Rscript GL-DPPD-7110-A_build-genome-annots-tab.R 'Mus musculus'
231175
- *-GL-annotations.tsv (Tab-delimited table of gene annotations)
232176
- *-GL-build-info.txt (Text file containing information used to create the annotation table, including tool and tool versions and date of creation)
233177

234-
<br>
235-
236-
---
237-
238-
#### 5. Run the Annotations Database Creation Function as a Stand-Alone Script
178+
##### Step 3: Run the Annotations Database Creation Function as a Stand-Alone Script
239179

240180
If the reference table does not specify an annotations database for the target organism in the 'annotations' column, the `install_annotations` function (defined in `install-org-db.R`) will be executed. This function can also be run as a stand-alone script:
241181

242182
```bash
243-
Rscript install-org-db.R 'Bacillus subtilis'
183+
Rscript GL_RefAnnotTable-A_1.1.0/install-org-db.R 'Bacillus subtilis'
244184
```
245185

246186
**Input data:**
@@ -252,4 +192,4 @@ Rscript install-org-db.R 'Bacillus subtilis'
252192

253193
- org.*.eg.db/ (species-specific annotation database, as a local R package)
254194

255-
<br>
195+
---
Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
44
# Addresses issue: https://github.com/nextflow-io/nextflow/issues/1210
55

66
CONFILE=${1:-nextflow.config}
7-
OUTDIR=${2:-./apptainer}
7+
OUTDIR=${2:-./singularity}
88

99
if [ ! -e $CONFILE ]; then
1010
echo "$CONFILE does not exist"
@@ -26,7 +26,7 @@ while IFS= read -r line; do
2626
name=${name/:/-}
2727
name=${name//\//-}
2828
echo $name
29-
apptainer pull ${name}.img docker://$line
29+
singularity pull ${name}.img docker://$line
3030
done < $TMPFILE
3131

3232
cd $CURDIR
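
The two `name=` substitutions in the loop above turn a container image reference into a file name by replacing the first `:` and every `/` with `-`. A minimal sketch of that mapping, written with `sed` so it also runs in a plain POSIX shell, is given below. The function name `image_to_stem` is hypothetical, and the image reference in the usage line is an assumption inferred from the file name used in the README, not copied from the config file.

```bash
# Hypothetical helper (not part of the workflow): derive the file stem the
# same way the pre-pull loop does.
image_to_stem() {
    # First ':' becomes '-', then every '/' becomes '-'.
    printf '%s\n' "$1" | sed -e 's/:/-/' -e 's#/#-#g'
}

# Assumed image reference, inferred from the file name in the README.
image_to_stem 'quay.io/nasa_genelab/gl-refannottable-a:1.1.0'
# prints quay.io-nasa_genelab-gl-refannottable-a-1.1.0
```

The script then appends the extension to this stem, so the extension it uses needs to match the one given to `singularity exec` in the README.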
