Use relative paths for inputs to make the training portable #438

Closed
Changes from 6 commits
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,4 @@ work
nf-training/results
transcript-index
.vscode
results_genomics
4 changes: 3 additions & 1 deletion docs/hello_nextflow/03_hello_containers.md
Original file line number Diff line number Diff line change
Expand Up @@ -115,9 +115,11 @@ One way to do this is to **mount** a **volume** from the host system into the co
Prior to working on the next task, confirm that you are in the `hello-nextflow` directory.

```bash
cd /workspace/gitpod/hello-nextflow
pwd
```

This should show `/workspaces/training/hello-nextflow`. The important point is the `hello-nextflow` is the final path.
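
The same check can be scripted. This is a minimal sketch, assuming only that the directory is named `hello-nextflow`; the function name `in_training_dir` is made up for illustration and is not part of the training material:

```bash
# Hypothetical helper: succeeds only when the last component of the
# current working directory is "hello-nextflow", whatever the parent path is.
in_training_dir() {
    [ "$(basename "$(pwd)")" = "hello-nextflow" ]
}
```

For example, `in_training_dir || echo "please cd into hello-nextflow"` prints the reminder only when run from some other directory.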

Review comment (Collaborator):

Suggested change
This should show `/workspaces/training/hello-nextflow`. The important point is the `hello-nextflow` is the final path.
This should show `/workspaces/hello-nextflow`. The important point is the `hello-nextflow` is the final path.


Then run:

```bash
Expand Down
24 changes: 13 additions & 11 deletions docs/hello_nextflow/04_hello_genomics.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ The tools we need (Samtools and GATK) are not installed in the Gitpod environmen
!!! note

Make sure you're in the correct working directory:
`cd /workspace/gitpod/hello-nextflow`
`pwd` should return a path ending in `hello-nextflow`

### 0.1. Index a BAM input file with Samtools

Expand Down Expand Up @@ -561,9 +561,9 @@ This error will not reproduce consistently because it is dependent on some varia
This is what the output of the two `.view` calls we added looks like for a failed run:

```console title="Output"
/workspace/gitpod/hello-nextflow/data/bam/reads_mother.bam
/workspace/gitpod/hello-nextflow/data/bam/reads_father.bam
/workspace/gitpod/hello-nextflow/data/bam/reads_son.bam
./data/bam/reads_mother.bam
./data/bam/reads_father.bam
./data/bam/reads_son.bam
/workspace/gitpod/hello-nextflow/work/9c/53492e3518447b75363e1cd951be4b/reads_father.bam.bai
/workspace/gitpod/hello-nextflow/work/cc/37894fffdf6cc84c3b0b47f9b536b7/reads_son.bam.bai
/workspace/gitpod/hello-nextflow/work/4d/dff681a3d137ba7d9866e3d9307bd0/reads_mother.bam.bai
Expand Down Expand Up @@ -717,9 +717,9 @@ Here we are going to show you how to do the simple case.
We already made a text file listing the input file paths, called `sample_bams.txt`, which you can find in the `data/` directory.

```txt title="sample_bams.txt"
/workspace/gitpod/hello-nextflow/data/bam/reads_mother.bam
/workspace/gitpod/hello-nextflow/data/bam/reads_father.bam
/workspace/gitpod/hello-nextflow/data/bam/reads_son.bam
/data/bam/reads_mother.bam
/data/bam/reads_father.bam
/data/bam/reads_son.bam
Comment on lines +721 to +723 (Collaborator)

Suggested change
/data/bam/reads_mother.bam
/data/bam/reads_father.bam
/data/bam/reads_son.bam
./data/bam/reads_mother.bam
./data/bam/reads_father.bam
./data/bam/reads_son.bam

```

As you can see, we listed one file path per line, and they are absolute paths.
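
A list like this is only usable if every entry can be found from wherever the workflow is launched. The following is a minimal sketch of such a check, assuming a plain text file with one path per line; `missing_inputs` is a hypothetical helper name, not something provided by the training repository:

```bash
# Hypothetical helper: print every entry of a path list (one path per line)
# that does not exist when resolved from the current directory.
# Prints nothing when all listed files are found.
missing_inputs() (
    while IFS= read -r entry || [ -n "$entry" ]; do
        if [ -n "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done < "$1"
)
```

Calling `missing_inputs data/sample_bams.txt` from the directory where the workflow will be launched should print nothing if every listed file is reachable from there.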
Expand Down Expand Up @@ -757,7 +757,7 @@ This way we can continue to be lazy, but the list of files no longer lives in th
Currently, our input channel factory treats any files we give it as the data inputs we want to feed to the indexing process.
Since we're now giving it a file that lists input file paths, we need to change its behavior to parse the file and treat the file paths it contains as the data inputs.

Fortunately we can do that very simply, just by adding the [`.splitText()` operator](https://www.nextflow.io/docs/latest/reference/operator.html#operator-splittext) to the channel construction step.
We are going to use the [`.splitCsv()`](https://www.nextflow.io/docs/latest/operator.html#operator-splitcsv) operator to parse the file into lines, and then use `.map()` to convert each line into a file path object. This introduces some advanced concepts that we'll explain in more detail later in this training series, but for now it's enough to understand that we can manipulate the contents of the samplesheet after we read it in but before we use it.
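
As a rough analogy in shell rather than Nextflow code, the parse-then-convert step amounts to reading the list line by line, taking the first comma-separated field, and turning it into a path. The sketch below assumes that relative entries are resolved against the directory the command is launched from; `resolve_samplesheet` is a hypothetical name and is not part of Nextflow:

```bash
# Hypothetical helper: for each line of a one-column list, take the first
# comma-separated field and print it as an absolute path.
# Assumption: relative entries are resolved against the current directory.
resolve_samplesheet() (
    while IFS=, read -r first_col rest || [ -n "$first_col" ]; do
        # Skip empty lines
        if [ -z "$first_col" ]; then
            continue
        fi
        case "$first_col" in
            /*) printf '%s\n' "$first_col" ;;                     # already absolute
            *)  printf '%s/%s\n' "$PWD" "${first_col#./}" ;;      # relative: prefix with cwd
        esac
    done < "$1"
)
```

Under that assumption, an entry such as `data/bam/reads_mother.bam` gives a different absolute path depending on where the command is run, which is why the working directory matters once the list stops using absolute paths.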

_Before:_

Expand All @@ -768,9 +768,11 @@ reads_ch = Channel.fromPath(params.reads_bam)

_After:_

```groovy title="hello-genomics.nf" linenums="68"
````groovy title="hello-genomics.nf" linenums="68"
// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }
```

!!! tip
Expand All @@ -783,7 +785,7 @@ Let's run the workflow one more time.

```bash
nextflow run hello-genomics.nf -resume
```
````

This should produce the same result as before, right?

Expand Down
6 changes: 3 additions & 3 deletions hello-nextflow/data/sample_bams.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
/workspace/gitpod/hello-nextflow/data/bam/reads_mother.bam
/workspace/gitpod/hello-nextflow/data/bam/reads_father.bam
/workspace/gitpod/hello-nextflow/data/bam/reads_son.bam
data/bam/reads_mother.bam
data/bam/reads_father.bam
data/bam/reads_son.bam
4 changes: 3 additions & 1 deletion hello-nextflow/hello-config/main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,9 @@ process GATK_JOINTGENOTYPING {
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
4 changes: 3 additions & 1 deletion hello-nextflow/hello-modules/main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,9 @@ process GATK_JOINTGENOTYPING {
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
4 changes: 3 additions & 1 deletion hello-nextflow/hello-nf-test/main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,9 @@ include { GATK_JOINTGENOTYPING } from './modules/local/gatk/jointgenotyping/main
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
4 changes: 3 additions & 1 deletion hello-nextflow/hello-operators.nf
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,9 @@ process GATK_HAPLOTYPECALLER {
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
6 changes: 4 additions & 2 deletions hello-nextflow/solutions/hello-config/final-main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ process SAMTOOLS_INDEX {

output:
tuple path(input_bam), path("${input_bam}.bai")

script:
"""
samtools index '$input_bam'
Expand Down Expand Up @@ -96,7 +96,9 @@ process GATK_JOINTGENOTYPING {
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
6 changes: 4 additions & 2 deletions hello-nextflow/solutions/hello-genomics/hello-genomics-4.nf
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ process GATK_HAPLOTYPECALLER {

output:
path "${input_bam}.vcf" , emit: vcf
path "${input_bam}.vcf.idx" , emit: idx
path "${input_bam}.vcf.idx" , emit: idx

script:
"""
Expand All @@ -67,7 +67,9 @@ process GATK_HAPLOTYPECALLER {
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
4 changes: 3 additions & 1 deletion hello-nextflow/solutions/hello-modules/final-main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,9 @@ include { GATK_JOINTGENOTYPING } from './modules/local/gatk/jointgenotyping/main
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
6 changes: 4 additions & 2 deletions hello-nextflow/solutions/hello-operators/hello-operators-1.nf
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ process GATK_HAPLOTYPECALLER {

output:
path "${input_bam}.g.vcf" , emit: vcf
path "${input_bam}.g.vcf.idx" , emit: idx
path "${input_bam}.g.vcf.idx" , emit: idx

script:
"""
Expand All @@ -68,7 +68,9 @@ process GATK_HAPLOTYPECALLER {
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
6 changes: 4 additions & 2 deletions hello-nextflow/solutions/hello-operators/hello-operators-2.nf
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ process GATK_HAPLOTYPECALLER {

output:
path "${input_bam}.g.vcf" , emit: vcf
path "${input_bam}.g.vcf.idx" , emit: idx
path "${input_bam}.g.vcf.idx" , emit: idx

script:
"""
Expand Down Expand Up @@ -99,7 +99,9 @@ process GATK_GENOMICSDB {
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down
6 changes: 4 additions & 2 deletions hello-nextflow/solutions/hello-operators/hello-operators-3.nf
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ process GATK_HAPLOTYPECALLER {

output:
path "${input_bam}.g.vcf" , emit: vcf
path "${input_bam}.g.vcf.idx" , emit: idx
path "${input_bam}.g.vcf.idx" , emit: idx

script:
"""
Expand Down Expand Up @@ -109,7 +109,9 @@ process GATK_JOINTGENOTYPING {
workflow {

// Create input channel from a text file listing input file paths
reads_ch = Channel.fromPath(params.reads_bam).splitText()
reads_ch = Channel.fromPath(params.reads_bam)
.splitCsv()
.map { bamPath -> file(bamPath[0]) }

// Load the file paths for the accessory files (reference and intervals)
ref_file = file(params.reference)
Expand Down