Commit 9a4f79d

Migration guide for workflow outputs (#6162)

Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Co-authored-by: Chris Hakkaart <chris.hakkaart@seqera.io>

1 parent 21a6470 commit 9a4f79d

8 files changed: +381 -48 lines changed

docs/index.md

Lines changed: 4 additions & 3 deletions
@@ -161,9 +161,10 @@ developer/plugins
 :caption: Tutorials
 :maxdepth: 1
 
-data-lineage
-metrics
-flux
+tutorials/data-lineage
+tutorials/workflow-outputs
+tutorials/metrics
+tutorials/flux
 ```
 
 ```{toctree}

docs/migrations/24-04.md

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
+(migrating-24-04-page)=
 
 # Migrating to 24.04
 

docs/migrations/25-04.md

Lines changed: 2 additions & 2 deletions
@@ -32,7 +32,7 @@ The third preview of workflow outputs introduces the following breaking changes
 
 - The `mapper` index directive has been removed. Use a `map` operator in the workflow body instead.
 
-See {ref}`workflow-output-def` to learn more about the workflow output definition.
+See {ref}`migrating-workflow-outputs` to get started.
 
 <h3>Topic channels (out of preview)</h3>
 
@@ -44,7 +44,7 @@ This release introduces built-in provenance tracking, also known as *data lineag
 
 You can explore this lineage from the command line using the {ref}`cli-lineage` command. Additionally, you can refer to files in the lineage store from a Nextflow script using the `lid://` path prefix as well as the {ref}`channel-from-lineage` channel factory.
 
-See the {ref}`data-lineage-page` guide to get started.
+See {ref}`data-lineage-page` to get started.
 
 ## Enhancements
 

docs/data-lineage.md renamed to docs/tutorials/data-lineage.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 (data-lineage-page)=
 
-# Data lineage
+# Getting started with data lineage
 
 Data lineage in Nextflow provides comprehensive tracking of workflow runs, task executions, and output files. This feature helps you verify the integrity and reproducibility of your pipeline results by maintaining a complete history of computations and intermediate data.
 

docs/flux.md renamed to docs/tutorials/flux.md

Lines changed: 8 additions & 34 deletions
@@ -5,9 +5,9 @@
 :::{versionadded} 22.11.0-edge
 :::
 
-The [Flux Framework](https://flux-framework.org/) is a modern resource manager that can span the space between cloud and HPC. If your center does not provide Flux for you, you can [build Flux on your own](https://flux-framework.readthedocs.io/en/latest/quickstart.html#building-the-code) and launch it as a job with your resource manager of choice (e.g. SLURM or a cloud provider).
+## Overview
 
-## Tutorial
+The [Flux Framework](https://flux-framework.org/) is a modern resource manager that can span the space between cloud and HPC. If your center does not provide Flux, you can [build Flux yourself](https://flux-framework.readthedocs.io/en/latest/quickstart.html#building-the-code) and launch it as a job using your resource manager of choice (e.g. SLURM or a cloud provider).
 
 In the [`docker/flux`](https://github.com/nextflow-io/nextflow/tree/master/docker/flux) directory we provide a [Dockerfile for interacting with Flux](https://github.com/nextflow-io/nextflow/tree/master/docker/flux/.devcontainer/Dockerfile) along with a [VSCode Developer Container](https://code.visualstudio.com/docs/devcontainers/containers) environment that you can put at the root of the project to be provided with a Flux agent and the dependencies needed to build Nextflow. There are two ways to use this:
 
@@ -16,7 +16,7 @@ In the [`docker/flux`](https://github.com/nextflow-io/nextflow/tree/master/docke
 
 Both strategies are described below. For this tutorial, you will generally want to prepare a pipeline to use the `flux` executor, create an environment with Flux, start a Flux instance, and interact with it.
 
-### Prepare your pipeline
+## Prepare your pipeline
 
 To run your pipeline with Flux, you'll want to specify it in your config. Here is an example `nextflow.config`:
 
@@ -53,7 +53,7 @@ process haveMeal {
 }
 ```
 
-### Container Environment
+## Prepare your environment
 
 You can either build the Docker image from the root of the Nextflow repository:
 
@@ -81,19 +81,15 @@ $ code .
 
 Then you should be able to open a terminal (**Terminal** -> **New Terminal**) to interact with the command line. Try running `make` again! Whichever of these two approaches you take, you should be in a container environment with the `flux` command available.
 
-### Start a Flux Instance
+## Start a Flux instance
 
 Once in your container, you can start an interactive Flux instance (from which you can submit jobs on the command line to test with Nextflow) as follows:
 
 ```console
 $ flux start --test-size=4
 ```
 
-#### Getting Familiar with Flux
-
-:::{note}
-This step is optional!
-:::
+### Getting familiar with Flux
 
 Here is an example of submitting a job and getting the log for it.
 
@@ -125,7 +121,7 @@ $ flux jobs
 ƒ4tkMUAAT root sleep R 1 1 2.546s ab6634a491bb
 ```
 
-### Submitting with Nextflow
+## Submitting with Nextflow
 
 Prepare your `nextflow.config` and `demo.nf` in the same directory.
 
@@ -134,27 +130,7 @@ $ ls .
 demo.nf nextflow.config
 ```
 
-If you've installed Nextflow already, you are good to go! If you are working with development code and need to build Nextflow:
-
-```console
-$ make assemble
-```
-
-Make sure `nextflow` is on your PATH (here we are in the root of the Nextflow repository):
-
-```console
-$ export PATH=$PWD:$PATH
-$ which nextflow
-/workspaces/nextflow/nextflow
-```
-
-Then change to the directory with your config and demo file:
-
-```console
-$ cd docker/flux
-```
-
-And then run the pipeline with Flux!
+Finally, run the pipeline with Flux:
 
 ```console
 $ nextflow -c nextflow.config run demo.nf
@@ -169,5 +145,3 @@ executor > flux (5)
 🥑️ for breakfast!
 🥧️ for breakfast!
 ```
-
-And that's it! You've just run a pipeline using nextflow and Flux.

docs/metrics.md renamed to docs/tutorials/metrics.md

Lines changed: 7 additions & 7 deletions
@@ -8,7 +8,7 @@ This tutorial explains how resource usage metrics are computed from execution re
 
 CPU Usage plots report how CPU resources are used by each process.
 
-```{image} _static/report-resource-cpu-noheader.png
+```{image} ../_static/report-resource-cpu-noheader.png
 ```
 
 **Raw Usage** tabs are expected to show 100% core usage if processes perform one task of pure computation. If tasks are distributed over 2, 3, or 4 CPUs, the raw usage will be 200%, 300%, or 400%, respectively. **% Allocated** tabs rescale raw usage values relative to the number of CPUs that are set with the `cpus` directive. If the `cpus` directive is not set, CPUs are set to `1` and **% Allocated** tabs will show the same values as **Raw Usage** tabs.
@@ -253,17 +253,17 @@ workflow{
 
 The **Virtual (RAM + Disk swap)** tab shows that both `malloc` and `malloc_fill` use the same amount of virtual memory (~1 GiB):
 
-```{image} _static/report-resource-memory-vmem.png
+```{image} ../_static/report-resource-memory-vmem.png
 ```
 
 However, the **Physical (RAM)** tab shows that `malloc_fill` uses ~1 GiB of RAM while `malloc` uses ~0 GiB of RAM:
 
-```{image} _static/report-resource-memory-ram.png
+```{image} ../_static/report-resource-memory-ram.png
 ```
 
 The **% RAM Allocated** tab shows that `malloc` and `malloc_fill` used 0% and 67% of resources set in the `memory` directive, respectively:
 
-```{image} _static/report-resource-memory-pctram.png
+```{image} ../_static/report-resource-memory-pctram.png
 ```
 
 :::{warning}
@@ -274,7 +274,7 @@ Memory and storage metrics are reported in bytes. For example, 1 KB = $1024$ byt
 
 **Job Duration** plots report how long each process took to run. Each plot has two tabs. The **Raw Usage** tab shows the job duration and the **% Allocated** tab shows the time that was used relative to what was requested using the `time` directive. Job duration is sometimes known as elapsed real time, real time, or wall time.
 
-```{image} _static/report-resource-job-duration.png
+```{image} ../_static/report-resource-job-duration.png
 ```
 
 ## I/O Usage
@@ -306,10 +306,10 @@ workflow{
 
 The **Read** tab shows that ~1 GiB and ~256 MB are read:
 
-```{image} _static/report-resource-io-read.png
+```{image} ../_static/report-resource-io-read.png
 ```
 
 The **Write** tab shows that ~1 GiB and ~256 MB are written:
 
-```{image} _static/report-resource-io-write.png
+```{image} ../_static/report-resource-io-write.png
 ```
