From 6627bbc491f7133eb37bb9a2cb5a4e85ee2297c9 Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Mon, 2 Dec 2024 11:52:09 +0100 Subject: [PATCH 01/22] Update examples and language for custom scripts Signed-off-by: Christopher Hakkaart --- docs/module.md | 29 ++++++++++---------- docs/process.md | 71 ++++++++++++++++++++++++++----------------------- 2 files changed, 51 insertions(+), 49 deletions(-) diff --git a/docs/module.md b/docs/module.md index c9dae998ff..f1eb3cf66c 100644 --- a/docs/module.md +++ b/docs/module.md @@ -186,7 +186,7 @@ Ciao world! Process script {ref}`templates ` can be included alongside a module in the `templates` directory. -For example, suppose we have a project L with a module that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory: +For example, Project L contains a module (`myModules.nf`) that defines two processes, P1 and P2. Both processes use templates that are available in the local `templates` directory: ``` Project L @@ -196,29 +196,29 @@ Project L └── P2-template.sh ``` -Then, we have a second project A with a workflow that includes P1 and P2: +Projects A contains a workflow that includes processes P1 and P2: ``` -Pipeline A +Project A └── main.nf ``` -Finally, we have a third project B with a workflow that also includes P1 and P2: +Pipeline B contains a workflow that also includes process P1 and P2: ``` -Pipeline B +Project B └── main.nf ``` -With the possibility to keep the template files inside the project L, A and B can use the modules defined in L without any changes. A future project C would do the same, just cloning L (if not available on the system) and including its module. +As the template files are stored with the modules inside the Project L, Projects A and B can include them without any changing any code. Future projects would also be able to include these modules by cloning Project L and including its module (if they were not available on the system). -Beside promoting the sharing of modules across pipelines, there are several advantages to keeping the module template under the script path: +Keeping the module template within the script path has several advantages beyond facilitating module sharing across pipelines: 1. Modules are self-contained 2. Modules can be tested independently from the pipeline(s) that import them 3. Modules can be made into libraries -Having multiple template locations enables a structured project organization. If a project has several modules, and they all use templates, the project could group module scripts and their templates as needed. For example: +Organizing templates locations allows for a well-structured project. In projects with multiple modules that rely on templates, you can organize module scripts and their corresponding templates into logical groups. For example: ``` baseDir @@ -240,10 +240,11 @@ baseDir |── mymodules6.nf └── templates |── P5-template.sh - |── P6-template.sh - └── P7-template.sh + └── P6-template.sh ``` +See {ref}`process-template` for more information about how to externalize process scripts to template files. + (module-binaries)= ## Module binaries @@ -253,13 +254,13 @@ baseDir Modules can define binary scripts that are locally scoped to the processes defined by the tasks. 
-To enable this feature, set the following flag in your pipeline script or configuration file: +To use this feature, the module binaries must be enabled in your pipeline script or configuration file: ```nextflow nextflow.enable.moduleBinaries = true ``` -The binary scripts must be placed in the module directory names `/resources/usr/bin`: +Binary scripts must be placed in the module directory named `/resources/usr/bin` and granted execution permissions: ``` @@ -271,10 +272,8 @@ The binary scripts must be placed in the module directory names `/re └── another-module-script2.py ``` -Those scripts will be made accessible like any other command in the task environment, provided they have been granted the Linux execute permissions. - :::{note} -This feature requires the use of a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors. +Module binary scripts require a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors. ::: ## Sharing modules diff --git a/docs/process.md b/docs/process.md index 871ecea093..2743a357cb 100644 --- a/docs/process.md +++ b/docs/process.md @@ -24,11 +24,11 @@ See {ref}`syntax-process` for a full description of the process syntax. ## Script -The `script` block defines, as a string expression, the script that is executed by the process. +The `script` block defines the string expression that is executed by the process. -A process may contain only one script, and if the `script` guard is not explicitly declared, the script must be the final statement in the process block. +The process can contain only one script block. If the `script` guard is not explicitly declared it must be the final statement in the process block. -The script string is executed as a [Bash]() script in the host environment. It can be any command or script that you would normally execute on the command line or in a Bash script. Naturally, the script may only use commands that are available in the host environment. +The script string is executed as a [Bash]() script in the host environment. It can be any command or script that you would execute on the command line or in a Bash script and can only use commands that are available in the host environment. The script block can be a simple string or a multi-line string. The latter approach makes it easier to write scripts with multiple commands spanning multiple lines. For example: @@ -42,19 +42,17 @@ process doMoreThings { } ``` -As explained in the script tutorial section, strings can be defined using single-quotes or double-quotes, and multi-line strings are defined by three single-quote or three double-quote characters. +Strings can be defined using single-quotes or double-quotes. Multi-line strings are defined by three single-quote or three double-quote characters. -There is a subtle but important difference between them. Like in Bash, strings delimited by a `"` character support variable substitutions, while strings delimited by `'` do not. +There is a subtle but important difference between single-quote (`'`) or three double-quote (`"`) characters. Like in Bash, strings delimited by the `"` character support variable substitutions, while strings delimited by `'` do not. -In the above code fragment, the `$db` variable is replaced by the actual value defined elsewhere in the pipeline script. +For example, in the above code fragment, the `$db` variable is replaced by the actual value defined elsewhere in the pipeline script. 
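For illustration, the sketch below shows one way the `db` value used above could be supplied. The `params.blast_db` name, the `val db` input, and the database path are assumptions added for this example, not part of the original snippet:

```nextflow
// Illustrative sketch: `db` must resolve to a value before the task script is built.
// Here it is passed in as a `val` input, taken from an assumed pipeline parameter.
params.blast_db = "$projectDir/blast-db"   // hypothetical database location

process doMoreThings {
    input:
    val db    // interpolated as $db inside the double-quoted script below

    script:
    """
    blastp -db $db -query query.fa -outfmt 6 > blast_result
    cat blast_result | head -n 10 | cut -f 2 > top_hits
    blastdbcmd -db $db -entry_batch top_hits > sequences
    """
}

workflow {
    doMoreThings(params.blast_db)
}
```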
:::{warning} -Since Nextflow uses the same Bash syntax for variable substitutions in strings, you must manage them carefully depending on whether you want to evaluate a *Nextflow* variable or a *Bash* variable. +Nextflow uses the same Bash syntax for variable substitutions in strings. You must manage them carefully depending on whether you want to evaluate a *Nextflow* variable or a *Bash* variable. ::: -When you need to access a system environment variable in your script, you have two options. - -If you don't need to access any Nextflow variables, you can define your script block with single-quotes: +System environment variables and Nextflow variables can be accessed by your script. If you don't need to access any Nextflow variables, you can define your script block with single-quotes and use the dollar character (`$`) to access system environment variables. For example: ```nextflow process printPath { @@ -64,7 +62,7 @@ process printPath { } ``` -Otherwise, you can define your script with double-quotes and escape the system environment variables by prefixing them with a back-slash `\` character, as shown in the following example: +Otherwise, you can define your script with double-quotes and escape the system environment variables by prefixing them with a back-slash `\` character. For example: ```nextflow process doOtherThings { @@ -76,21 +74,17 @@ process doOtherThings { } ``` -In this example, `$MAX` is a Nextflow variable that must be defined elsewhere in the pipeline script. Nextflow replaces it with the actual value before executing the script. Meanwhile, `$DB` is a Bash variable that must exist in the execution environment, and Bash will replace it with the actual value during execution. - -:::{tip} -Alternatively, you can use the {ref}`process-shell` block definition, which allows a script to contain both Bash and Nextflow variables without having to escape the first. -::: +In this example, `$MAX` is a Nextflow variable that is defined elsewhere in the pipeline script. Nextflow replaces it with the actual value before executing the script. In contrast, `$DB` is a Bash variable that must exist in the execution environment. Bash will replace it with the actual value during execution. ### Scripts *à la carte* -The process script is interpreted by Nextflow as a Bash script by default, but you are not limited to Bash. +The process script is interpreted as Bash by default. -You can use your favourite scripting language (Perl, Python, R, etc), or even mix them in the same pipeline. +However, you can use your favorite scripting language (Perl, Python, R, etc) for each process. You can also mix languages in the same pipeline. -A pipeline may be composed of processes that execute very different tasks. With Nextflow, you can choose the scripting language that best fits the task performed by a given process. For example, for some processes R might be more useful than Perl, whereas for others you may need to use Python because it provides better access to a library or an API, etc. +A pipeline may be composed of processes that execute very different tasks. You can choose the scripting language that best fits the task performed by a given process. For example, R might be more useful than Perl for some processes, whereas for others you may need to use Python because it provides better access to a library or an API. -To use a language other than Bash, simply start your process script with the corresponding [shebang](). 
For example: +To use a language other than Bash, start your process script with the corresponding [shebang](). For example: ```nextflow process perlTask { @@ -118,12 +112,17 @@ workflow { ``` :::{tip} -Since the actual location of the interpreter binary file can differ across platforms, it is wise to use the `env` command followed by the interpreter name, e.g. `#!/usr/bin/env perl`, instead of the absolute path, in order to make your script more portable. +As the location of the interpreter binary file can differ across platforms. Use the `env` command followed by the interpreter name to make your script more portable. For example: + +```nextflow +#!/usr/bin/env perl +``` + ::: ### Conditional scripts -The `script` block is like a function that returns a string. This means that you can write arbitrary code to determine the script, as long as the final statement is a string. +The `script` block is like a function that returns a string. You can write arbitrary code to determine the script as long as the final statement is a string. If-else statements based on task inputs can be used to produce a different script. For example: @@ -155,15 +154,13 @@ process align { } ``` -In the above example, the process will execute one of several scripts depending on the value of the `mode` parameter. By default it will execute the `tcoffee` command. +In the above example, the process will execute one of several scripts depending on the value of the `mode` parameter. By default, the process will execute the `tcoffee` command. (process-template)= ### Template -Process scripts can be externalized to **template** files, which allows them to be reused across different processes and tested independently from the pipeline execution. - -A template can be used in place of an embedded script using the `template` function in the script section: +Process scripts can be externalized to **template** files and accessed using the `template` function in the script section. For example: ```nextflow process templateExample { @@ -179,9 +176,9 @@ workflow { } ``` -By default, Nextflow looks for the template script in the `templates` directory located alongside the Nextflow script in which the process is defined. An absolute path can be used to specify a different location. However, this practice is discouraged because it hinders pipeline portability. +By default, Nextflow looks for template scripts in the `templates` directory, located alongside the Nextflow script that defines the process. A template can be reused across multiple processes. An absolute path can be used to specify a different template location. However, this practice is discouraged because it hinders pipeline portability. -An example template script is provided below: +Templates can be tested independently of pipeline execution. Consider the following template script: ```bash #!/bin/bash @@ -190,22 +187,28 @@ echo $STR echo "process completed" ``` -Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly. For example, the above script can be executed from the command line by providing each input as an environment variable: +The above script can be executed from the command line by providing each input as an environment variable. 
```bash STR='foo' bash templates/my_script.sh ``` -The following caveats should be considered: +Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly. + +The following caveats should be considered when using templates: + +- Template scripts are only recommended for Bash scripts. + +- Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line. -- Template scripts are recommended only for Bash scripts. Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script. +- Template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. -- Variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow, but will not be interpreted as variables when executed from the command line. This practice should be avoided to ensure that the template script behaves consistently. +- Template variables are evaluated even if they are commented out in the template script. -- Template variables are evaluated even if they are commented out in the template script. If a template variable is missing, it will cause the pipeline to fail regardless of where it occurs in the template. +- The pipeline to fail if a template variable is missing, regardless of where it occurs in the template. :::{tip} -Template scripts are generally discouraged due to the caveats described above. The best practice for using a custom script is to embed it in the process definition at first and move it to a separate file with its own command line interface once the code matures. +The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. ::: (process-shell)= From aac6dfe2568148b00bce43425d44ffd5cba206d9 Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Mon, 2 Dec 2024 13:16:41 +0100 Subject: [PATCH 02/22] Update text after modifying bin documentation Signed-off-by: Christopher Hakkaart --- docs/module.md | 4 +++- docs/process.md | 12 +++--------- docs/sharing.md | 26 +++++++++++++++----------- 3 files changed, 21 insertions(+), 21 deletions(-) diff --git a/docs/module.md b/docs/module.md index f1eb3cf66c..6f49bf3817 100644 --- a/docs/module.md +++ b/docs/module.md @@ -260,7 +260,7 @@ To use this feature, the module binaries must be enabled in your pipeline script nextflow.enable.moduleBinaries = true ``` -Binary scripts must be placed in the module directory named `/resources/usr/bin` and granted execution permissions: +Binary scripts must be placed in the module directory named `/resources/usr/bin` and granted execution permissions. For example: ``` @@ -276,6 +276,8 @@ Binary scripts must be placed in the module directory named `/resour Module binary scripts require a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors. ::: +Scripts can also be stored at the pipeline level using the `bin` directory. See {ref}`bin directory ` for more information. + ## Sharing modules Modules are designed to be easy to share and re-use across different pipelines, which helps eliminate duplicate work and spread improvements throughout the community. 
While Nextflow does not provide an explicit mechanism for sharing modules, there are several ways to do it: diff --git a/docs/process.md b/docs/process.md index 2743a357cb..c4d503225c 100644 --- a/docs/process.md +++ b/docs/process.md @@ -178,7 +178,7 @@ workflow { By default, Nextflow looks for template scripts in the `templates` directory, located alongside the Nextflow script that defines the process. A template can be reused across multiple processes. An absolute path can be used to specify a different template location. However, this practice is discouraged because it hinders pipeline portability. -Templates can be tested independently of pipeline execution. Consider the following template script: +Templates can be tested independently of pipeline execution. However, variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly. Consider the following template script: ```bash #!/bin/bash @@ -187,24 +187,18 @@ echo $STR echo "process completed" ``` -The above script can be executed from the command line by providing each input as an environment variable. +The above script can be executed from the command line by providing each input as an environment variable: ```bash STR='foo' bash templates/my_script.sh ``` -Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly. - -The following caveats should be considered when using templates: +Several caveats should be considered when using templates: - Template scripts are only recommended for Bash scripts. - - Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line. - - Template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. - - Template variables are evaluated even if they are commented out in the template script. - - The pipeline to fail if a template variable is missing, regardless of where it occurs in the template. :::{tip} diff --git a/docs/sharing.md b/docs/sharing.md index 61a840e9f8..b30cf949b5 100644 --- a/docs/sharing.md +++ b/docs/sharing.md @@ -97,23 +97,27 @@ For maximal reproducibility, make sure to define a specific version for each too #### The `bin` directory -As for custom scripts, you can include executable scripts in the `bin` directory of your pipeline repository. When configured correctly, these scripts can be executed like a regular command from any process script (i.e. without modifying the `PATH` environment variable or using an absolute path), and changing the script will cause the task to be re-executed on a resumed run (i.e. just like changing the process script itself). +Executable scripts can be included in the pipeline `bin` directory located at the root of your pipeline directory. This allows you to create and organize custom scripts that can be invoked like regular commands from any process in your pipeline without modifying the `PATH` environment variable or using an absolute path. For example: -To configure a custom script: +``` +├── bin +│ └── custom_script.py +└── main.nf +``` -1. Save the script in the `bin` directory (relative to the pipeline repository root). -2. Specify a portable shebang (see note below for details). -3. Make the script executable. 
For example: `chmod a+x bin/my_script.py` +Each script should include a shebang line to specify the interpreter for the script. To maximize portability, use `env` to dynamically resolve the interpreter's location instead of hard-coding the interpreter path. -:::{tip} -To maximize the portability of your bundled script, use `env` to dynamically resolve the location of the interpreter instead of hard-coding it in the shebang line. +For example, the shebang definitions `#!/usr/bin/python` and `#!/usr/local/bin/python` hard-code specific paths to the Python interpreter. Use `#!/usr/bin/env python` instead. -For example, shebang definitions `#!/usr/bin/python` and `#!/usr/local/bin/python` both hard-code specific paths to the Python interpreter. Instead, the following approach is more portable: +Scripts placed in the `bin` directory must have executable permissions. Use the `chmod` command to grant the required permissions. For example: -```bash -#!/usr/bin/env python ``` -::: +chmod a+x bin/custom_script.py +``` + +After setting the executable permission, the script can be run directly within your pipeline processes. + +Executable scripts can also be stored as scripts that are locally scoped to the processes defined by the tasks. See {ref}`module-binaries` for more information. #### The `lib` directory From 682218850e6a544abfd2bbdacc35887cc40a2a52 Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Mon, 2 Dec 2024 14:00:35 +0100 Subject: [PATCH 03/22] Mirror notes Signed-off-by: Christopher Hakkaart --- docs/process.md | 10 +++++----- docs/sharing.md | 11 +++++++++-- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/docs/process.md b/docs/process.md index c4d503225c..d568dc361d 100644 --- a/docs/process.md +++ b/docs/process.md @@ -89,7 +89,7 @@ To use a language other than Bash, start your process script with the correspond ```nextflow process perlTask { """ - #!/usr/bin/perl + #!/usr/bin/env perl print 'Hi there!' . '\n'; """ @@ -97,7 +97,7 @@ process perlTask { process pythonTask { """ - #!/usr/bin/python + #!/usr/bin/env python x = 'Hello' y = 'world!' @@ -112,10 +112,10 @@ workflow { ``` :::{tip} -As the location of the interpreter binary file can differ across platforms. Use the `env` command followed by the interpreter name to make your script more portable. For example: +As the location of the interpreter binary file can differ across platforms. Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. For example: -```nextflow -#!/usr/bin/env perl +``` +#!/usr/bin/env python ``` ::: diff --git a/docs/sharing.md b/docs/sharing.md index b30cf949b5..ac4ac154e3 100644 --- a/docs/sharing.md +++ b/docs/sharing.md @@ -105,9 +105,16 @@ Executable scripts can be included in the pipeline `bin` directory located at th └── main.nf ``` -Each script should include a shebang line to specify the interpreter for the script. To maximize portability, use `env` to dynamically resolve the interpreter's location instead of hard-coding the interpreter path. +Each script should include a shebang line to specify the interpreter for the script. -For example, the shebang definitions `#!/usr/bin/python` and `#!/usr/local/bin/python` hard-code specific paths to the Python interpreter. Use `#!/usr/bin/env python` instead. +:::{tip} +As the location of the interpreter binary file can differ across platforms. Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. 
For example: + +``` +#!/usr/bin/env python +``` + +::: Scripts placed in the `bin` directory must have executable permissions. Use the `chmod` command to grant the required permissions. For example: From f568c0ef9a9814390dbcdc0809e5a894e193b538 Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Mon, 2 Dec 2024 14:14:02 +0100 Subject: [PATCH 04/22] Proof read edits Signed-off-by: Christopher Hakkaart --- docs/module.md | 14 +++++++------- docs/process.md | 6 +++--- docs/sharing.md | 4 ++-- 3 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/module.md b/docs/module.md index 6f49bf3817..ae6aa03028 100644 --- a/docs/module.md +++ b/docs/module.md @@ -196,14 +196,14 @@ Project L └── P2-template.sh ``` -Projects A contains a workflow that includes processes P1 and P2: +Project A contains a workflow that includes processes P1 and P2: ``` Project A └── main.nf ``` -Pipeline B contains a workflow that also includes process P1 and P2: +Project B contains a workflow that also includes process P1 and P2: ``` Project B @@ -212,11 +212,11 @@ Project B As the template files are stored with the modules inside the Project L, Projects A and B can include them without any changing any code. Future projects would also be able to include these modules by cloning Project L and including its module (if they were not available on the system). -Keeping the module template within the script path has several advantages beyond facilitating module sharing across pipelines: +Beyond facilitating module sharing across pipelines, keeping the module template within the script path has several advantages, including: -1. Modules are self-contained -2. Modules can be tested independently from the pipeline(s) that import them -3. Modules can be made into libraries +- Modules are self-contained. +- Modules can be tested independently from the pipeline(s) that import them. +- Modules can be made into libraries. Organizing templates locations allows for a well-structured project. In projects with multiple modules that rely on templates, you can organize module scripts and their corresponding templates into logical groups. For example: @@ -276,7 +276,7 @@ Binary scripts must be placed in the module directory named `/resour Module binary scripts require a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors. ::: -Scripts can also be stored at the pipeline level using the `bin` directory. See {ref}`bin directory ` for more information. +Scripts can also be stored at the pipeline level using the `bin` directory. See {ref}`bundling-executables` for more information. ## Sharing modules diff --git a/docs/process.md b/docs/process.md index d568dc361d..47c74fa35c 100644 --- a/docs/process.md +++ b/docs/process.md @@ -112,7 +112,7 @@ workflow { ``` :::{tip} -As the location of the interpreter binary file can differ across platforms. Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. For example: +Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. For example: ``` #!/usr/bin/env python @@ -160,7 +160,7 @@ In the above example, the process will execute one of several scripts depending ### Template -Process scripts can be externalized to **template** files and accessed using the `template` function in the script section. For example: +Process scripts can be externalized to **template** files and reused across multiple processes. 
Templates can be accessed using the `template` function in the script section. For example: ```nextflow process templateExample { @@ -176,7 +176,7 @@ workflow { } ``` -By default, Nextflow looks for template scripts in the `templates` directory, located alongside the Nextflow script that defines the process. A template can be reused across multiple processes. An absolute path can be used to specify a different template location. However, this practice is discouraged because it hinders pipeline portability. +By default, Nextflow looks for template scripts in the `templates` directory, located alongside the Nextflow script that defines the process. An absolute path can be used to specify a different template location. However, this practice is discouraged because it hinders pipeline portability. Templates can be tested independently of pipeline execution. However, variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly. Consider the following template script: diff --git a/docs/sharing.md b/docs/sharing.md index ac4ac154e3..2424a07b5c 100644 --- a/docs/sharing.md +++ b/docs/sharing.md @@ -108,7 +108,7 @@ Executable scripts can be included in the pipeline `bin` directory located at th Each script should include a shebang line to specify the interpreter for the script. :::{tip} -As the location of the interpreter binary file can differ across platforms. Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. For example: +Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. For example: ``` #!/usr/bin/env python @@ -116,7 +116,7 @@ As the location of the interpreter binary file can differ across platforms. Use ::: -Scripts placed in the `bin` directory must have executable permissions. Use the `chmod` command to grant the required permissions. For example: +Scripts placed in the `bin` directory must have executable permissions. Use `chmod` to grant the required permissions. For example: ``` chmod a+x bin/custom_script.py From 9a2424ef10772d7531edd1083bd2d3ca66b0019d Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Tue, 3 Dec 2024 15:04:05 +0100 Subject: [PATCH 05/22] New section Signed-off-by: Christopher Hakkaart --- docs/index.md | 1 + docs/process.md | 38 ----------- docs/sharing.md | 37 ---------- docs/structure.md | 170 ++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 171 insertions(+), 75 deletions(-) create mode 100644 docs/structure.md diff --git a/docs/index.md b/docs/index.md index f1585124d4..2cc132b783 100644 --- a/docs/index.md +++ b/docs/index.md @@ -78,6 +78,7 @@ module notifications secrets sharing +structure vscode dsl1 ``` diff --git a/docs/process.md b/docs/process.md index 47c74fa35c..a9f9d36482 100644 --- a/docs/process.md +++ b/docs/process.md @@ -162,48 +162,10 @@ In the above example, the process will execute one of several scripts depending Process scripts can be externalized to **template** files and reused across multiple processes. Templates can be accessed using the `template` function in the script section. For example: -```nextflow -process templateExample { - input: - val STR - - script: - template 'my_script.sh' -} - -workflow { - Channel.of('this', 'that') | templateExample -} -``` - By default, Nextflow looks for template scripts in the `templates` directory, located alongside the Nextflow script that defines the process. 
An absolute path can be used to specify a different template location. However, this practice is discouraged because it hinders pipeline portability. Templates can be tested independently of pipeline execution. However, variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly. Consider the following template script: -```bash -#!/bin/bash -echo "process started at `date`" -echo $STR -echo "process completed" -``` - -The above script can be executed from the command line by providing each input as an environment variable: - -```bash -STR='foo' bash templates/my_script.sh -``` - -Several caveats should be considered when using templates: - -- Template scripts are only recommended for Bash scripts. -- Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line. -- Template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. -- Template variables are evaluated even if they are commented out in the template script. -- The pipeline to fail if a template variable is missing, regardless of where it occurs in the template. - -:::{tip} -The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. -::: (process-shell)= diff --git a/docs/sharing.md b/docs/sharing.md index 2424a07b5c..d1c1755779 100644 --- a/docs/sharing.md +++ b/docs/sharing.md @@ -93,43 +93,6 @@ Read the {ref}`container-page` page to learn more about how to use containers wi For maximal reproducibility, make sure to define a specific version for each tool. Otherwise, your pipeline might use different versions across subsequent runs, which can introduce subtle differences to your results. ::: -(bundling-executables)= - -#### The `bin` directory - -Executable scripts can be included in the pipeline `bin` directory located at the root of your pipeline directory. This allows you to create and organize custom scripts that can be invoked like regular commands from any process in your pipeline without modifying the `PATH` environment variable or using an absolute path. For example: - -``` -├── bin -│ └── custom_script.py -└── main.nf -``` - -Each script should include a shebang line to specify the interpreter for the script. - -:::{tip} -Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. For example: - -``` -#!/usr/bin/env python -``` - -::: - -Scripts placed in the `bin` directory must have executable permissions. Use `chmod` to grant the required permissions. For example: - -``` -chmod a+x bin/custom_script.py -``` - -After setting the executable permission, the script can be run directly within your pipeline processes. - -Executable scripts can also be stored as scripts that are locally scoped to the processes defined by the tasks. See {ref}`module-binaries` for more information. - -#### The `lib` directory - -Any Groovy scripts or JAR files in the `lib` directory will be automatically loaded and made available to your pipeline scripts. The `lib` directory is a useful way to provide utility code or external libraries without cluttering the pipeline scripts. - ### Data In general, input data should be provided by external sources using parameters which can be controlled by the user. 
This way, a pipeline can be easily reused to process different datasets which are appropriate for the pipeline. diff --git a/docs/structure.md b/docs/structure.md new file mode 100644 index 0000000000..da5bca404b --- /dev/null +++ b/docs/structure.md @@ -0,0 +1,170 @@ +(structure-page)= + +# Structure + +## The `templates` directory + +The `templates` directory in the Nextflow project root can be used to store scripts. + +``` +├── templates +│ └── sayhello.py +└── main.nf +``` + +It allows custom scripts to be invoked like regular scripts from any process in your pipeline using the `template` function: + +``` +process sayHello { + + input: + val x + + output: + stdout + + script: + template 'sayhello.py' +} + +workflow { + Channel.of("Foo") | sayHello | view +} +``` + +Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow: + +``` +#!/usr/bin/env python + +print("Hello ${x}!") +``` + +The pipeline will fail if a template variable is missing, regardless of where it occurs in the template. + +Templates can be tested independently of pipeline execution by providing each input as an environment variable. For example: + +```bash +STR='foo' bash templates/my_script.sh +``` + +Template scripts are only recommended for Bash scripts. Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line as variables prefixed with `$` are interpreted as Bash variables. Similarly, template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. + +:::{warning} +Template variables are evaluated even if they are commented out in the template script. +::: + +:::{tip} +The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. +::: + +(bundling-executables)= + +## The `bin` directory + +The `bin` directory in the Nextflow project root can be used to store executable scripts. + +``` +├── bin +│ └── sayhello.py +└── main.nf +``` + +It allows custom scripts to be invoked like regular commands from any process in your pipeline without modifying the `PATH` environment variable or using an absolute path. Each script should include a shebang to specify the interpreter. Inputs should be supplied as arguments. + +```python +#!/usr/bin/env python + +import argparse + +def main(): + parser = argparse.ArgumentParser(description="A simple argparse example.") + parser.add_argument("name", type=str, help="Person to greet.") + + args = parser.parse_args() + print(f"Hello {args.name}!") + +if __name__ == "__main__": + main() +``` + +:::{tip} +Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. +::: + +Scripts placed in the `bin` directory must have executable permissions. Use `chmod` to grant the required permissions. For example: + +``` +chmod a+x bin/sayhello.py +``` + +Like modifying a process script, changing the executable script will cause the task to be re-executed on a resumed run. + +:::{warning} +When using containers and the Wave service, Nextflow will send the project-level `bin` directory to the Wave service for inclusion as a layer in the container. Any changes to scripts in the `bin` directory will change the layer md5sum and the hash for the final container. 
The container identity is a component of the task hash calculation and will force re-calculation of all tasks in the workflow. + +When using the Wave service, use module-specific bin directories instead. See {ref}`module-binaries` for more information. +::: + +## The `lib` directory + +The `lib` directory can be used to add utility code or external libraries without cluttering the pipeline scripts. The `lib` directory in the Nextflow project root is added to the classpath by default. + +``` +├── lib +│ └── DNASequence.groovy +└── main.nf +``` + +Classes or packages defined in the `lib` directory will be available in the execution context. Scripts or functions defined outside of classes will not be available in the execution context. + +For example, `lib/DNASequence.groovy` defines the `DNASequence` class: + +```groovy +// lib/DNASequence.groovy +class DNASequence { + String sequence + + // Constructor + DNASequence(String sequence) { + this.sequence = sequence.toUpperCase() // Ensure sequence is in uppercase for consistency + } + + // Method to calculate melting temperature using the Wallace rule + double getMeltingTemperature() { + int g_count = sequence.count('G') + int c_count = sequence.count('C') + int a_count = sequence.count('A') + int t_count = sequence.count('T') + + // Wallace rule calculation + double tm = 4 * (g_count + c_count) + 2 * (a_count + t_count) + return tm + } + + String toString() { + return "DNA[$sequence]" + } +} +``` + +The `DNASequence` class is available in the execution context: + +```nextflow +// main.nf +workflow { + Channel.of('ACGTTGCAATGCCGTA', 'GCGTACGGTACGTTAC') + .map { seq -> new DNASequence(seq) } + .view { dna -> + def meltTemp = dna.getMeltingTemperature() + "Found sequence '$dna' with melting temperature ${meltTemp}°C" + } +} +``` + +It returns: + +``` +Found sequence 'DNA[ACGTTGCAATGCCGTA]' with melting temperaure 48.0°C +Found sequence 'DNA[GCGTACGGTACGTTAC]' with melting temperaure 50.0°C +``` From e1795eafd4c135f04ce0049f23e8a71b8977f43c Mon Sep 17 00:00:00 2001 From: Ben Sherman Date: Tue, 26 Nov 2024 15:25:07 -0600 Subject: [PATCH 06/22] Update syntax docs (#5542) Signed-off-by: Ben Sherman Signed-off-by: Christopher Hakkaart --- docs/reference/syntax.md | 10 ----- docs/vscode.md | 90 +++++++++++++++++++++++++--------------- 2 files changed, 57 insertions(+), 43 deletions(-) diff --git a/docs/reference/syntax.md b/docs/reference/syntax.md index e454756646..06d7824527 100644 --- a/docs/reference/syntax.md +++ b/docs/reference/syntax.md @@ -622,16 +622,6 @@ A *slashy string* is enclosed by slashes instead of quotes: /no escape!/ ``` -Slashy strings can also span multiple lines: - -```nextflow -/ -Patterns in the code, -Symbols dance to match and find, -Logic unconfined. -/ -``` - :::{note} A slashy string cannot be empty because it would become a line comment. ::: diff --git a/docs/vscode.md b/docs/vscode.md index eb0d6d40eb..11c847ae29 100644 --- a/docs/vscode.md +++ b/docs/vscode.md @@ -230,38 +230,6 @@ if (aligner == 'bowtie2') { } ``` -**Slashy dollar strings** - -Groovy supports a wide variety of strings, including multi-line strings, dynamic strings, slashy strings, multi-line dynamic slashy strings, and more. - -The Nextflow language specification supports single- and double-quoted strings, multi-line strings, and slashy strings. 
Dynamic slashy strings are not supported: - -```groovy -def logo = /--cl-config 'custom_logo: "${multiqc_logo}"'/ -``` - -Use a double-quoted string instead: - -```nextflow -def logo = "--cl-config 'custom_logo: \"${multiqc_logo}\"'" -``` - -Slashy dollar strings are not supported: - -```groovy -$/ -echo "Hello world!" -/$ -``` - -Use a multi-line string instead: - -```nextflow -""" -echo "Hello world!" -""" -``` - **Implicit environment variables** In Nextflow DSL1 and DSL2, you can reference environment variables directly in strings: @@ -334,6 +302,62 @@ To ease the migration of existing scripts, the language server only reports warn Type annotations and static type checking will be addressed in a future version of the Nextflow language specification. ::: +**Strings** + +Groovy supports a wide variety of strings, including multi-line strings, dynamic strings, slashy strings, multi-line dynamic slashy strings, and more. + +The Nextflow language specification supports single- and double-quoted strings, multi-line strings, and slashy strings. + +Slashy strings cannot be interpolated: + +```nextflow +def id = 'SRA001' +assert 'SRA001.fastq' ~= /${id}\.f(?:ast)?q/ +``` + +Use a double-quoted string instead: + +```nextflow +def id = 'SRA001' +assert 'SRA001.fastq' ~= "${id}\\.f(?:ast)?q" +``` + +Slashy strings cannot span multiple lines: + +```groovy +/ +Patterns in the code, +Symbols dance to match and find, +Logic unconfined. +/ +``` + +Use a multi-line string instead: + +```nextflow +""" +Patterns in the code, +Symbols dance to match and find, +Logic unconfined. +""" +``` + +Dollar slashy strings are not supported: + +```groovy +$/ +echo "Hello world!" +/$ +``` + +Use a multi-line string instead: + +```nextflow +""" +echo "Hello world!" +""" +``` + **Process env inputs/outputs** In Nextflow DSL1 and DSL2, the name of a process `env` input/output can be specified with or without quotes: @@ -481,7 +505,7 @@ includeConfig ({ return 'large.config' else return '/dev/null' -})() +}()) ``` The include source is a closure that is immediately invoked. It includes a different config file based on the return value of the closure. Including `/dev/null` is equivalent to including nothing. 
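As a usage sketch of the same idiom, a two-way choice can also be written with a ternary inside the immediately invoked closure. The `params.big_run` flag and the `conf/big.config` file are assumptions for illustration only:

```nextflow
// Sketch: the closure runs once when the config is parsed; its return value
// is the path handed to includeConfig. Including '/dev/null' includes nothing.
includeConfig ({
    params.big_run ? 'conf/big.config' : '/dev/null'
}())
```

The closure form keeps the selection logic in one place while still handing a plain string path to `includeConfig`.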
From 6d703dbf44da507a6227702365f14bc736b7b2f5 Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Wed, 27 Nov 2024 15:55:57 +0100 Subject: [PATCH 07/22] Prevent NPE with null AWS Batch response Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- .../main/nextflow/cloud/aws/batch/AwsBatchTaskHandler.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/plugins/nf-amazon/src/main/nextflow/cloud/aws/batch/AwsBatchTaskHandler.groovy b/plugins/nf-amazon/src/main/nextflow/cloud/aws/batch/AwsBatchTaskHandler.groovy index 4245821a7a..29a2261e25 100644 --- a/plugins/nf-amazon/src/main/nextflow/cloud/aws/batch/AwsBatchTaskHandler.groovy +++ b/plugins/nf-amazon/src/main/nextflow/cloud/aws/batch/AwsBatchTaskHandler.groovy @@ -198,7 +198,7 @@ class AwsBatchTaskHandler extends TaskHandler implements BatchHandler Date: Wed, 27 Nov 2024 22:12:10 +0100 Subject: [PATCH 08/22] Update wave deps Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- plugins/nf-wave/build.gradle | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/plugins/nf-wave/build.gradle b/plugins/nf-wave/build.gradle index bd5da121f0..b6a8367f0d 100644 --- a/plugins/nf-wave/build.gradle +++ b/plugins/nf-wave/build.gradle @@ -36,8 +36,8 @@ dependencies { api 'org.apache.commons:commons-lang3:3.12.0' api 'com.google.code.gson:gson:2.10.1' api 'org.yaml:snakeyaml:2.2' - api 'io.seqera:wave-api:0.13.3' - api 'io.seqera:wave-utils:0.14.1' + api 'io.seqera:wave-api:0.14.0' + api 'io.seqera:wave-utils:0.15.0' testImplementation(testFixtures(project(":nextflow"))) testImplementation "org.apache.groovy:groovy:4.0.24" From de31b6d18eec5df074961d09887c5fc6ab20daec Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Wed, 27 Nov 2024 22:13:09 +0100 Subject: [PATCH 09/22] Fix missing wave response (#5547) [ci fast] Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- .../io/seqera/wave/plugin/WaveClient.groovy | 17 ++++++++++++----- .../io/seqera/wave/plugin/WaveClientTest.groovy | 17 ++++++++--------- 2 files changed, 20 insertions(+), 14 deletions(-) diff --git a/plugins/nf-wave/src/main/io/seqera/wave/plugin/WaveClient.groovy b/plugins/nf-wave/src/main/io/seqera/wave/plugin/WaveClient.groovy index 6c8bb44e8e..c3640ce42d 100644 --- a/plugins/nf-wave/src/main/io/seqera/wave/plugin/WaveClient.groovy +++ b/plugins/nf-wave/src/main/io/seqera/wave/plugin/WaveClient.groovy @@ -27,6 +27,7 @@ import java.time.Duration import java.time.Instant import java.time.OffsetDateTime import java.time.temporal.ChronoUnit +import java.util.concurrent.ConcurrentHashMap import java.util.concurrent.Executors import java.util.concurrent.TimeUnit import java.util.function.Predicate @@ -104,7 +105,9 @@ class WaveClient { final private String endpoint - private Cache cache + private Cache cache + + private Map responses = new ConcurrentHashMap<>() private Session session @@ -135,7 +138,7 @@ class WaveClient { this.packer = new Packer().withPreserveTimestamp(config.preserveFileTimestamp()) this.waveRegistry = new URI(endpoint).getAuthority() // create cache - cache = CacheBuilder + this.cache = CacheBuilder .newBuilder() .expireAfterWrite(config.tokensCacheMaxDuration().toSeconds(), TimeUnit.SECONDS) .build() @@ -572,8 +575,12 @@ class WaveClient { final key = assets.fingerprint() log.trace "Wave fingerprint: $key; assets: $assets" // get from cache or submit a new request - final handle = cache.get(key, () -> new Handle(sendRequest(assets),Instant.now()) ) - return new 
ContainerInfo(assets.containerImage, handle.response.targetImage, key) + final resp = cache.get(key, () -> { + final ret = sendRequest(assets); + responses.put(key,new Handle(ret,Instant.now())); + return ret + }) + return new ContainerInfo(assets.containerImage, resp.targetImage, key) } catch ( UncheckedExecutionException e ) { throw e.cause @@ -633,7 +640,7 @@ class WaveClient { } boolean isContainerReady(String key) { - final handle = cache.getIfPresent(key) + final handle = responses.get(key) if( !handle ) throw new IllegalStateException("Unable to find any container with key: $key") final resp = handle.response diff --git a/plugins/nf-wave/src/test/io/seqera/wave/plugin/WaveClientTest.groovy b/plugins/nf-wave/src/test/io/seqera/wave/plugin/WaveClientTest.groovy index bbd0a397b6..1f54b0a3d7 100644 --- a/plugins/nf-wave/src/test/io/seqera/wave/plugin/WaveClientTest.groovy +++ b/plugins/nf-wave/src/test/io/seqera/wave/plugin/WaveClientTest.groovy @@ -27,7 +27,6 @@ import java.nio.file.attribute.FileTime import java.time.Duration import java.time.Instant -import com.google.common.cache.Cache import com.sun.net.httpserver.HttpExchange import com.sun.net.httpserver.HttpHandler import com.sun.net.httpserver.HttpServer @@ -1303,18 +1302,18 @@ class WaveClientTest extends Specification { def 'should validate isContainerReady' () { given: def sess = Mock(Session) {getConfig() >> [wave: [build:[maxDuration: '500ms']]] } - def cache = Mock(Cache) + def cache = Mock(Map) and: def resp = Mock(SubmitContainerTokenResponse) def handle = new WaveClient.Handle(resp,Instant.now()) - def wave = Spy(new WaveClient(session:sess, cache: cache)) + def wave = Spy(new WaveClient(session:sess, responses: cache)) boolean ready // container succeeded when: ready = wave.isContainerReady('xyz') then: - cache.getIfPresent('xyz') >> handle + cache.get('xyz') >> handle and: resp.requestId >> '12345' resp.succeeded >> true @@ -1328,7 +1327,7 @@ class WaveClientTest extends Specification { when: ready = wave.isContainerReady('xyz') then: - cache.getIfPresent('xyz') >> handle + cache.get('xyz') >> handle and: resp.requestId >> '12345' resp.succeeded >> null @@ -1342,7 +1341,7 @@ class WaveClientTest extends Specification { when: ready = wave.isContainerReady('xyz') then: - cache.getIfPresent('xyz') >> handle + cache.get('xyz') >> handle and: resp.requestId >> '12345' resp.succeeded >> false @@ -1357,7 +1356,7 @@ class WaveClientTest extends Specification { when: ready = wave.isContainerReady('xyz') then: - cache.getIfPresent('xyz') >> handle + cache.get('xyz') >> handle and: resp.buildId >> 'bd-5678' resp.cached >> false @@ -1371,7 +1370,7 @@ class WaveClientTest extends Specification { when: ready = wave.isContainerReady('xyz') then: - cache.getIfPresent('xyz') >> handle + cache.get('xyz') >> handle and: resp.requestId >> null resp.buildId >> 'bd-5678' @@ -1386,7 +1385,7 @@ class WaveClientTest extends Specification { when: ready = wave.isContainerReady('xyz') then: - cache.getIfPresent('xyz') >> handle + cache.get('xyz') >> handle and: resp.requestId >> null resp.buildId >> 'bd-5678' From 62c565a86a730af2ab60d6aca82da11cfbf55933 Mon Sep 17 00:00:00 2001 From: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com> Date: Fri, 29 Nov 2024 18:47:38 +0000 Subject: [PATCH 10/22] Incorrect CPU value in Azure example (#5549) [ci skip] Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com> Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com> Signed-off-by: Christopher Hakkaart --- 
docs/azure.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/azure.md b/docs/azure.md index 6087c18ec2..61269e43ac 100644 --- a/docs/azure.md +++ b/docs/azure.md @@ -167,12 +167,12 @@ To specify multiple Azure machine families, use a comma separated list with glob process.machineType = "Standard_D*d_v5,Standard_E*d_v5" ``` -For example, the following process will create a pool of `Standard_E4d_v5` machines based when using `autoPoolMode`: +For example, the following process will create a pool of `Standard_E8d_v5` machines based when using `autoPoolMode`: ```nextflow process EXAMPLE_PROCESS { machineType "Standard_E*d_v5" - cpus 16 + cpus 8 memory 8.GB script: From c0d98d98ff3f8da357262cc92daa34b4170b6c28 Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Wed, 27 Nov 2024 22:30:30 +0100 Subject: [PATCH 11/22] Update changelog [ci skip] Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- changelog.txt | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/changelog.txt b/changelog.txt index 3420a4041e..0be1d17adf 100644 --- a/changelog.txt +++ b/changelog.txt @@ -1,5 +1,20 @@ NEXTFLOW CHANGE-LOG =================== +24.10.2 - 27 Nov 2024 +- Prevent NPE with null AWS Batch response [3d491934] +- Fix overlapping conda lock file (#5540) [df66deaa] +- Fix missing wave response (#5547) [eb85cda8] +- Bump nf-wave@1.7.4 [93d09404] +- Bump nf-amazon@2.9.2 [469a35dd] + +24.10.1 - 18 Nov 2024 +- Fix overlapping file lock exception (#5489) [a2566d54] +- Fix isContainerReady when wave is disabled (#5509) [c69e3711] +- Bump nf-wave@1.7.3 [e7709a0f] +- Bump nf-azure@1.10.2 [54496ac4] +- Bump nf-amazon@2.9.1 [fa227933] +- Bump netty-common to version 4.1.115.Final [90623c1e] + 24.10.0 - 27 Oct 2024 - Add `manifest.contributors` config option (#5322) [cf0f9690] - Add wave mirror and scan config [92e69776] From 2794d3e01b3e8f89ba1cfed5bd5def0b34136520 Mon Sep 17 00:00:00 2001 From: Jorge Ejarque Date: Mon, 2 Dec 2024 18:42:00 +0100 Subject: [PATCH 12/22] Detecting errors in data unstaging (#5345) Signed-off-by: jorgee Signed-off-by: Paolo Di Tommaso Co-authored-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- .../executor/SimpleFileCopyStrategy.groovy | 2 +- .../nextflow/executor/command-run.txt | 27 ++++++++++++++++--- .../executor/BashWrapperBuilderTest.groovy | 4 +-- .../SimpleFileCopyStrategyTest.groovy | 6 ++--- .../executor/test-bash-wrapper-with-trace.txt | 20 +++++++++++--- .../nextflow/executor/test-bash-wrapper.txt | 20 +++++++++++--- .../BashWrapperBuilderWithS3Test.groovy | 2 +- .../BashWrapperBuilderWithAzTest.groovy | 2 +- .../GoogleLifeSciencesHelper.groovy | 2 +- .../GoogleLifeSciencesHelperTest.groovy | 2 +- .../google/lifesciences/bash-wrapper-gcp.txt | 20 +++++++++++--- validation/awsbatch-unstage-fail.config | 12 +++++++++ validation/awsbatch.sh | 9 ++++++- .../Dockerfile | 11 ++++++++ .../fake_aws/bin/aws | 9 +++++++ validation/test-aws-unstage-fail.nf | 16 +++++++++++ 16 files changed, 140 insertions(+), 24 deletions(-) create mode 100644 validation/awsbatch-unstage-fail.config create mode 100644 validation/test-aws-unstage-fail-container/Dockerfile create mode 100755 validation/test-aws-unstage-fail-container/fake_aws/bin/aws create mode 100644 validation/test-aws-unstage-fail.nf diff --git a/modules/nextflow/src/main/groovy/nextflow/executor/SimpleFileCopyStrategy.groovy b/modules/nextflow/src/main/groovy/nextflow/executor/SimpleFileCopyStrategy.groovy index e1a92b172c..6d9dcbd538 100644 --- 
a/modules/nextflow/src/main/groovy/nextflow/executor/SimpleFileCopyStrategy.groovy +++ b/modules/nextflow/src/main/groovy/nextflow/executor/SimpleFileCopyStrategy.groovy @@ -183,7 +183,7 @@ class SimpleFileCopyStrategy implements ScriptFileCopyStrategy { return """\ IFS=\$'\\n' for name in \$(eval "ls -1d ${escape.join(' ')}" | sort | uniq); do - ${stageOutCommand('$name', targetDir, mode)} || true + ${stageOutCommand('$name', targetDir, mode)} done unset IFS""".stripIndent(true) } diff --git a/modules/nextflow/src/main/resources/nextflow/executor/command-run.txt b/modules/nextflow/src/main/resources/nextflow/executor/command-run.txt index 2bf34b617a..26cbf6a829 100644 --- a/modules/nextflow/src/main/resources/nextflow/executor/command-run.txt +++ b/modules/nextflow/src/main/resources/nextflow/executor/command-run.txt @@ -99,7 +99,13 @@ nxf_fs_fcp() { } on_exit() { - exit_status=${nxf_main_ret:=$?} + ## Capture possible errors. + ## Can be caused either by the task script, unstage script or after script if defined + local last_err=$? + ## capture the task error first or fallback to unstage error + local exit_status=${nxf_main_ret:=0} + [[ ${exit_status} -eq 0 && ${nxf_unstage_ret:=0} -ne 0 ]] && exit_status=${nxf_unstage_ret:=0} + [[ ${exit_status} -eq 0 && ${last_err} -ne 0 ]] && exit_status=${last_err} printf -- $exit_status {{exit_file}} set +u {{cleanup_cmd}} @@ -121,13 +127,26 @@ nxf_stage() { {{stage_inputs}} } -nxf_unstage() { +nxf_unstage_outputs() { true - {{unstage_controls}} - [[ ${nxf_main_ret:=0} != 0 ]] && return {{unstage_outputs}} } +nxf_unstage_controls() { + true + {{unstage_controls}} +} + +nxf_unstage() { + ## Deactivate fast failure to allow uploading stdout and stderr files later + if [[ ${nxf_main_ret:=0} == 0 ]]; then + ## Data unstaging redirecting stdout and stderr with append mode + (set -e -o pipefail; (nxf_unstage_outputs | tee -a {{stdout_file}}) 3>&1 1>&2 2>&3 | tee -a {{stderr_file}}) + nxf_unstage_ret=$? 
+ fi + nxf_unstage_controls +} + nxf_main() { trap on_exit EXIT trap on_term TERM INT USR2 diff --git a/modules/nextflow/src/test/groovy/nextflow/executor/BashWrapperBuilderTest.groovy b/modules/nextflow/src/test/groovy/nextflow/executor/BashWrapperBuilderTest.groovy index 5df66b6369..e54bd97d72 100644 --- a/modules/nextflow/src/test/groovy/nextflow/executor/BashWrapperBuilderTest.groovy +++ b/modules/nextflow/src/test/groovy/nextflow/executor/BashWrapperBuilderTest.groovy @@ -559,7 +559,7 @@ class BashWrapperBuilderTest extends Specification { binding.unstage_outputs == '''\ IFS=$'\\n' for name in $(eval "ls -1d test.bam test.bai" | sort | uniq); do - nxf_fs_copy "$name" /work/dir || true + nxf_fs_copy "$name" /work/dir done unset IFS '''.stripIndent().rightTrim() @@ -576,7 +576,7 @@ class BashWrapperBuilderTest extends Specification { binding.unstage_outputs == '''\ IFS=$'\\n' for name in $(eval "ls -1d test.bam test.bai" | sort | uniq); do - nxf_fs_move "$name" /another/dir || true + nxf_fs_move "$name" /another/dir done unset IFS '''.stripIndent().rightTrim() diff --git a/modules/nextflow/src/test/groovy/nextflow/executor/SimpleFileCopyStrategyTest.groovy b/modules/nextflow/src/test/groovy/nextflow/executor/SimpleFileCopyStrategyTest.groovy index 29cbb35697..6361e4f394 100644 --- a/modules/nextflow/src/test/groovy/nextflow/executor/SimpleFileCopyStrategyTest.groovy +++ b/modules/nextflow/src/test/groovy/nextflow/executor/SimpleFileCopyStrategyTest.groovy @@ -270,7 +270,7 @@ class SimpleFileCopyStrategyTest extends Specification { script == ''' IFS=$'\\n' for name in $(eval "ls -1d simple.txt my/path/file.bam" | sort | uniq); do - nxf_fs_copy "$name" /target/work\\ dir || true + nxf_fs_copy "$name" /target/work\\ dir done unset IFS ''' @@ -293,7 +293,7 @@ class SimpleFileCopyStrategyTest extends Specification { script == ''' IFS=$'\\n' for name in $(eval "ls -1d simple.txt my/path/file.bam" | sort | uniq); do - nxf_fs_move "$name" /target/store || true + nxf_fs_move "$name" /target/store done unset IFS ''' @@ -315,7 +315,7 @@ class SimpleFileCopyStrategyTest extends Specification { script == ''' IFS=$'\\n' for name in $(eval "ls -1d simple.txt my/path/file.bam" | sort | uniq); do - nxf_fs_rsync "$name" /target/work\\'s || true + nxf_fs_rsync "$name" /target/work\\'s done unset IFS ''' diff --git a/modules/nextflow/src/test/resources/nextflow/executor/test-bash-wrapper-with-trace.txt b/modules/nextflow/src/test/resources/nextflow/executor/test-bash-wrapper-with-trace.txt index ef1380e7cf..1de9614e11 100644 --- a/modules/nextflow/src/test/resources/nextflow/executor/test-bash-wrapper-with-trace.txt +++ b/modules/nextflow/src/test/resources/nextflow/executor/test-bash-wrapper-with-trace.txt @@ -270,7 +270,10 @@ nxf_fs_fcp() { } on_exit() { - exit_status=${nxf_main_ret:=$?} + local last_err=$? + local exit_status=${nxf_main_ret:=0} + [[ ${exit_status} -eq 0 && ${nxf_unstage_ret:=0} -ne 0 ]] && exit_status=${nxf_unstage_ret:=0} + [[ ${exit_status} -eq 0 && ${last_err} -ne 0 ]] && exit_status=${last_err} printf -- $exit_status > {{folder}}/.exitcode set +u exit $exit_status @@ -289,9 +292,20 @@ nxf_stage() { true } -nxf_unstage() { +nxf_unstage_outputs() { + true +} + +nxf_unstage_controls() { true - [[ ${nxf_main_ret:=0} != 0 ]] && return +} + +nxf_unstage() { + if [[ ${nxf_main_ret:=0} == 0 ]]; then + (set -e -o pipefail; (nxf_unstage_outputs | tee -a .command.out) 3>&1 1>&2 2>&3 | tee -a .command.err) + nxf_unstage_ret=$? 
+ fi + nxf_unstage_controls } nxf_main() { diff --git a/modules/nextflow/src/test/resources/nextflow/executor/test-bash-wrapper.txt b/modules/nextflow/src/test/resources/nextflow/executor/test-bash-wrapper.txt index f465b5b9b4..3bb4f34fe5 100644 --- a/modules/nextflow/src/test/resources/nextflow/executor/test-bash-wrapper.txt +++ b/modules/nextflow/src/test/resources/nextflow/executor/test-bash-wrapper.txt @@ -81,7 +81,10 @@ nxf_fs_fcp() { } on_exit() { - exit_status=${nxf_main_ret:=$?} + local last_err=$? + local exit_status=${nxf_main_ret:=0} + [[ ${exit_status} -eq 0 && ${nxf_unstage_ret:=0} -ne 0 ]] && exit_status=${nxf_unstage_ret:=0} + [[ ${exit_status} -eq 0 && ${last_err} -ne 0 ]] && exit_status=${last_err} printf -- $exit_status > {{folder}}/.exitcode set +u exit $exit_status @@ -100,9 +103,20 @@ nxf_stage() { true } -nxf_unstage() { +nxf_unstage_outputs() { + true +} + +nxf_unstage_controls() { true - [[ ${nxf_main_ret:=0} != 0 ]] && return +} + +nxf_unstage() { + if [[ ${nxf_main_ret:=0} == 0 ]]; then + (set -e -o pipefail; (nxf_unstage_outputs | tee -a .command.out) 3>&1 1>&2 2>&3 | tee -a .command.err) + nxf_unstage_ret=$? + fi + nxf_unstage_controls } nxf_main() { diff --git a/plugins/nf-amazon/src/test/nextflow/executor/BashWrapperBuilderWithS3Test.groovy b/plugins/nf-amazon/src/test/nextflow/executor/BashWrapperBuilderWithS3Test.groovy index 3e213444cb..4f90e22aa2 100644 --- a/plugins/nf-amazon/src/test/nextflow/executor/BashWrapperBuilderWithS3Test.groovy +++ b/plugins/nf-amazon/src/test/nextflow/executor/BashWrapperBuilderWithS3Test.groovy @@ -58,7 +58,7 @@ class BashWrapperBuilderWithS3Test extends Specification { binding.unstage_outputs == '''\ IFS=$'\\n' for name in $(eval "ls -1d test.bam test.bai bla\\ nk.txt" | sort | uniq); do - nxf_s3_upload $name s3://some/buck\\ et || true + nxf_s3_upload $name s3://some/buck\\ et done unset IFS '''.stripIndent().rightTrim() diff --git a/plugins/nf-azure/src/test/nextflow/executor/BashWrapperBuilderWithAzTest.groovy b/plugins/nf-azure/src/test/nextflow/executor/BashWrapperBuilderWithAzTest.groovy index eb72d25163..1969345af5 100644 --- a/plugins/nf-azure/src/test/nextflow/executor/BashWrapperBuilderWithAzTest.groovy +++ b/plugins/nf-azure/src/test/nextflow/executor/BashWrapperBuilderWithAzTest.groovy @@ -47,7 +47,7 @@ class BashWrapperBuilderWithAzTest extends Specification { binding.unstage_outputs == """\ IFS=\$'\\n' for name in \$(eval "ls -1d test.bam test.bai" | sort | uniq); do - nxf_az_upload \$name '${AzHelper.toHttpUrl(target)}' || true + nxf_az_upload \$name '${AzHelper.toHttpUrl(target)}' done unset IFS """.stripIndent().rightTrim() diff --git a/plugins/nf-google/src/main/nextflow/cloud/google/lifesciences/GoogleLifeSciencesHelper.groovy b/plugins/nf-google/src/main/nextflow/cloud/google/lifesciences/GoogleLifeSciencesHelper.groovy index be08d069e3..7352770b32 100644 --- a/plugins/nf-google/src/main/nextflow/cloud/google/lifesciences/GoogleLifeSciencesHelper.groovy +++ b/plugins/nf-google/src/main/nextflow/cloud/google/lifesciences/GoogleLifeSciencesHelper.groovy @@ -365,7 +365,7 @@ class GoogleLifeSciencesHelper { final remoteTaskDir = getRemoteTaskDir(workDir) def result = 'set -x; ' result += "trap 'err=\$?; exec 1>&2; gsutil -m -q cp -R $localTaskDir/${TaskRun.CMD_LOG} ${remoteTaskDir}/${TaskRun.CMD_LOG} || true; [[ \$err -gt 0 || \$GOOGLE_LAST_EXIT_STATUS -gt 0 || \$NXF_DEBUG -gt 0 ]] && { ls -lah $localTaskDir || true; gsutil -m -q cp -R /google/ ${remoteTaskDir}; } || rm -rf $localTaskDir; exit \$err' EXIT; " - 
result += "{ cd $localTaskDir; bash ${TaskRun.CMD_RUN} nxf_unstage; } >> $localTaskDir/${TaskRun.CMD_LOG} 2>&1" + result += "{ cd $localTaskDir; bash ${TaskRun.CMD_RUN} nxf_unstage;} >> $localTaskDir/${TaskRun.CMD_LOG} 2>&1" return result } diff --git a/plugins/nf-google/src/test/nextflow/cloud/google/lifesciences/GoogleLifeSciencesHelperTest.groovy b/plugins/nf-google/src/test/nextflow/cloud/google/lifesciences/GoogleLifeSciencesHelperTest.groovy index 35cda62f0b..9db824a902 100644 --- a/plugins/nf-google/src/test/nextflow/cloud/google/lifesciences/GoogleLifeSciencesHelperTest.groovy +++ b/plugins/nf-google/src/test/nextflow/cloud/google/lifesciences/GoogleLifeSciencesHelperTest.groovy @@ -548,7 +548,7 @@ class GoogleLifeSciencesHelperTest extends GoogleSpecification { def unstage = helper.getUnstagingScript(dir) then: unstage == - 'set -x; trap \'err=$?; exec 1>&2; gsutil -m -q cp -R /work/dir/.command.log gs://my-bucket/work/dir/.command.log || true; [[ $err -gt 0 || $GOOGLE_LAST_EXIT_STATUS -gt 0 || $NXF_DEBUG -gt 0 ]] && { ls -lah /work/dir || true; gsutil -m -q cp -R /google/ gs://my-bucket/work/dir; } || rm -rf /work/dir; exit $err\' EXIT; { cd /work/dir; bash .command.run nxf_unstage; } >> /work/dir/.command.log 2>&1' + 'set -x; trap \'err=$?; exec 1>&2; gsutil -m -q cp -R /work/dir/.command.log gs://my-bucket/work/dir/.command.log || true; [[ $err -gt 0 || $GOOGLE_LAST_EXIT_STATUS -gt 0 || $NXF_DEBUG -gt 0 ]] && { ls -lah /work/dir || true; gsutil -m -q cp -R /google/ gs://my-bucket/work/dir; } || rm -rf /work/dir; exit $err\' EXIT; { cd /work/dir; bash .command.run nxf_unstage;} >> /work/dir/.command.log 2>&1' } @Unroll diff --git a/plugins/nf-google/src/test/nextflow/cloud/google/lifesciences/bash-wrapper-gcp.txt b/plugins/nf-google/src/test/nextflow/cloud/google/lifesciences/bash-wrapper-gcp.txt index 70a68452aa..c7382062a1 100644 --- a/plugins/nf-google/src/test/nextflow/cloud/google/lifesciences/bash-wrapper-gcp.txt +++ b/plugins/nf-google/src/test/nextflow/cloud/google/lifesciences/bash-wrapper-gcp.txt @@ -168,7 +168,10 @@ nxf_fs_fcp() { } on_exit() { - exit_status=${nxf_main_ret:=$?} + local last_err=$? + local exit_status=${nxf_main_ret:=0} + [[ ${exit_status} -eq 0 && ${nxf_unstage_ret:=0} -ne 0 ]] && exit_status=${nxf_unstage_ret:=0} + [[ ${exit_status} -eq 0 && ${last_err} -ne 0 ]] && exit_status=${last_err} printf -- $exit_status > {{folder}}/.exitcode set +u exit $exit_status @@ -192,12 +195,23 @@ nxf_stage() { nxf_parallel "${downloads[@]}" } -nxf_unstage() { +nxf_unstage_outputs() { + true +} + +nxf_unstage_controls() { true gsutil -m -q cp -R .command.out gs://bucket/work/dir/.command.out || true gsutil -m -q cp -R .command.err gs://bucket/work/dir/.command.err || true gsutil -m -q cp -R .exitcode gs://bucket/work/dir/.exitcode || true - [[ ${nxf_main_ret:=0} != 0 ]] && return +} + +nxf_unstage() { + if [[ ${nxf_main_ret:=0} == 0 ]]; then + (set -e -o pipefail; (nxf_unstage_outputs | tee -a .command.out) 3>&1 1>&2 2>&3 | tee -a .command.err) + nxf_unstage_ret=$? 
+ fi + nxf_unstage_controls } nxf_main() { diff --git a/validation/awsbatch-unstage-fail.config b/validation/awsbatch-unstage-fail.config new file mode 100644 index 0000000000..81b96579d7 --- /dev/null +++ b/validation/awsbatch-unstage-fail.config @@ -0,0 +1,12 @@ +/* + * do not include plugin requirements otherwise latest + * published version will be downloaded instead of using local build + */ + +workDir = 's3://nextflow-ci/work' +process.executor = 'awsbatch' +process.queue = 'nextflow-ci' +process.container = 'quay.io/nextflow/test-aws-unstage-fail:1.0' +aws.region = 'eu-west-1' +aws.batch.maxTransferAttempts = 3 +aws.batch.delayBetweenAttempts = '5 sec' diff --git a/validation/awsbatch.sh b/validation/awsbatch.sh index d58727e7e8..b73571cbd6 100644 --- a/validation/awsbatch.sh +++ b/validation/awsbatch.sh @@ -7,6 +7,13 @@ get_abs_filename() { export NXF_CMD=${NXF_CMD:-$(get_abs_filename ../launch.sh)} +# Execution should fail ignoring +$NXF_CMD run test-aws-unstage-fail.nf -c awsbatch-unstage-fail.config || true +[[ `grep -c "Error executing process > 'test (1)'" .nextflow.log` == 1 ]] || false +[[ `grep -c " Essential container in task exited" .nextflow.log` == 1 ]] || false +[[ `grep -cozP "Command exit status:\n 1" .nextflow.log` == 1 ]] || false +[[ `grep -c "Producing a failure in aws" .nextflow.log` == 2 ]] || false + $NXF_CMD run test-complexpaths.nf -c awsbatch.config [[ -d foo ]] || false [[ -e 'foo/.alpha' ]] || false @@ -73,4 +80,4 @@ $NXF_CMD run nextflow-io/hello \ -process.array 10 \ -with-wave \ -with-fusion \ - -c awsbatch.config \ No newline at end of file + -c awsbatch.config diff --git a/validation/test-aws-unstage-fail-container/Dockerfile b/validation/test-aws-unstage-fail-container/Dockerfile new file mode 100644 index 0000000000..0dd281ba58 --- /dev/null +++ b/validation/test-aws-unstage-fail-container/Dockerfile @@ -0,0 +1,11 @@ +FROM ubuntu + +RUN apt-get update && apt-get -y install curl unzip && apt-get clean + + +RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \ + unzip awscliv2.zip && ./aws/install && rm -rf aws* + +ADD fake_aws /fake_aws + +ENV PATH=/fake_aws/bin/:$PATH diff --git a/validation/test-aws-unstage-fail-container/fake_aws/bin/aws b/validation/test-aws-unstage-fail-container/fake_aws/bin/aws new file mode 100755 index 0000000000..80985d9e08 --- /dev/null +++ b/validation/test-aws-unstage-fail-container/fake_aws/bin/aws @@ -0,0 +1,9 @@ +#!/bin/bash + +if [[ "$*" == *".command."* ]] || [[ "$*" == *".exitcode"* ]]; then + /usr/local/bin/aws $@ +else + >&2 echo "Producing a failure in aws $@" + exit 2 +fi + diff --git a/validation/test-aws-unstage-fail.nf b/validation/test-aws-unstage-fail.nf new file mode 100644 index 0000000000..96bcb9af1e --- /dev/null +++ b/validation/test-aws-unstage-fail.nf @@ -0,0 +1,16 @@ +process test { + input: + val i + output: + file("test${i}") + file("test_2_${i}") + script: + """ + dd if=/dev/urandom of=test${i} bs=1K count=90 + dd if=/dev/urandom of=test_2_${i} bs=1K count=90 + """ +} + +workflow { + Channel.of(1) | test +} From a91fd9dd786d917b7ad3946f2f08e094188a2400 Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Tue, 3 Dec 2024 07:14:09 +0000 Subject: [PATCH 13/22] Bump nf-amazon@2.10.0 Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- plugins/nf-amazon/changelog.txt | 8 ++++++++ plugins/nf-amazon/src/resources/META-INF/MANIFEST.MF | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/plugins/nf-amazon/changelog.txt 
b/plugins/nf-amazon/changelog.txt index 387d69476d..d9fc1213e9 100644 --- a/plugins/nf-amazon/changelog.txt +++ b/plugins/nf-amazon/changelog.txt @@ -1,5 +1,13 @@ nf-amazon changelog =================== +2.10.0 - 3 Dec 2024 +- Detecting errors in data unstaging (#5345) [3c8e602d] +- Prevent NPE with null AWS Batch response [12fc1d60] +- Fix Fargate warning on memory check (#5475) [bdf0ad00] +- Bump groovy 4.0.24 [dd71ad31] +- Bump aws sdk 1.12.777 (#5458) [8bad0b4b] +- Bump netty-common to version 4.1.115.Final [d1bbd3d0] + 2.9.0 - 2 Oct 2024 - Add Platform workflow prefix in AWS Batch job names (#5318) [e2e test] [42dd4ba8] - Fix AWS spot attempts with zero value (#5331) [ci fast] [bac2da12] diff --git a/plugins/nf-amazon/src/resources/META-INF/MANIFEST.MF b/plugins/nf-amazon/src/resources/META-INF/MANIFEST.MF index 41ac624f51..7772ceb96b 100644 --- a/plugins/nf-amazon/src/resources/META-INF/MANIFEST.MF +++ b/plugins/nf-amazon/src/resources/META-INF/MANIFEST.MF @@ -1,6 +1,6 @@ Manifest-Version: 1.0 Plugin-Class: nextflow.cloud.aws.AmazonPlugin Plugin-Id: nf-amazon -Plugin-Version: 2.9.0 +Plugin-Version: 2.10.0 Plugin-Provider: Seqera Labs Plugin-Requires: >=24.04.4 From 3b810d0f748c359e41679e371c6d6c75f604e040 Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Tue, 3 Dec 2024 07:15:13 +0000 Subject: [PATCH 14/22] Bump nf-azure@1.11.0 Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- plugins/nf-azure/changelog.txt | 7 +++++++ plugins/nf-azure/src/resources/META-INF/MANIFEST.MF | 2 +- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/plugins/nf-azure/changelog.txt b/plugins/nf-azure/changelog.txt index 91afbac22d..780fda1082 100644 --- a/plugins/nf-azure/changelog.txt +++ b/plugins/nf-azure/changelog.txt @@ -1,5 +1,12 @@ nf-azure changelog =================== +1.11.0 - 3 Dec 2024 +- Detecting errors in data unstaging (#5345) [3c8e602d] +- Bump netty-common to version 4.1.115.Final [d1bbd3d0] +- Bump groovy 4.0.24 [dd71ad31] +- Bump com.azure:azure-identity from 1.11.3 to 1.12.2 (#5449) [cb70f1df] +- Target Java 17 as minimal Java version (#5045) [0140f954] + 1.10.1 - 27 Oct 2024 - Demote azure batch task status log level to trace (#5416) [ci skip] [d6c684bb] diff --git a/plugins/nf-azure/src/resources/META-INF/MANIFEST.MF b/plugins/nf-azure/src/resources/META-INF/MANIFEST.MF index 2918b09d26..1ebcbf274f 100644 --- a/plugins/nf-azure/src/resources/META-INF/MANIFEST.MF +++ b/plugins/nf-azure/src/resources/META-INF/MANIFEST.MF @@ -1,6 +1,6 @@ Manifest-Version: 1.0 Plugin-Class: nextflow.cloud.azure.AzurePlugin Plugin-Id: nf-azure -Plugin-Version: 1.10.1 +Plugin-Version: 1.11.0 Plugin-Provider: Seqera Labs Plugin-Requires: >=24.04.4 From eff621e686563b360458eb5efdce4ac37a8c5986 Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Tue, 3 Dec 2024 07:16:50 +0000 Subject: [PATCH 15/22] Bump nf-google@1.16.0 Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- plugins/nf-google/changelog.txt | 7 +++++++ plugins/nf-google/src/resources/META-INF/MANIFEST.MF | 2 +- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/plugins/nf-google/changelog.txt b/plugins/nf-google/changelog.txt index 9d4cade5e7..f7d23811e6 100644 --- a/plugins/nf-google/changelog.txt +++ b/plugins/nf-google/changelog.txt @@ -1,5 +1,12 @@ nf-google changelog =================== +1.16.0 - 3 Dec 2024 +- Detecting errors in data unstaging (#5345) [3c8e602d] +- Bump bouncycastle to jdk18on:1.78.1 (#5467) [cd8c385f] +- Bump groovy 4.0.24 [dd71ad31] +- Bump 
protobuf-java:3.25.5 to nf-google [488b7906] +- Add NotFoundException to retry condition for Google Batch [aa4d19cc] + 1.15.2 - 14 Oct 2024 - Add Google LS deprecation notice (#5400) [0ee1d9bc] diff --git a/plugins/nf-google/src/resources/META-INF/MANIFEST.MF b/plugins/nf-google/src/resources/META-INF/MANIFEST.MF index 65849a3ba7..fb9f7deae5 100644 --- a/plugins/nf-google/src/resources/META-INF/MANIFEST.MF +++ b/plugins/nf-google/src/resources/META-INF/MANIFEST.MF @@ -1,6 +1,6 @@ Manifest-Version: 1.0 Plugin-Class: nextflow.cloud.google.GoogleCloudPlugin Plugin-Id: nf-google -Plugin-Version: 1.15.2 +Plugin-Version: 1.16.0 Plugin-Provider: Seqera Labs Plugin-Requires: >=24.04.4 From 6960eab0accce7ef58404f227ab35223a573030c Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Tue, 3 Dec 2024 07:17:20 +0000 Subject: [PATCH 16/22] Bump nf-google@1.8.0 [ci fast] Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- plugins/nf-wave/changelog.txt | 6 ++++++ plugins/nf-wave/src/resources/META-INF/MANIFEST.MF | 2 +- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/plugins/nf-wave/changelog.txt b/plugins/nf-wave/changelog.txt index 6c9c6abf0f..1a13763b31 100644 --- a/plugins/nf-wave/changelog.txt +++ b/plugins/nf-wave/changelog.txt @@ -1,5 +1,11 @@ nf-wave changelog ================== +1.8.0 - 3 Dec 2024 +- Fix missing wave response (#5547) [ci fast] [ee252173] +- Update wave deps [09ccd295] +- Fix isContainerReady when wave is disabled (#5509) [ci fast] [3215afa8] +- Bump groovy 4.0.24 [dd71ad31] + 1.7.2 - 27 Oct 2024 - Add wave mirror vs module bundles conflicts warning [b37a8a5b] diff --git a/plugins/nf-wave/src/resources/META-INF/MANIFEST.MF b/plugins/nf-wave/src/resources/META-INF/MANIFEST.MF index 1eb76aa0db..bb42c60b97 100644 --- a/plugins/nf-wave/src/resources/META-INF/MANIFEST.MF +++ b/plugins/nf-wave/src/resources/META-INF/MANIFEST.MF @@ -1,6 +1,6 @@ Manifest-Version: 1.0 Plugin-Class: io.seqera.wave.plugin.WavePlugin Plugin-Id: nf-wave -Plugin-Version: 1.7.2 +Plugin-Version: 1.8.0 Plugin-Provider: Seqera Labs Plugin-Requires: >=24.04.4 From 20a4b6d4439f2f0ba93d0fa77d922bb63bca848f Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Tue, 3 Dec 2024 09:37:57 +0000 Subject: [PATCH 17/22] [release 24.11.0-edge] Update timestamp and build number Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- VERSION | 2 +- .../src/main/resources/META-INF/build-info.properties | 8 ++++---- .../nextflow/src/main/resources/META-INF/plugins-info.txt | 8 ++++---- nextflow | 2 +- nextflow.md5 | 2 +- nextflow.sha1 | 2 +- nextflow.sha256 | 2 +- 7 files changed, 13 insertions(+), 13 deletions(-) diff --git a/VERSION b/VERSION index 21651351e2..2b70d664e4 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -24.10.0 +24.11.0-edge diff --git a/modules/nextflow/src/main/resources/META-INF/build-info.properties b/modules/nextflow/src/main/resources/META-INF/build-info.properties index d6072256a9..2053248f03 100644 --- a/modules/nextflow/src/main/resources/META-INF/build-info.properties +++ b/modules/nextflow/src/main/resources/META-INF/build-info.properties @@ -1,4 +1,4 @@ -build=5928 -version=24.10.0 -timestamp=1730054192154 -commitId=6524d8dc9 +build=5929 +version=24.11.0-edge +timestamp=1733218258400 +commitId=7e2c8d82b diff --git a/modules/nextflow/src/main/resources/META-INF/plugins-info.txt b/modules/nextflow/src/main/resources/META-INF/plugins-info.txt index 39d506b59d..c650c71066 100644 --- a/modules/nextflow/src/main/resources/META-INF/plugins-info.txt +++ 
b/modules/nextflow/src/main/resources/META-INF/plugins-info.txt @@ -1,8 +1,8 @@ -nf-amazon@2.9.0 -nf-azure@1.10.1 +nf-amazon@2.10.0 +nf-azure@1.11.0 nf-cloudcache@0.4.2 nf-codecommit@0.2.2 nf-console@1.1.4 -nf-google@1.15.2 +nf-google@1.16.0 nf-tower@1.9.3 -nf-wave@1.7.2 \ No newline at end of file +nf-wave@1.8.0 \ No newline at end of file diff --git a/nextflow b/nextflow index 5dbb589bdf..c89ca7617d 100755 --- a/nextflow +++ b/nextflow @@ -15,7 +15,7 @@ # limitations under the License. [[ "$NXF_DEBUG" == 'x' ]] && set -x -NXF_VER=${NXF_VER:-'24.10.0'} +NXF_VER=${NXF_VER:-'24.11.0-edge'} NXF_ORG=${NXF_ORG:-'nextflow-io'} NXF_HOME=${NXF_HOME:-$HOME/.nextflow} NXF_PROT=${NXF_PROT:-'https'} diff --git a/nextflow.md5 b/nextflow.md5 index c51c7ce709..deb7f7a40a 100644 --- a/nextflow.md5 +++ b/nextflow.md5 @@ -1 +1 @@ -7dfd8066370310bff610aa209c988b3e +66995c4139ebcd17bf99f17d9dd030d1 diff --git a/nextflow.sha1 b/nextflow.sha1 index 80cc138bfd..ccadbdc25f 100644 --- a/nextflow.sha1 +++ b/nextflow.sha1 @@ -1 +1 @@ -c142828a82fa6678a5af978d3545bb7f56072be6 +cdbb67bdb21c0e63fb48aabb8b168c12a31fa5b3 diff --git a/nextflow.sha256 b/nextflow.sha256 index e9a09bcc10..f4e9ccb3bd 100644 --- a/nextflow.sha256 +++ b/nextflow.sha256 @@ -1 +1 @@ -e848918fb9b85762822c078435d9ff71979a88cccff81ce5babd75d5eee52fe6 +69a86852c52dcfa7662407c46d16f05bd3dec16e0c505c2a2f71ccc56219d631 From a63683132b2e09b4fa9639b0dd1864b27a075b2f Mon Sep 17 00:00:00 2001 From: Paolo Di Tommaso Date: Tue, 3 Dec 2024 10:46:44 +0000 Subject: [PATCH 18/22] Update changelog [ci skip] Signed-off-by: Paolo Di Tommaso Signed-off-by: Christopher Hakkaart --- changelog.txt | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/changelog.txt b/changelog.txt index 0be1d17adf..54fba39ad4 100644 --- a/changelog.txt +++ b/changelog.txt @@ -1,5 +1,46 @@ NEXTFLOW CHANGE-LOG =================== +24.11.0-edge - 3 Dec 2024 +- Add GHA to submit dependencies to dependabot (#5440) [80395a6d] +- Add NotFoundException to retry condition for Google Batch [aa4d19cc] +- Add Rahel Hirsch to run name generator (#5442) [ff2bc6ae] +- Add `env()` function (#5506) [fa0e8e0f] +- Add more scientists to run name generator (#5447) [38d9eda0] +- Add `singularity.libraryDir` to config page (#5498) [b5e31bb0] +- Add RepositoryProvider.revision now public property (#5500) [f0a4c526] +- Deprecate process `shell` block (#5508) [6f527551] +- Detecting errors in data unstaging (#5345) [3c8e602d] +- Disable virtual threads on CI tests [ci slip] [69d07dbc] +- Fix Fargate warning on memory check (#5475) [bdf0ad00] +- Fix `isContainerReady` when wave is disabled (#5509) [3215afa8] +- Fix missing wave response (#5547) [ee252173] +- Fix overlapping conda lock file (#5540) [9248c04d] +- Fix overlapping conda lock exception (#5489) [eaaeb3de] +- Fix possible deadlock in dynamic `maxRetry` resolution (#5474) [25bbb621] +- Fix Wave GCP integration test (#5490) [ad56c89b] +- Fixing bug when execution with stub and no stub defined (#5473) [f7fd56db] +- Fix Incorrect CPU value in Azure example (#5549) [fc5e2c2a] +- Improve docs for using the GPU accelerator directive (#5488) [4b908524] +- Improve groupTuple docs with scatter/gather example (#5520) [b5c63a9f] +- Prevent NPE with null AWS Batch response [12fc1d60] +- Target Java 17 as minimal Java version (#5045) [0140f954] +- Update 'nexus-staging' plugin to latest version (#5462) [07934513] +- Update gradle 'shadow' plugin version to 8.3.5 (#5463) [2a5f15f0] +- Update install docs to reflect change from 
'all' to 'dist' (#5496) [c9115659] +- Update process snippets to comply with strict syntax (#5526) [be1694bf] +- Update Wave dependencies [09ccd295] +- Bump aws sdk 1.12.777 (#5458) [8bad0b4b] +- Bump bouncycastle to jdk18on:1.78.1 (#5467) [cd8c385f] +- Bump com.azure:azure-identity from 1.11.3 to 1.12.2 (#5449) [cb70f1df] +- Bump commons-io:2.15.1 [767e4a0a] +- Bump groovy 4.0.24 [dd71ad31] +- Bump netty-common to version 4.1.115.Final [d1bbd3d0] +- Bump nf-amazon@2.10.0 [2b653b07] +- Bump nf-azure@1.11.0 [6af7198d] +- Bump nf-google@1.16.0 [9494f970] +- Bump nf-google@1.8.0 [7e2c8d82] +- Bump protobuf-java:3.25.5 to nf-google [488b7906] + 24.10.2 - 27 Nov 2024 - Prevent NPE with null AWS Batch response [3d491934] - Fix overlapping conda lock file (#5540) [df66deaa] From 8aa2f4a2745efd02851acb345c71a34e285db613 Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Tue, 3 Dec 2024 16:18:34 +0100 Subject: [PATCH 19/22] Trying new layout Signed-off-by: Christopher Hakkaart --- docs/module.md | 40 +++++++++++------------------------ docs/process.md | 53 +++++++++++++++++++++++++++++++++++++++++++---- docs/structure.md | 50 +++++--------------------------------------- 3 files changed, 66 insertions(+), 77 deletions(-) diff --git a/docs/module.md b/docs/module.md index ae6aa03028..38bb737d3c 100644 --- a/docs/module.md +++ b/docs/module.md @@ -184,41 +184,25 @@ Ciao world! ## Module templates -Process script {ref}`templates ` can be included alongside a module in the `templates` directory. - -For example, Project L contains a module (`myModules.nf`) that defines two processes, P1 and P2. Both processes use templates that are available in the local `templates` directory: - -``` -Project L -|── myModules.nf -└── templates - |── P1-template.sh - └── P2-template.sh -``` - -Project A contains a workflow that includes processes P1 and P2: +Template files can be stored in the `templates` directory alongside a module. ``` Project A -└── main.nf -``` - -Project B contains a workflow that also includes process P1 and P2: - -``` -Project B -└── main.nf +├── main.nf +└── modules + └── sayhello + ├── sayhello.nf + └── templates + └── sayhello.py ``` -As the template files are stored with the modules inside the Project L, Projects A and B can include them without any changing any code. Future projects would also be able to include these modules by cloning Project L and including its module (if they were not available on the system). +Template files can be invoked like regular scripts from a process in your pipeline using the `template` function. Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow. -Beyond facilitating module sharing across pipelines, keeping the module template within the script path has several advantages, including: +See {ref}`process-template` for more information utilizing template files. -- Modules are self-contained. -- Modules can be tested independently from the pipeline(s) that import them. -- Modules can be made into libraries. +Storing template files with the module that utilizes it encourages sharing of modules across pipelines. For example, future projects would be able to include the module from above by cloning the modules directory and including the module without needing to modify the process or template. -Organizing templates locations allows for a well-structured project. 
In projects with multiple modules that rely on templates, you can organize module scripts and their corresponding templates into logical groups. For example: +Beyond facilitating module sharing across pipelines, organizing templates locations allows for a well-structured project. For example, complex projects with multiple modules that rely on templates can be organized into logical groups: ``` baseDir @@ -243,7 +227,7 @@ baseDir └── P6-template.sh ``` -See {ref}`process-template` for more information about how to externalize process scripts to template files. +Template files can also be stored in a project `templates` directory. See {ref}`structure-template` for more information about the project directory structure. (module-binaries)= diff --git a/docs/process.md b/docs/process.md index a9f9d36482..9cfee9cb5c 100644 --- a/docs/process.md +++ b/docs/process.md @@ -158,14 +158,59 @@ In the above example, the process will execute one of several scripts depending (process-template)= -### Template +### Template files -Process scripts can be externalized to **template** files and reused across multiple processes. Templates can be accessed using the `template` function in the script section. For example: +Process scripts can be externalized to **template** files and reused across multiple processes. -By default, Nextflow looks for template scripts in the `templates` directory, located alongside the Nextflow script that defines the process. An absolute path can be used to specify a different template location. However, this practice is discouraged because it hinders pipeline portability. +Template files can be stored in the project or modules template directory. See {ref}`structure-templates` and {ref}`module-templates` for more information about directory structures. -Templates can be tested independently of pipeline execution. However, variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly. Consider the following template script: +In template files, variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow. +``` +#!/usr/bin/env python + +print("Hello ${x}!") +``` + +Template files can be invoked like regular scripts from any process in your pipeline using the `template` function. + +``` +process sayHello { + + input: + val x + + output: + stdout + + script: + template 'sayhello.py' +} + +workflow { + Channel.of("Foo") | sayHello | view +} +``` + +:::{note} +All template variable must be defined. The pipeline will fail if a template variable is missing, regardless of where it occurs in the template. +::: + +Templates can be tested independently of pipeline execution by providing each input as an environment variable. For example: + +```bash +STR='foo' bash templates/myscript.sh +``` + +Template scripts are only recommended for Bash scripts. Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line as variables prefixed with `$` are interpreted as Bash variables. Similarly, template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. + +:::{warning} +Template variables are evaluated even if they are commented out in the template script. 
+::: + +:::{tip} +The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. +::: (process-shell)= diff --git a/docs/structure.md b/docs/structure.md index da5bca404b..dbd44ea804 100644 --- a/docs/structure.md +++ b/docs/structure.md @@ -2,9 +2,11 @@ # Structure +(structure-templates)= + ## The `templates` directory -The `templates` directory in the Nextflow project root can be used to store scripts. +The `templates` directory in the Nextflow project root can be used to store template files. ``` ├── templates @@ -12,51 +14,9 @@ The `templates` directory in the Nextflow project root can be used to store scri └── main.nf ``` -It allows custom scripts to be invoked like regular scripts from any process in your pipeline using the `template` function: - -``` -process sayHello { - - input: - val x - - output: - stdout - - script: - template 'sayhello.py' -} - -workflow { - Channel.of("Foo") | sayHello | view -} -``` - -Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow: +Template files can be invoked like regular scripts from any process in your pipeline using the `template` function. Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow. -``` -#!/usr/bin/env python - -print("Hello ${x}!") -``` - -The pipeline will fail if a template variable is missing, regardless of where it occurs in the template. - -Templates can be tested independently of pipeline execution by providing each input as an environment variable. For example: - -```bash -STR='foo' bash templates/my_script.sh -``` - -Template scripts are only recommended for Bash scripts. Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line as variables prefixed with `$` are interpreted as Bash variables. Similarly, template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. - -:::{warning} -Template variables are evaluated even if they are commented out in the template script. -::: - -:::{tip} -The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. -::: +See {ref}`process-template` for more information about utilizing template files. 
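For illustration, a minimal template sketch — assuming a hypothetical `templates/greet.sh` and a process input `x`, neither of which is part of this patch series — showing how a Nextflow variable and an escaped Bash variable can coexist in a single template:

```bash
#!/usr/bin/env bash
# Hypothetical templates/greet.sh -- illustrative only.
# ${x} is substituted by Nextflow when the template is rendered;
# \$HOSTNAME is passed through and expanded by Bash when the task runs.
echo "Hello ${x} from \$HOSTNAME"
```

When run directly for testing (`x='foo' bash templates/greet.sh`), Bash resolves `${x}` from the environment and leaves `\$HOSTNAME` unexpanded, which is why the command-line testing pattern described above is only practical for Bash templates.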
(bundling-executables)= From 0433bf4d8ef4bc2556f4175ec6adaeab3c781a09 Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Tue, 3 Dec 2024 17:01:51 +0100 Subject: [PATCH 20/22] Quick improvements when reading Signed-off-by: Christopher Hakkaart --- docs/cache-and-resume.md | 2 +- docs/module.md | 23 ++++++++++++----------- docs/process.md | 19 ++++++------------- docs/structure.md | 4 ++-- 4 files changed, 21 insertions(+), 27 deletions(-) diff --git a/docs/cache-and-resume.md b/docs/cache-and-resume.md index 7184909aa2..556fed62eb 100644 --- a/docs/cache-and-resume.md +++ b/docs/cache-and-resume.md @@ -26,7 +26,7 @@ The task hash is computed from the following metadata: - Task {ref}`inputs ` - Task {ref}`script ` - Any global variables referenced in the task script -- Any {ref}`bundled scripts ` used in the task script +- Any {ref}`bundled scripts ` used in the task script - Whether the task is a {ref}`stub run ` - Task attempt diff --git a/docs/module.md b/docs/module.md index 38bb737d3c..1ce6e5282d 100644 --- a/docs/module.md +++ b/docs/module.md @@ -227,7 +227,7 @@ baseDir └── P6-template.sh ``` -Template files can also be stored in a project `templates` directory. See {ref}`structure-template` for more information about the project directory structure. +Template files can also be stored in the project `templates` directory. See {ref}`structure-template` for more information about the project directory structure. (module-binaries)= @@ -238,12 +238,6 @@ Template files can also be stored in a project `templates` directory. See {ref}` Modules can define binary scripts that are locally scoped to the processes defined by the tasks. -To use this feature, the module binaries must be enabled in your pipeline script or configuration file: - -```nextflow -nextflow.enable.moduleBinaries = true -``` - Binary scripts must be placed in the module directory named `/resources/usr/bin` and granted execution permissions. For example: ``` @@ -252,15 +246,22 @@ Binary scripts must be placed in the module directory named `/resour └── resources └── usr └── bin - |── your-module-script1.sh - └── another-module-script2.py + └── script.py +``` + +Binary scripts can be invoked like regular commands from the locally scoped module without modifying the `PATH` environment variable or using an absolute path. Each script should include a shebang to specify the interpreter and inputs should be supplied as arguments. + +To use this feature, the module binaries must be enabled in your pipeline script or configuration file: + +```nextflow +nextflow.enable.moduleBinaries = true ``` :::{note} -Module binary scripts require a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors. +Module binary scripts require a local or shared file system for the pipeline work directory or {ref}`wave-page` when using cloud-based executors. ::: -Scripts can also be stored at the pipeline level using the `bin` directory. See {ref}`bundling-executables` for more information. +Scripts can also be stored at the pipeline level using the `bin` directory. See {ref}`structure-bin` for more information. ## Sharing modules diff --git a/docs/process.md b/docs/process.md index 9cfee9cb5c..e00b3ffa3f 100644 --- a/docs/process.md +++ b/docs/process.md @@ -112,12 +112,7 @@ workflow { ``` :::{tip} -Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. 
For example: - -``` -#!/usr/bin/env python -``` - +Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. ::: ### Conditional scripts @@ -160,9 +155,7 @@ In the above example, the process will execute one of several scripts depending ### Template files -Process scripts can be externalized to **template** files and reused across multiple processes. - -Template files can be stored in the project or modules template directory. See {ref}`structure-templates` and {ref}`module-templates` for more information about directory structures. +Process scripts can be externalized to **template** files and reused across multiple processes. Template files can be stored in the project or modules template directory. See {ref}`structure-templates` and {ref}`module-templates` for more information about directory structures. In template files, variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow. @@ -204,14 +197,14 @@ STR='foo' bash templates/myscript.sh Template scripts are only recommended for Bash scripts. Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line as variables prefixed with `$` are interpreted as Bash variables. Similarly, template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. -:::{warning} -Template variables are evaluated even if they are commented out in the template script. -::: - :::{tip} The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. ::: +:::{warning} +Template variables are evaluated even if they are commented out in the template script. +::: + (process-shell)= ### Shell diff --git a/docs/structure.md b/docs/structure.md index dbd44ea804..417cbb9da6 100644 --- a/docs/structure.md +++ b/docs/structure.md @@ -18,7 +18,7 @@ Template files can be invoked like regular scripts from any process in your pipe See {ref}`process-template` for more information about utilizing template files. -(bundling-executables)= +(structure-bin)= ## The `bin` directory @@ -30,7 +30,7 @@ The `bin` directory in the Nextflow project root can be used to store executable └── main.nf ``` -It allows custom scripts to be invoked like regular commands from any process in your pipeline without modifying the `PATH` environment variable or using an absolute path. Each script should include a shebang to specify the interpreter. Inputs should be supplied as arguments. +It allows custom scripts to be invoked like regular commands from any process in your pipeline without modifying the `PATH` environment variable or using an absolute path. Each script should include a shebang to specify the interpreter and inputs should be supplied as arguments. 
```python #!/usr/bin/env python From 4d8f2f9a3933af53e90f3c8d9a866f9811f98c3f Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Wed, 4 Dec 2024 10:35:44 +0100 Subject: [PATCH 21/22] Add in another example and note Signed-off-by: Christopher Hakkaart --- docs/module.md | 12 ++++++------ docs/process.md | 20 +++++++++----------- docs/structure.md | 36 +++++++++++++++++++++++++++++++----- 3 files changed, 46 insertions(+), 22 deletions(-) diff --git a/docs/module.md b/docs/module.md index 1ce6e5282d..e8f900fb65 100644 --- a/docs/module.md +++ b/docs/module.md @@ -193,10 +193,10 @@ Project A └── sayhello ├── sayhello.nf └── templates - └── sayhello.py + └── sayhello.sh ``` -Template files can be invoked like regular scripts from a process in your pipeline using the `template` function. Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow. +Template files can be invoked like regular scripts from a process in your pipeline using the `template` function. Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template file is executed by Nextflow. See {ref}`process-template` for more information utilizing template files. @@ -236,9 +236,9 @@ Template files can also be stored in the project `templates` directory. See {ref :::{versionadded} 22.10.0 ::: -Modules can define binary scripts that are locally scoped to the processes defined by the tasks. +Modules can define binary scripts that are locally scoped to the processes. -Binary scripts must be placed in the module directory named `/resources/usr/bin` and granted execution permissions. For example: +Binary scripts must be placed in the module directory named `/resources/usr/bin`. For example: ``` @@ -249,7 +249,7 @@ Binary scripts must be placed in the module directory named `/resour └── script.py ``` -Binary scripts can be invoked like regular commands from the locally scoped module without modifying the `PATH` environment variable or using an absolute path. Each script should include a shebang to specify the interpreter and inputs should be supplied as arguments. +Binary scripts can be invoked like regular commands from the locally scoped module without modifying the `PATH` environment variable or using an absolute path. Each script should include a shebang to specify the interpreter and inputs should be supplied as arguments. See {ref}`structure-bin` for more information about custom scripts in `bin` directories. To use this feature, the module binaries must be enabled in your pipeline script or configuration file: @@ -261,7 +261,7 @@ nextflow.enable.moduleBinaries = true Module binary scripts require a local or shared file system for the pipeline work directory or {ref}`wave-page` when using cloud-based executors. ::: -Scripts can also be stored at the pipeline level using the `bin` directory. See {ref}`structure-bin` for more information. +Scripts can also be stored in project level `bin` directory. See {ref}`structure-bin` for more information. ## Sharing modules diff --git a/docs/process.md b/docs/process.md index e00b3ffa3f..d8eea5131f 100644 --- a/docs/process.md +++ b/docs/process.md @@ -160,9 +160,9 @@ Process scripts can be externalized to **template** files and reused across mult In template files, variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow. 
``` -#!/usr/bin/env python +#!/usr/bin/env bash -print("Hello ${x}!") +echo "Hello ${x}" ``` Template files can be invoked like regular scripts from any process in your pipeline using the `template` function. @@ -177,7 +177,7 @@ process sayHello { stdout script: - template 'sayhello.py' + template 'sayhello.sh' } workflow { @@ -185,26 +185,24 @@ workflow { } ``` -:::{note} All template variable must be defined. The pipeline will fail if a template variable is missing, regardless of where it occurs in the template. -::: Templates can be tested independently of pipeline execution by providing each input as an environment variable. For example: ```bash -STR='foo' bash templates/myscript.sh +STR='foo' bash templates/sayhello.sh ``` -Template scripts are only recommended for Bash scripts. Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line as variables prefixed with `$` are interpreted as Bash variables. Similarly, template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. - -:::{tip} -The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. -::: +Template scripts are only recommended for Bash scripts. Languages that do not prefix variables with `$` (e.g., Python and R) can't be executed directly as a template script from the command line as variables prefixed with `$` are interpreted as Bash variables. Similarly, template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow but not the command line. :::{warning} Template variables are evaluated even if they are commented out in the template script. ::: +:::{tip} +The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. +::: + (process-shell)= ### Shell diff --git a/docs/structure.md b/docs/structure.md index 417cbb9da6..ba8a4430b4 100644 --- a/docs/structure.md +++ b/docs/structure.md @@ -10,11 +10,11 @@ The `templates` directory in the Nextflow project root can be used to store temp ``` ├── templates -│ └── sayhello.py +│ └── sayhello.sh └── main.nf ``` -Template files can be invoked like regular scripts from any process in your pipeline using the `template` function. Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow. +Template files can be invoked like regular scripts from any process in your pipeline using the `template` function. Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template file is executed by Nextflow. See {ref}`process-template` for more information about utilizing template files. @@ -30,7 +30,7 @@ The `bin` directory in the Nextflow project root can be used to store executable └── main.nf ``` -It allows custom scripts to be invoked like regular commands from any process in your pipeline without modifying the `PATH` environment variable or using an absolute path. Each script should include a shebang to specify the interpreter and inputs should be supplied as arguments. 
+The `bin` directory allows binary scripts to be invoked like regular commands from any process in your pipeline without using an absolute path of modifying the `PATH` environment variable. Each script should include a shebang to specify the interpreter and inputs should be supplied as arguments to the executable. For example: ```python #!/usr/bin/env python @@ -52,13 +52,39 @@ if __name__ == "__main__": Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. ::: -Scripts placed in the `bin` directory must have executable permissions. Use `chmod` to grant the required permissions. For example: +Binary scripts placed in the `bin` directory must have executable permissions. Use `chmod` to grant the required permissions. For example: ``` chmod a+x bin/sayhello.py ``` -Like modifying a process script, changing the executable script will cause the task to be re-executed on a resumed run. +Binary scripts in the `bin` directory can then be invoked like regular commands. + +``` +process sayHello { + + input: + val x + + output: + stdout + + script: + """ + sayhello.py --name $x + """ +} + +workflow { + Channel.of("Foo") | sayHello | view +} +``` + +Like modifying a process script, modifying the binary script will cause the task to be re-executed on a resumed run. + +:::{note} +Binary scripts require a local or shared file system for the pipeline work directory or {ref}`wave-page` when using cloud-based executors. +::: :::{warning} When using containers and the Wave service, Nextflow will send the project-level `bin` directory to the Wave service for inclusion as a layer in the container. Any changes to scripts in the `bin` directory will change the layer md5sum and the hash for the final container. The container identity is a component of the task hash calculation and will force re-calculation of all tasks in the workflow. From 0b9c83fff664359dd36adcda8df5795605328bbb Mon Sep 17 00:00:00 2001 From: Christopher Hakkaart Date: Wed, 4 Dec 2024 18:22:33 +0100 Subject: [PATCH 22/22] Apply suggestions from review Signed-off-by: Christopher Hakkaart --- docs/index.md | 2 +- docs/structure.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/index.md b/docs/index.md index 2cc132b783..4b2d60d027 100644 --- a/docs/index.md +++ b/docs/index.md @@ -77,8 +77,8 @@ workflow module notifications secrets -sharing structure +sharing vscode dsl1 ``` diff --git a/docs/structure.md b/docs/structure.md index ba8a4430b4..a026e1919c 100644 --- a/docs/structure.md +++ b/docs/structure.md @@ -1,6 +1,6 @@ (structure-page)= -# Structure +# Project structure (structure-templates)=