Description
Bug report
Expected behavior and actual behavior
Expected behavior: the publishDir directive should work with Azure links in different formats.
Actual behavior: publishDir fails for:
- Azure links starting with https://
- Azure links whose path includes the storage account, az://<storage-account>.<bucket>
Steps to reproduce the problem
- set up Nextflow with Azure Cloud (basic set up)
- run the nf-canary pipeline
- pass in differently formatted paths for params.outdir
Working example:
nextflow run https://github.com/seqeralabs/nf-canary -r main -w az://nf-scratch/work --outdir "az://test-public" # succeeds
Failing example 1 - storage account in the path:
nextflow run https://github.com/seqeralabs/nf-canary -r main -w az://nf-scratch/work --outdir "az://nfazurestore.test-public" # fail
ERROR ~ Error executing process > 'NF_CANARY:TEST_PUBLISH_FOLDER'
Caused by:
/nfazurestore.test-public: Unable to determine if root directory exists
Failing example 2 - https path used:
nextflow run https://github.com/seqeralabs/nf-canary -r main -w az://nf-scratch/work --outdir "https://nfazurestore.blob.core.windows.net/test-public" # fail
ERROR ~ Error executing process > 'NF_CANARY:TEST_PUBLISH_FOLDER'
Caused by:
Create directory not supported by HTTPS file system provider
The root cause of the failures is:
- first, in FileHelper.groovy, paths get transformed into a canonicalPath (for example into /<storage-account>.<bucket>)
- then Files.createDirectories(this.path) fails with the error message shown above
Environment
- Nextflow version: 23.12.0-edge build 5901
- Java version: openjdk 21.0.1 2023-10-17 LTS
- Operating system: macOS Sonoma - 14.2.1 (23C71)
- Bash version: zsh 5.9 (x86_64-apple-darwin23.0)
Additional context
Reasoning for supporting paths that include the storage account name:
Azure bucket/container names are not globally unique; they are only unique within a storage account. To identify them unambiguously, Seqera Platform uses the path format az://<storage-account>.<bucket>. Because Nextflow already knows the storage account name (it has to be set in the config), this part could easily be removed from the path, fixing the issue.
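To illustrate the idea (this is not Nextflow code, just a minimal Python sketch), stripping the account could look roughly like the following. The function name strip_storage_account and the parameter configured_account are hypothetical; the sketch assumes the account name from the config is available as a plain string and relies on Azure container names not containing dots.

```python
def strip_storage_account(path: str, configured_account: str) -> str:
    """Rewrite az://<account>.<container>/... to az://<container>/...

    Hypothetical helper: the prefix is removed only when it matches the
    storage account taken from the configuration; anything else is
    returned unchanged.
    """
    prefix = "az://"
    if not path.startswith(prefix):
        return path  # not an az:// path, leave it alone

    rest = path[len(prefix):]
    # The first segment is the container, optionally prefixed by "<account>."
    container, slash, blob = rest.partition("/")
    account, dot, name = container.partition(".")

    if dot and name and account == configured_account:
        container = name  # drop the "<account>." prefix

    return prefix + container + slash + blob


print(strip_storage_account("az://nfazurestore.test-public", "nfazurestore"))
```

With this normalization applied before the directory is created, the failing --outdir from example 1 would become the same value as the one in the working example.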
Reasoning for supporting https paths:
The Azure docs on referencing blobs suggest using a URL like this: https://<storage-account>.blob.core.windows.net/<bucket>.
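As a rough sketch of how such a URL could be mapped onto the az:// form that already works (again Python for illustration only, not Nextflow code), the account and container can be read from the host and path. The name blob_url_to_az is hypothetical, and the sketch assumes the public blob.core.windows.net endpoint; other endpoints are not handled.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Assumption: only the public Azure Blob endpoint suffix is recognised.
BLOB_HOST_SUFFIX = ".blob.core.windows.net"


def blob_url_to_az(url: str) -> Optional[Tuple[str, str]]:
    """Split an Azure Blob https URL into (storage_account, az_path).

    Returns None when the URL is not an https Azure Blob URL, so the
    caller can fall back to the regular HTTPS handling.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if parsed.scheme != "https" or not host.endswith(BLOB_HOST_SUFFIX):
        return None

    account = host[: -len(BLOB_HOST_SUFFIX)]
    container_and_blob = parsed.path.lstrip("/")
    if not account or not container_and_blob:
        return None  # no account or no container in the URL

    return account, "az://" + container_and_blob


print(blob_url_to_az("https://nfazurestore.blob.core.windows.net/test-public"))
```

The returned account could then be checked against the one set in the config before the az:// path is used for publishing.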