Commit 430fc40

WX-1626 Docker soft links (#6741)
1 parent 7d5d5fd commit 430fc40

5 files changed: +109 -5 lines changed

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -1,5 +1,12 @@
 # Cromwell Change Log

+## 88 Release Notes
+
+### Optional docker soft links
+
+Cromwell now allows opting into configured soft links on shared file systems such as HPC environments. More details can
+be found [here](https://cromwell.readthedocs.io/en/stable/backends/HPC/#optional-docker-soft-links).
+
 ## 87 Release Notes

 ### `upgrade` command removed from Womtool

docs/Configuring.md

Lines changed: 1 addition & 1 deletion
@@ -519,7 +519,7 @@ config section:
 md5.
 * `md5`. The well-known md5sum algorithm
 * Path based options. These are based on filepath. Extremely lightweight, but only work with the `soft-link` file
-caching strategy and can therefore never work with containers.
+caching strategy and therefore do not work with containers by default.
 * `path` creates a md5 hash of the path.
 * `path+modtime` creates a md5 hash of the path and its modification time.
 * Fingerprinting. This strategy works with containers.

docs/backends/HPC.md

Lines changed: 69 additions & 2 deletions
@@ -21,13 +21,14 @@ Each `call` has its own subdirectory located at `<workflow_root>/call-<call_name
 Any input files to a call need to be localized into the `<call_dir>/inputs` directory. There are different localization strategies that Cromwell will try until one works:

 * `hard-link` - This will create a hard link to the file
-* `soft-link` - Create a symbolic link to the file. This strategy is not applicable for tasks which specify a Docker image and will be ignored.
+* `soft-link` - Create a symbolic link to the file. For tasks which specify a Docker image, this strategy is disabled
+by default and is ignored unless explicitly enabled.
 * `copy` - Make a copy of the file
 * `cached-copy` An experimental feature. This copies files to a file cache in
 `<workflow_root>/cached-inputs` and then hard links them in the `<call_dir>/inputs` directory.

 `cached-copy` is intended for a shared filesystem that runs on multiple physical disks, where docker containers are used.
-Hard-links don't work between different physical disks and soft-links don't work with docker. Copying uses a lot of
+Hard-links don't work between different physical disks and soft-links don't work with docker by default. Copying uses a lot of
 space if a multitude of tasks use the same input. `cached-copy` copies the file only once to the physical disk containing
 the `<workflow_root>` and then uses hard links for every task that needs the input file. This can save a lot of space.

@@ -45,6 +46,72 @@ filesystems {
 }
 ```

+### Optional docker soft links
+
+By default, when Cromwell runs a local container it only mounts the workflow's execution directory. Thus any symbolic or
+soft links pointing to files outside of the execution directory will resolve to paths that are not accessible within the
+container.
+
+As discussed above regarding `cached-copy`, `soft-link` is disabled by default on docker and other container
+environments, and hard-links do not work across different physical disks.
+
+However, it is possible to manually configure Cromwell to mount input paths such that soft links resolve both outside
+and inside containers.
+
+```hocon
+backend {
+  default = "SlurmDocker"
+  providers {
+    SlurmDocker {
+      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
+      config {
+        runtime-attributes = """
+        String? docker
+        """
+        # https://slurm.schedmd.com/sbatch.html
+        submit-docker = """
+          set -euo pipefail
+          CACHE_DIR=$HOME/.singularity/cache
+          mkdir -p $CACHE_DIR
+          LOCK_FILE=$CACHE_DIR/singularity_pull_flock
+          DOCKER_NAME=$(sed -e 's/[^A-Za-z0-9._-]/_/g' <<< ${docker})
+          IMAGE=$CACHE_DIR/$DOCKER_NAME.sif
+          (
+            flock --verbose --exclusive --timeout 900 9 || exit 1
+            if [ ! -e "$IMAGE" ]; then
+              singularity build $IMAGE docker://${docker}
+            fi
+          ) 9>$LOCK_FILE
+          sbatch \
+            -J ${job_name} \
+            -D ${cwd} \
+            -o ${cwd}/execution/stdout \
+            -e ${cwd}/execution/stderr \
+            --wrap "singularity exec --containall --bind ${cwd}:${docker_cwd} --bind /mnt/one:/mnt/one:ro --bind /mnt/two:/mnt/two:ro $IMAGE ${job_shell} ${docker_script}"
+        """
+        # ... other configuration ...
+        filesystems {
+          local {
+            caching.duplication-strategy = ["copy"]
+            localization = ["soft-link", "copy"]
+            docker.allow-soft-links: true
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+The important parts of the example configuration above are:
+* `config.filesystems.local.docker.allow-soft-links` set to `true`
+* `config.submit-docker` containing `--bind /mnt/one:/mnt/one:ro --bind /mnt/two:/mnt/two:ro`
+
+In this example the two directories `/mnt/one` and `/mnt/two` will also be available within containers at the same
+paths they have outside the container. So soft links pointing to paths under those directories will resolve during
+job execution. Note that if a user runs a workflow using an input file `/mnt/three/path/to/file`, the job will fail
+during execution because `/mnt/three` is not present inside the running container.
+
 ### Additional FileSystems

 HPC backends (as well as the Local backend) can be configured to be able to interact with other types of filesystems, where the input files can be located for example.
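
As a rough illustration of the bind-mount rule described in the `docs/backends/HPC.md` addition above, the following Scala sketch checks whether an input path falls under one of the directories bound into the container. `BindMountCheck`, `isVisibleInContainer`, and the `boundDirs` argument are hypothetical names that are not part of Cromwell, and the check is purely lexical: it does not follow symbolic links or touch the filesystem.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical helper, not part of Cromwell: decides whether a path would be
// reachable inside a container that only binds the given directories at
// their original locations (as with `--bind /mnt/one:/mnt/one:ro`).
object BindMountCheck {
  def isVisibleInContainer(target: Path, boundDirs: Seq[Path]): Boolean = {
    // Normalize so that segments such as "." and ".." do not defeat the prefix test.
    val normalized = target.toAbsolutePath.normalize
    // Path.startsWith compares whole name elements, so "/mnt/onefoo"
    // is not treated as being under "/mnt/one".
    boundDirs.exists(dir => normalized.startsWith(dir.toAbsolutePath.normalize))
  }

  // Assumed bind list, mirroring the example configuration.
  val exampleBoundDirs: Seq[Path] = Seq(Paths.get("/mnt/one"), Paths.get("/mnt/two"))
}
```

With the assumed bind list, a path such as `/mnt/one/data/file.txt` passes the check, while `/mnt/three/path/to/file` does not, which corresponds to the failure case described in the documentation.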

supportedBackends/sfs/src/main/scala/cromwell/backend/sfs/SharedFileSystem.scala

Lines changed: 3 additions & 2 deletions
@@ -115,6 +115,7 @@ trait SharedFileSystem extends PathFactory {
   def sharedFileSystemConfig: Config
   lazy val maxHardLinks: Int =
     sharedFileSystemConfig.getOrElse[Int]("max-hardlinks", 950) // Windows limit 1024. Keep a safe margin.
+  lazy val dockerAllowSoftLinks: Boolean = sharedFileSystemConfig.getOrElse("docker.allow-soft-links", false)
   lazy val cachedCopyDir: Option[Path] = None

   private def localizePathViaCachedCopy(originalPath: Path, executionPath: Path, docker: Boolean): Try[Unit] = {
@@ -217,10 +218,10 @@ trait SharedFileSystem extends PathFactory {
   }

   private def createStrategies(configStrategies: Seq[String], docker: Boolean): Seq[DuplicationStrategy] = {
-    // If localizing for a docker job, remove soft-link as an option
+    // If localizing for a docker job, by default remove soft-link as an option
     // If no cachedCopyDir is defined, cached-copy can not be used and is removed.
     val filteredConfigStrategies = configStrategies filter {
-      case "soft-link" if docker => false
+      case "soft-link" if docker => dockerAllowSoftLinks
       case "cached-copy" if cachedCopyDir.isEmpty => false
       case _ => true
     }
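
The effect of the changed filter can be sketched in isolation. The snippet below is a simplified, standalone approximation rather than Cromwell's actual code: it works on plain strategy names instead of `DuplicationStrategy` values, and `StrategyFilterSketch`, `filterStrategies`, and the `cachedCopyDirDefined` flag are hypothetical stand-ins for the trait members shown in the diff.

```scala
// Simplified sketch (assumption: strategies are represented by their config names only).
object StrategyFilterSketch {
  def filterStrategies(configStrategies: Seq[String],
                       docker: Boolean,
                       dockerAllowSoftLinks: Boolean,
                       cachedCopyDirDefined: Boolean): Seq[String] =
    configStrategies.filter {
      // For docker jobs, soft-link is kept only when explicitly allowed.
      case "soft-link" if docker => dockerAllowSoftLinks
      // cached-copy needs a cache directory; without one it is dropped.
      case "cached-copy" if !cachedCopyDirDefined => false
      // Every other strategy passes through unchanged.
      case _ => true
    }
}
```

For a docker job configured with `["soft-link", "copy"]`, the sketch keeps both strategies when the flag is `true` and only `copy` when it is `false`. If `soft-link` is the only configured strategy and the flag is unset, nothing remains for a docker job, which is the situation the new failure test in `SharedFileSystemSpec` exercises.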

supportedBackends/sfs/src/test/scala/cromwell/backend/sfs/SharedFileSystemSpec.scala

Lines changed: 29 additions & 0 deletions
@@ -29,6 +29,12 @@ class SharedFileSystemSpec
   private val cachedCopyLocalization = ConfigFactory.parseString(""" localization: [cached-copy] """)
   private val cachedCopyLocalizationMaxHardlinks =
     ConfigFactory.parseString("""{localization: [cached-copy], max-hardlinks: 3 }""")
+  private val softLinkDockerLocalization = ConfigFactory.parseString(
+    """
+      |localization: [soft-link]
+      |docker.allow-soft-links: true
+      |""".stripMargin
+  )
   private val localPathBuilder = List(DefaultPathBuilder)

   def localizationTest(config: Config,
@@ -104,6 +110,7 @@ class SharedFileSystemSpec

   it should "localize a file via symbolic link" in {
     localizationTest(softLinkLocalization, docker = false, symlink = true)
+    localizationTest(softLinkDockerLocalization, docker = true, symlink = true)
   }

   it should "localize a file via cached copy" in {
@@ -182,6 +189,28 @@ class SharedFileSystemSpec
     dests.foreach(_.delete(swallowIOExceptions = true))
   }

+  it should "throw a fatal exception if docker soft link localization fails" in {
+    val callDir = DefaultPathBuilder.createTempDirectory("SharedFileSystem")
+    val orig = DefaultPathBuilder.createTempFile("inputFile")
+    val testText =
+      """This is a simple text to check if the localization
+        | works correctly for the file contents.
+        |""".stripMargin
+    orig.touch()
+    orig.writeText(testText)
+
+    val inputs = fqnWdlMapToDeclarationMap(Map("input" -> WomSingleFile(orig.pathAsString)))
+    val sharedFS: SharedFileSystem = new SharedFileSystem {
+      override val pathBuilders: PathBuilders = localPathBuilder
+      override val sharedFileSystemConfig: Config = softLinkLocalization
+
+      implicit override def actorContext: ActorContext = null
+    }
+    val result = sharedFS.localizeInputs(callDir, docker = true)(inputs)
+    result.isFailure shouldBe true
+    result.failed.get.isInstanceOf[CromwellFatalExceptionMarker] shouldBe true
+  }
+
   private[this] def countLinks(file: Path): Int = file.getAttribute("unix:nlink").asInstanceOf[Int]

   private[this] def isSymLink(file: Path): Boolean = file.isSymbolicLink
