|
1 | 1 | # Known issues
|
2 | 2 |
|
| 3 | +## Julia module precompilation |
| 4 | + |
| 5 | +If multiple MPI ranks trigger Julia's module precompilation, then a race condition can result in an error such as: |
| 6 | +``` |
| 7 | +ERROR: LoadError: IOError: mkdir: file already exists (EEXIST) |
| 8 | +Stacktrace: |
| 9 | + [1] uv_error at ./libuv.jl:97 [inlined] |
| 10 | + [2] mkdir(::String; mode::UInt16) at ./file.jl:177 |
| 11 | + [3] mkpath(::String; mode::UInt16) at ./file.jl:227 |
| 12 | + [4] mkpath at ./file.jl:222 [inlined] |
| 13 | + [5] compilecache_path(::Base.PkgId) at ./loading.jl:1210 |
| 14 | + [6] compilecache(::Base.PkgId, ::String) at ./loading.jl:1240 |
| 15 | + [7] _require(::Base.PkgId) at ./loading.jl:1029 |
| 16 | + [8] require(::Base.PkgId) at ./loading.jl:927 |
| 17 | + [9] require(::Module, ::Symbol) at ./loading.jl:922 |
| 18 | + [10] include(::Module, ::String) at ./Base.jl:377 |
| 19 | + [11] exec_options(::Base.JLOptions) at ./client.jl:288 |
| 20 | + [12] _start() at ./client.jl:484 |
| 21 | +``` |
| 22 | + |
| 23 | +See [julia issue #30174](https://github.com/JuliaLang/julia/pull/30174) for more discussion of this problem. Pkg operations have similar issues; see [Pkg issue #1219](https://github.com/JuliaLang/Pkg.jl/issues/1219). |
| 24 | + |
| 25 | +This can be worked around by either: |
| 26 | + |
| 27 | +1. Triggering precompilation before launching MPI processes, for example: |
| 28 | + |
| 29 | + ``` |
| 30 | + julia --project -e 'using Pkg; pkg"instantiate"' |
| 31 | + julia --project -e 'using Pkg; pkg"precompile"' |
| 32 | + mpiexec julia --project script.jl |
| 33 | + ``` |
| 34 | + |
| 35 | +2. Launching julia with the `--compiled-modules=no` option. This can result in much longer package load times. |
| 36 | + |
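As a rough sketch of the first workaround, the serial precompilation steps and the parallel launch could be wrapped in one shell function. The function name `launch_precompiled` and the `JULIA` and `MPIEXEC` override variables are hypothetical names chosen for illustration; they are not part of MPI.jl. The sketch assumes `Pkg.instantiate()` and `Pkg.precompile()` are acceptable stand-ins for the `pkg"..."` forms shown above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of MPI.jl): precompile in a single
# process first, then launch the MPI job.
# JULIA and MPIEXEC are illustrative overrides for the two executables.
JULIA="${JULIA:-julia}"
MPIEXEC="${MPIEXEC:-mpiexec}"

launch_precompiled() {
    script="$1"
    # Serial steps: only one process writes to the compile cache,
    # so the mkdir race between ranks cannot occur here.
    "$JULIA" --project -e 'using Pkg; Pkg.instantiate()' || return 1
    "$JULIA" --project -e 'using Pkg; Pkg.precompile()' || return 1
    # Parallel step: the ranks should now find an existing cache.
    "$MPIEXEC" "$JULIA" --project "$script"
}
```

Calling `launch_precompiled script.jl` then runs the same three commands as in workaround 1, stopping early if either serial step fails. For the second workaround, the launch line would instead pass `--compiled-modules=no` to `julia`, with no precompilation step needed.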
3 | 37 | ## UCX
|
4 | 38 |
|
5 | 39 | [UCX](https://www.openucx.org/) is a communication framework used by several MPI implementations.
|
6 | 40 |
|
7 | 41 | ### Memory cache
|
8 | 42 |
|
9 | 43 | When used with CUDA, UCX intercepts `cudaMalloc` so it can determine whether the pointer passed to MPI is on the host (main memory) or the device (GPU). Unfortunately, there are several known issues with how this works with Julia:
|
10 |
| -- https://github.com/openucx/ucx/issues/5061 |
11 |
| -- https://github.com/openucx/ucx/issues/4001 (fixed in UCX v1.7.0) |
| 44 | +- [UCX issue #5061](https://github.com/openucx/ucx/issues/5061) |
| 45 | +- [UCX issue #4001](https://github.com/openucx/ucx/issues/4001) (fixed in UCX v1.7.0) |
12 | 46 |
|
13 | 47 | By default, MPI.jl disables this by setting
|
14 | 48 | ```
|
|