Commit f914632

docs: Update cuda installation and support docs
Added updated documentation for the dso type cuda support and the updated ofi mtl support.

Signed-off-by: William Zhang <wilzhang@amazon.com>
1 parent a6522fd commit f914632

1 file changed: +37 -3 lines changed

docs/networking/cuda.rst

Lines changed: 37 additions & 3 deletions
@@ -38,15 +38,32 @@ Open MPI offers two flavors of CUDA support:
 shell$ ./configure --prefix=/path/to/ucx-cuda-install --with-cuda=/usr/local/cuda --with-gdrcopy=/usr
 
 # Configure Open MPI this way
-shell$ ./configure --with-cuda=/usr/local/cuda --with-ucx=/path/to/ucx-cuda-install <other configure params>
+shell$ ./configure --with-cuda=/usr/local/cuda --with-cuda-libdir=/usr/local/cuda/lib64/stubs/ --with-ucx=/path/to/ucx-cuda-install <other configure params>
 
 #. Via internal Open MPI CUDA support
 
 Regardless of which flavor of CUDA support (or both) you plan to use,
 Open MPI should be configured using the ``--with-cuda=<path-to-cuda>``
-configure option to build CUDA support into Open MPI.
+and ``--with-cuda-libdir=<path-to-libcuda.so>`` configure options to
+build CUDA support into Open MPI.
 
-This affects the smcuda shared memory btl, as well as the uct btl.
+Open MPI supports building with CUDA libraries and running on systems
+without CUDA libraries or hardware. In order to take advantage of
+this functionality, when compiling, you have to specify the CUDA
+dependent components to be built as DSOs using the
+``--enable-mca-dso=<comma-delimited-list-of-cuda-components>``
+configure option.
+
+This affects the ``smcuda`` shared memory and ``uct`` BTLs, as well
+as the ``rgpusm`` and ``gpusm`` rcache components.
+
+An example configure command would look like the following:
+
+.. code-block:: sh
+
+   # Configure Open MPI this way
+   shell$ ./configure --with-cuda=/usr/local/cuda --with-cuda-libdir=/usr/local/cuda/lib64/stubs \
+       --enable-mca-dso=btl-smcuda,rcache-rgpusm,rcache-gpusm,accelerator-cuda <other configure params>
 
 /////////////////////////////////////////////////////////////////////////
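The configure example in the hunk above asks for the CUDA-dependent components to be built as DSOs. One way to confirm the result after installation is to look for the component files. The following is a minimal sketch, assuming DSO components are installed as `mca_<framework>_<component>.so` under `<prefix>/lib/openmpi` (the directory may differ on some systems, for example `lib64`). The helper names `dso_filename` and `check_cuda_dsos` and the prefix `/opt/openmpi` are hypothetical and are not part of Open MPI.

```shell
#!/bin/sh

# Map a "framework-component" name (the form passed to --enable-mca-dso)
# to the DSO file name assumed here: mca_<framework>_<component>.so
dso_filename() {
    framework=${1%%-*}   # text before the first '-'
    component=${1#*-}    # text after the first '-'
    printf 'mca_%s_%s.so\n' "$framework" "$component"
}

# Report which of the listed components exist as DSO files under a prefix.
# Assumption: components live in <prefix>/lib/openmpi.
# Returns 0 if every file is present, 1 otherwise.
check_cuda_dsos() {
    prefix=$1
    shift
    missing=0
    for name in "$@"; do
        file="$prefix/lib/openmpi/$(dso_filename "$name")"
        if [ -e "$file" ]; then
            echo "found:   $file"
        else
            echo "missing: $file"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (hypothetical install prefix):
# check_cuda_dsos /opt/openmpi btl-smcuda rcache-rgpusm rcache-gpusm accelerator-cuda
```

The component list in the example call mirrors the `--enable-mca-dso` value from the diff; a "missing" line would suggest that component was linked into the main library instead of being built as a DSO, or that the install layout differs from the assumption.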

@@ -124,6 +141,7 @@ CUDA-aware support is available in:
 
 * The UCX (``ucx``) PML
 * The PSM2 (``psm2``) MTL with the CM (``cm``) PML.
+* The OFI (``ofi``) MTL with the CM (``cm``) PML.
 * Both CUDA-ized shared memory (``smcuda``) and TCP (``tcp``) BTLs
   with the OB1 (``ob1``) PML.
 * The HCOLL (``hcoll``) COLL
@@ -152,6 +170,22 @@ For more information refer to the `Intel Omni-Path documentation
 
 /////////////////////////////////////////////////////////////////////////
 
+OFI support for CUDA
+---------------------
+
+CUDA-aware support is present in the OFI MTL. When running CUDA-aware
+Open MPI over Libfabric, the OFI MTL will check if there are any
+providers capable of handling GPU (or other accelerator) memory
+through the ``hmem``-related flags. If a CUDA-capable provider is
+available, the OFI MTL will directly send GPU buffers through
+Libfabric's API after registering the memory. If there are no
+CUDA-capable providers available, the buffers will automatically
+be copied to host buffers before being transferred through
+Libfabric's API.
+
+/////////////////////////////////////////////////////////////////////////
+
+
 How can I tell if Open MPI was built with CUDA support?
 -------------------------------------------------------
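The "OFI support for CUDA" section added in the hunk above says the OFI MTL looks for providers that advertise the ``hmem``-related flags. A rough command-line analogue is to ask Libfabric's `fi_info` utility to filter providers by the `FI_HMEM` capability. The sketch below assumes `fi_info` is installed, accepts `-c` as a capability filter, and prints a `provider:` line for each match. The function name `has_hmem_provider` is hypothetical, and the check does not reproduce the MTL's internal selection logic. A provider that reports `FI_HMEM` may also target an accelerator other than CUDA.

```shell
#!/bin/sh

# Return 0 if fi_info reports at least one provider when filtered by FI_HMEM,
# 1 if none are reported, and 2 if fi_info is not available.
has_hmem_provider() {
    if ! command -v fi_info >/dev/null 2>&1; then
        echo "fi_info not found in PATH" >&2
        return 2
    fi
    # Matching entries are assumed to start with a "provider:" line.
    fi_info -c FI_HMEM 2>/dev/null | grep -q '^provider:'
}

# Example use:
# if has_hmem_provider; then
#     echo "an FI_HMEM-capable provider is reported"
# else
#     echo "no FI_HMEM-capable provider reported; expect copies through host buffers"
# fi
```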
