Commit 4a3ecf8

Update CUDA build docs
1 parent 9751a71 commit 4a3ecf8

1 file changed: +20 -2 lines changed
docs/doc_sources/beginners_guides/installation.rst

Lines changed: 20 additions & 2 deletions
@@ -159,13 +159,31 @@ The following plugins from CodePlay are supported:
 .. _codeplay_nv_plugin: https://developer.codeplay.com/products/oneapi/nvidia/
 .. _codeplay_amd_plugin: https://developer.codeplay.com/products/oneapi/amd/

-``dpctl`` can be built for CUDA devices as follows:
+``dpctl`` can be built for CUDA devices using the ``DPCTL_TARGET_CUDA`` CMake option,
+which accepts a specific compute architecture string:
+
+.. code-block:: bash
+
+    python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=sm_80"
+
+To use the default architecture (``sm_50``), use:

 .. code-block:: bash

     python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=ON"

-And for AMD devices
+Note that kernels are built for ``sm_50`` by default, allowing them to work on a wider
+range of architectures, but limiting the usage of more recent CUDA features.
+
+For reference, compute architecture strings like ``sm_80`` are based on
+CUDA Compute Capability. A complete mapping between NVIDIA GPU models and their
+respective ``sm_XX`` values can be found in the official
+`CUDA GPU Compute Capability <https://developer.nvidia.com/cuda-gpus>`_.
+
+A full list of available SYCL alias targets is available in the
+`DPC++ Compiler User Manual <https://intel.github.io/llvm/UsersManual.html>`_.
+
+To build for AMD devices, use:

 .. code-block:: bash

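The relationship between a Compute Capability value and the ``sm_XX`` string described in the diff can be sketched in a few lines of Python. This is an illustrative sketch only, not part of ``dpctl`` or of this commit: the helper names ``sm_from_compute_capability`` and ``build_command`` are hypothetical, and the sketch assumes the architecture string is formed by joining the major and minor digits of the capability (for example ``8.0`` becoming ``sm_80``). Only the script path and the ``DPCTL_TARGET_CUDA`` option are taken from the diff.

```python
import shlex


def sm_from_compute_capability(capability: str) -> str:
    """Turn a Compute Capability such as "8.0" into an "sm_80" style string.

    Hypothetical helper: assumes the string is "sm_" followed by the
    major and minor version digits with the dot removed.
    """
    major, sep, minor = capability.strip().partition(".")
    if not sep or not major.isdigit() or not minor.isdigit():
        raise ValueError(f"expected '<major>.<minor>', got {capability!r}")
    return f"sm_{int(major)}{int(minor)}"


def build_command(target: str = "ON") -> list[str]:
    """Compose the build invocation shown in the diff.

    Hypothetical helper: target="ON" selects the default architecture,
    while a value such as "sm_80" requests a specific one.
    """
    return [
        "python",
        "scripts/build_locally.py",
        "--verbose",
        f"--cmake-opts=-DDPCTL_TARGET_CUDA={target}",
    ]


if __name__ == "__main__":
    # Print the command for a GPU with Compute Capability 8.0.
    arch = sm_from_compute_capability("8.0")
    print(shlex.join(build_command(arch)))
```

Returning the command as a list of arguments keeps quoting explicit if it is later passed to ``subprocess.run``; the sketch only prints it and does not start a build.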