@@ -92,8 +92,8 @@ git clone --config core.autocrlf=false https://github.com/intel/llvm -b sycl
## Build DPC++ toolchain

The easiest way to get started is to use the buildbot
- [configure](../../buildbot/configure.py) and
- [compile](../../buildbot/compile.py) scripts.
+ [configure](https://github.com/intel/llvm/blob/sycl/buildbot/configure.py) and
+ [compile](https://github.com/intel/llvm/blob/sycl/buildbot/compile.py) scripts.
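
As a rough sketch of what running these scripts looks like (assuming the
`$DPCPP_HOME/llvm` checkout layout used elsewhere in this guide):

```bash
# Generate the CMake build tree (by default under $DPCPP_HOME/llvm/build).
python $DPCPP_HOME/llvm/buildbot/configure.py

# Build the toolchain using the configuration produced above.
python $DPCPP_HOME/llvm/buildbot/compile.py
```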

In case you want to configure CMake manually, the up-to-date reference for
variables is in these files. Note that the CMake variables set by default by the [configure.py](../../buildbot/configure.py) script are the ones commonly used by
@@ -237,21 +237,21 @@ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$DPCPP_HOME/llvm/build/lib ./a.out

### Build DPC++ toolchain with support for HIP AMD

- There is beta support for oneAPI DPC++ for HIP on AMD devices. It is not feature
- complete and it still contains known and unknown bugs. Currently it has only
- been tried on Linux, with ROCm 4.2.0, 4.3.0, 4.5.2, 5.3.0, and 5.4.3, using the
- AMD Radeon Pro W6800 (gtx1030), MI50 (gfx906), MI100 (gfx908) and MI250x
- (gfx90a) devices. The backend is tested by a relevant device/toolkit prior to a
- oneAPI plugin release. Go to the plugin release
- [pages](https://developer.codeplay.com/products/oneapi/amd) for further details.
-
To enable support for HIP devices, follow the instructions for the Linux DPC++
toolchain, but add the `--hip` flag to `configure.py`.
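
For illustration, a minimal configure invocation with this flag (same buildbot
script assumptions as the sketch above):

```bash
# Configure a DPC++ build with the HIP (AMD) backend enabled.
python $DPCPP_HOME/llvm/buildbot/configure.py --hip
```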

Enabling this flag requires an installation of ROCm on the system; for
instructions on how to install it, refer to the
[AMD ROCm Installation Guide for Linux](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html).

+ ROCm versions above 5.7 are recommended, as earlier versions don't have graph
+ support. DPC++ aims to support new ROCm versions as they come out, so there may
+ be a delay, but generally the latest ROCm version should work. The ROCm support
+ is mostly tested on AMD Radeon Pro W6800 (gfx1030) and MI250x (gfx90a); however,
+ other architectures supported by LLVM may work just fine. The full list of ROCm
+ versions tested prior to oneAPI releases is listed on the plugin release
+ [pages](https://developer.codeplay.com/products/oneapi/amd).
+
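
One way to check the installed ROCm release before configuring (a sketch; the
`.info/version` file is assumed to be present in a default `/opt/rocm` install):

```bash
# Print the installed ROCm version, e.g. "5.7.1-...".
cat /opt/rocm/.info/version
```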

The DPC++ build assumes that ROCm is installed in `/opt/rocm`; if it is
installed somewhere else, the directory must be provided through the CMake
variable `UR_HIP_ROCM_DIR`, which can be passed through to cmake using the
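
A sketch of overriding the ROCm location at configure time (this assumes
`configure.py` forwards extra CMake options via a `--cmake-opt` flag, which is
not shown in this hunk, and uses a hypothetical install prefix):

```bash
# Point the HIP plugin at a non-default ROCm installation.
python $DPCPP_HOME/llvm/buildbot/configure.py --hip \
  --cmake-opt=-DUR_HIP_ROCM_DIR=/usr/local/rocm
```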
@@ -280,7 +280,10 @@ by default when configuring for HIP. For more details on building LLD refer to

### Build DPC++ toolchain with support for HIP NVIDIA

- There is experimental support for oneAPI DPC++ for HIP on Nvidia devices.
+ HIP applications can be built to target Nvidia GPUs, so in theory it is possible
+ to build the DPC++ HIP support for Nvidia; however, this is not supported, so it
+ may not work.
+
There is no continuous integration for this and there are no guarantees for
supported platforms or configurations.

@@ -292,13 +295,12 @@ To enable support for HIP NVIDIA devices, follow the instructions for the Linux
DPC++ toolchain, but add the `--hip` and `--hip-platform NVIDIA` flags to
`configure.py`.
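
A corresponding configure sketch (same buildbot script assumptions as above):

```bash
# Configure a HIP build that targets Nvidia GPUs instead of AMD.
python $DPCPP_HOME/llvm/buildbot/configure.py --hip --hip-platform NVIDIA
```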

- Enabling this flag requires HIP to be installed, more specifically
- [HIP NVCC](https://rocmdocs.amd.com/en/latest/Installation_Guide/HIP-Installation.html#nvidia-platform),
- as well as the CUDA Runtime API to be installed, see
- [NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).
-
- Currently, this has only been tried on Linux, with ROCm 4.2.0 or 4.3.0, with
- CUDA 11, and using a GeForce 1060 device.
+ Enabling this flag requires HIP to be installed, specifically for Nvidia; see
+ the Nvidia tab of the HIP installation docs
+ [here](https://rocm.docs.amd.com/projects/HIP/en/latest/install/install.html).
+ The CUDA Runtime API must also be installed; see the
+ [NVIDIA CUDA Installation Guide for
+ Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).

### Build DPC++ toolchain with support for ARM processors

@@ -736,14 +738,6 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
The results are correct!
```

- **NOTE**: Currently, when the application has been built with the CUDA target,
- the CUDA backend must be selected at runtime using the `ONEAPI_DEVICE_SELECTOR`
- environment variable.
-
- ```bash
- ONEAPI_DEVICE_SELECTOR=cuda:* ./simple-sycl-app-cuda.exe
- ```
-
**NOTE**: oneAPI DPC++/SYCL developers can specify the SYCL device for execution
using device selectors (e.g. `sycl::cpu_selector_v`, `sycl::gpu_selector_v`,
[Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.asciidoc))
@@ -777,6 +771,14 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
-Xsycl-target-backend --cuda-gpu-arch=sm_80
```

+ Additionally, AMD and Nvidia targets also support aliases for the target to
+ simplify passing the specific architectures; for example,
+ `-fsycl-targets=nvidia_gpu_sm_80` is equivalent to
+ `-fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend
+ --cuda-gpu-arch=sm_80`. The full list of available aliases is documented in the
+ [Users Manual](UsersManual.md#generic-options), under the `-fsycl-targets`
+ option.
+
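
As a sketch (reusing the `simple-sycl-app.cpp` example from earlier in this
guide), the two spellings below are equivalent:

```bash
# Alias form.
clang++ -fsycl -fsycl-targets=nvidia_gpu_sm_80 \
        simple-sycl-app.cpp -o simple-sycl-app

# Explicit triple plus backend flag, equivalent to the alias above.
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
        -Xsycl-target-backend --cuda-gpu-arch=sm_80 \
        simple-sycl-app.cpp -o simple-sycl-app
```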

To build simple-sycl-app ahead of time for GPU, CPU or Accelerator devices,
specify the target architecture. The examples provided use a supported
alias for the target, representing a full triple. Additional details can
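
One possible ahead-of-time invocation (the `spir64_x86_64` CPU target below is
an assumption drawn from the Users Manual, not from this hunk):

```bash
# AOT-compile the example for a CPU device via the spir64_x86_64 target.
clang++ -fsycl -fsycl-targets=spir64_x86_64 \
        simple-sycl-app.cpp -o simple-sycl-app
```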
@@ -945,11 +947,14 @@ int CUDASelector(const sycl::device &Device) {

### HIP back-end limitations

- * Requires a ROCm compatible operating system, for full details of supported
-   Operating System for ROCm, please refer to the
-   [ROCm Supported Operating Systems](https://github.com/RadeonOpenCompute/ROCm#supported-operating-systems).
- * Support is still in a beta state, but the backend is being actively developed.
- * Global offsets are currently not supported.
+ * Requires a ROCm compatible system and GPU; see the supported SKUs for
+   [Linux](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-skus)
+   and for
+   [Windows](https://rocm.docs.amd.com/projects/install-on-windows/en/latest/reference/system-requirements.html#supported-skus).
+ * Windows for HIP is not supported by DPC++ at the moment, so it may not work.
+ * `printf` within kernels is not supported.
+ * C++ standard library functions using complex types are not supported;
+   `sycl::complex` should be used instead.

## Find More