This repo provides solutions in Rust to the exercises created by Simon McIntosh-Smith and Tom Deakin from the University of Bristol in the UK, with financial support from the Khronos Initiative for Training and Education (KITE) to promote the use of open standards.
The slides are available under Releases. The code exercises and solutions for C, C++ and Python are available in the Exercises and Solutions repository.
As the slides are a little old and describe a different way of installing OpenCL, it may be necessary to read the documentation for the platform on which these solutions will be run. For testing, the solutions were run on an AMD integrated GPU. For installation of the OpenCL libraries and the OpenCL Installable Client Driver (ICD) for AMD GPUs, please have a look at ROCm.
The following installation was done on Fedora Linux 41 (Workstation Edition):
- Install amdgpu-install
# Follow the installation guide for RHEL
$ sudo dnf install clinfo OpenCL-ICD-Loader rocm-opencl
$ sudo dnf install https://repo.radeon.com/amdgpu-install/6.3.1/rhel/9.5/amdgpu-install-6.3.60301-1.el9.noarch.rpm
- Install opencl using the usecase option
# List the available use cases
$ sudo amdgpu-install --list-usecase
# Install the use case opencl
$ amdgpu-install --usecase=opencl
- Verify installation using clinfo
# Run clinfo to verify
$ clinfo
The OpenCL ICD is used to load the correct library for the device during runtime. This library is located as per the vendor file entry:
$ cat /etc/OpenCL/vendors/amdocl64_60301_48.icd
libamdocl64.so
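To illustrate what the ICD loader reads from these files, here is a minimal Rust sketch, using only the standard library, that lists the `.icd` files in `/etc/OpenCL/vendors` and prints the library each one names. The helper name `icd_library` is hypothetical, and the sketch assumes each file names its library on the first non-empty line, as in the output above.

```rust
use std::fs;
use std::path::Path;

/// Return the library named by an ICD vendor file: the first non-empty line, trimmed.
/// (Hypothetical helper for illustration; not part of any OpenCL crate.)
fn icd_library(contents: &str) -> Option<&str> {
    contents.lines().map(str::trim).find(|line| !line.is_empty())
}

fn main() {
    // Vendor directory used in the installation above.
    let vendors = Path::new("/etc/OpenCL/vendors");
    let entries = match fs::read_dir(vendors) {
        Ok(entries) => entries,
        Err(err) => {
            eprintln!("cannot read {}: {}", vendors.display(), err);
            return;
        }
    };
    for entry in entries.flatten() {
        let path = entry.path();
        // Only files ending in .icd are treated as vendor entries.
        if path.extension().and_then(|e| e.to_str()) != Some("icd") {
            continue;
        }
        match fs::read_to_string(&path) {
            Ok(text) => match icd_library(&text) {
                Some(lib) => println!("{} -> {}", path.display(), lib),
                None => println!("{} is empty", path.display()),
            },
            Err(err) => eprintln!("cannot read {}: {}", path.display(), err),
        }
    }
}
```

On a machine without any installed ICD the program prints nothing, or an error if the directory does not exist.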
Make sure the Rust toolchain is installed. Although the tutorials are based on the OpenCL 1.x standard, we can still follow most of the exercises using OpenCL 3.0. The opencl3 dependency does a lot of heavy lifting. However, it actively searches for libOpenCL.so in the LD_LIBRARY_PATH. From the above step, we need to create a symbolic link called libOpenCL.so
$ sudo ln -s /opt/rocm/lib/libamdocl64.so /opt/rocm/lib/libOpenCL.so
and add the path to the Cargo build using the following code, as present in build.rs:
// Set the path to the directory containing libamdocl64.so
println!("cargo:rustc-link-search=native=/opt/rocm/lib");
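A slightly fuller `build.rs` could look like the sketch below. It is not the file from this repo: the environment variable `OPENCL_LIB_DIR` and the helper `link_search_directive` are hypothetical names introduced here so that the directory is not hard-coded, with `/opt/rocm/lib` from above as the fallback. `cargo:rustc-link-search`, `cargo:rerun-if-env-changed` and `cargo:warning` are standard Cargo build-script directives.

```rust
use std::env;
use std::path::Path;

/// Build the Cargo directive that adds `dir` to the native library search path.
fn link_search_directive(dir: &str) -> String {
    format!("cargo:rustc-link-search=native={}", dir)
}

fn main() {
    // OPENCL_LIB_DIR is an assumed, user-chosen variable; fall back to the ROCm default.
    let dir = env::var("OPENCL_LIB_DIR").unwrap_or_else(|_| "/opt/rocm/lib".to_string());

    // Ask Cargo to re-run this script when the variable changes.
    println!("cargo:rerun-if-env-changed=OPENCL_LIB_DIR");

    // Warn early if the library the linker will look for is missing.
    if !Path::new(&dir).join("libOpenCL.so").exists() {
        println!("cargo:warning=libOpenCL.so not found in {}", dir);
    }

    println!("{}", link_search_directive(&dir));
}
```

With this variant, a different install location could be selected at build time, for example `OPENCL_LIB_DIR=/usr/lib64 cargo build`, without editing the script.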
clang-format might be needed to format the OpenCL kernels.
$ sudo dnf install clang-tools-extra
Simply running the binaries runs the solution for the given exercise. For example:
$ cargo run --bin Tutorial-1
- The solutions have been run only on a machine with a single OpenCL device; however, the tutorial repeatedly calls for comparisons between multiple devices and multiple device types.
- Not all solutions have been attempted here. For example, the solutions involving porting CUDA to OpenCL were not attempted.
- There is scope for the repo to be more rusty. For example, a builder pattern could be used to construct the kernel object, or library functions with generics could be created to copy memory from host to device. This can be part of future work. Creating such a library would abstract the OpenCL-related functions for easy integration.
- The exercises involving improvements measured by micro-benchmarks were not possible. Only one device was available, and this device performed better using global memory than when splitting the task into work groups and utilizing local memory. Extrae and Paraver are good tools, but ROCm also supports tools for benchmarking, such as ROC Profiler.
- The tutorial does not talk about extensions in OpenCL, which might be relevant depending on the application being optimized for heterogeneous parallel computing.
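
As a rough idea of what the "more rusty" future work mentioned above might look like, here is a minimal sketch of a builder for kernel launch parameters together with a generic helper for sizing a host-to-device copy. It makes no OpenCL calls: `KernelLaunch`, `KernelLaunchBuilder`, `buffer_bytes` and the kernel name `vadd` are all hypothetical, and a real version would wrap the corresponding opencl3 objects. The divisibility check reflects the OpenCL 1.x rule that the global work size must be a multiple of the local work size.

```rust
/// Launch parameters for a kernel (hypothetical type; no OpenCL calls are made).
#[derive(Debug, PartialEq)]
struct KernelLaunch {
    name: String,
    global_size: usize,
    local_size: Option<usize>,
}

/// Builder that collects the parameters step by step and validates them once.
struct KernelLaunchBuilder {
    name: String,
    global_size: Option<usize>,
    local_size: Option<usize>,
}

impl KernelLaunchBuilder {
    fn new(name: &str) -> Self {
        Self { name: name.to_string(), global_size: None, local_size: None }
    }

    fn global_size(mut self, n: usize) -> Self {
        self.global_size = Some(n);
        self
    }

    fn local_size(mut self, n: usize) -> Self {
        self.local_size = Some(n);
        self
    }

    fn build(self) -> Result<KernelLaunch, String> {
        let global = self
            .global_size
            .ok_or_else(|| "global work size is required".to_string())?;
        if let Some(local) = self.local_size {
            // OpenCL 1.x: the global size must be a multiple of the local size.
            if local == 0 || global % local != 0 {
                return Err(format!("global size {} is not a multiple of local size {}", global, local));
            }
        }
        Ok(KernelLaunch { name: self.name, global_size: global, local_size: self.local_size })
    }
}

/// Number of bytes a host slice would occupy in a device buffer, for any element type.
fn buffer_bytes<T>(host: &[T]) -> usize {
    std::mem::size_of_val(host)
}

fn main() {
    match KernelLaunchBuilder::new("vadd").global_size(1024).local_size(64).build() {
        Ok(l) => println!("{}: global={} local={:?}", l.name, l.global_size, l.local_size),
        Err(e) => eprintln!("invalid launch: {}", e),
    }
    let host = vec![0.0f32; 1024];
    println!("bytes to copy: {}", buffer_bytes(&host));
}
```

Validating in `build` keeps every setter infallible, so a chain of calls reads naturally and errors surface in one place before anything is handed to the OpenCL runtime.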