Commit d35dd3e

Update README to describe new structure
1 parent f06259d commit d35dd3e

2 files changed: +44 -2 lines changed

README.md

Lines changed: 42 additions & 0 deletions
@@ -37,3 +37,45 @@ $ git pull
 ```
 
 Refer to each example and/or exercise for detailed instructions on how to run it.
+
+Examples
+=========
+
+[Vector Addition](examples/vector_addition)
+--------------------------------------------
+
+This trivial example compares a simple vector addition in CUDA with an
+equivalent implementation in SYCL for CUDA. It also shows how to build an
+application with SYCL for CUDA using DPC++, for which an example CMake file
+is provided.
+
+[CUDA Interop](examples/cuda_interop)
+--------------------------------------------
+
+In this example, we re-use the trivial SYCL kernel we used in Vector Addition, but instead of writing the SYCL variant, we will keep the original CUDA kernel, only replacing the CUDA Runtime calls with the SYCL API.
+
+[Fortran Interface](examples/fortran_interface)
+--------------------------------------------
+
+This example demonstrates how to call a SYCL function from CUDA Fortran code.
+
+[MPI](examples/MPI)
+--------------------------------------------
+
+This example shows how to integrate MPI calls within the SYCL DAG using host tasks.
+
+
+[SGEMM Interop](examples/sgemm_interop)
+--------------------------
+
+This example demonstrates using SYCL's `host_task` for CUDA interoperability, calling cuBLAS's SGEMM routine for matrix multiplication.
+
+[Distributed (MPI) GEMM](examples/distrib_batch_gemm)
+--------------------------------------------
+
+This example combines the MPI and SGEMM Interop examples to distribute a matrix multiplication problem between MPI ranks.
+
+[Hashing Algorithms](examples/hashing)
+--------------------------------------------
+
+This example is slightly different: it benchmarks a series of hashing algorithms.
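
The Distributed (MPI) GEMM entry above mentions distributing a matrix multiplication problem between MPI ranks. The diff does not show how that example splits the work, so the following is only a minimal sketch of one common approach: give each rank a contiguous block of work items (for instance rows of a matrix or entries of a batch). It makes no MPI or SYCL calls, and the names `Block` and `block_for_rank` are hypothetical rather than taken from the repository.

```cpp
#include <cstddef>

// Hypothetical helper type: a contiguous block of work items owned by one rank.
struct Block {
  std::size_t offset;  // index of the first item owned by this rank
  std::size_t count;   // number of items owned by this rank
};

// Split `total` items across `num_ranks` ranks as evenly as possible.
// The first `total % num_ranks` ranks each receive one extra item.
// Assumes num_ranks > 0 and rank < num_ranks (not checked here).
inline Block block_for_rank(std::size_t total, std::size_t num_ranks,
                            std::size_t rank) {
  const std::size_t base = total / num_ranks;
  const std::size_t extra = total % num_ranks;
  const std::size_t count = base + (rank < extra ? 1 : 0);
  // Ranks before this one own `base` items each, plus one extra item
  // for every earlier rank that falls within the first `extra` ranks.
  const std::size_t offset = rank * base + (rank < extra ? rank : extra);
  return {offset, count};
}
```

With 10 items and 4 ranks, this sketch yields blocks of 3, 3, 2 and 2 items starting at offsets 0, 3, 6 and 8. In an MPI program, each rank would typically call such a helper with its own rank index and the communicator size.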

examples/cuda_interop/README.md

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
-Example 03: Calling CUDA kernels from SYCL
+Calling CUDA kernels from SYCL
 ===============================
 
-In this example, we re-use the trivial SYCL kernel we used on Example 1,
+In this example, we re-use the trivial SYCL kernel we used in Vector Addition,
 but instead of writing the SYCL variant, we will keep the original CUDA
 kernel, only replacing the CUDA Runtime calls with the SYCL API.
