Commit 62d7bfd

Author: Pietro Vertechi
Commit message: document support for CUDAnative
1 parent 4483b57 commit 62d7bfd

1 file changed: +27 -0 lines changed
README.md

Lines changed: 27 additions & 0 deletions
@@ -248,3 +248,30 @@ false
Since the original array `dest` cannot hold the input, a new array is created (`ans !== dest`).

Combined with [function barriers](https://docs.julialang.org/en/latest/manual/performance-tips/#kernel-functions-1), `append!!` is a useful building block for implementing `collect`-like functions.

## Advanced: using StructArrays in CUDA kernels

It is possible to combine StructArrays with [CUDAnative](https://github.com/JuliaGPU/CUDAnative.jl), in order to create CUDA kernels that work on StructArrays directly on the GPU. Make sure you are familiar with the CUDAnative documentation (esp. kernels with plain `CuArray`s) before experimenting with kernels based on `StructArray`s.

```julia
using CUDAnative, CuArrays, StructArrays
d = StructArray(a = rand(100), b = rand(100))

# move to GPU
dd = replace_storage(CuArray, d)
de = similar(dd)

# a simple kernel, to copy the content of `dd` onto `de`
function kernel!(dest, src)
    i = (blockIdx().x-1)*blockDim().x + threadIdx().x
    if i <= length(dest)
        dest[i] = src[i]
    end
    return nothing
end

threads = 1024
blocks = cld(length(dd),threads)

@cuda threads=threads blocks=blocks kernel!(de, dd)
```
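
The copy kernel above moves whole rows. As a further illustration, here is a minimal sketch of a kernel that reads individual fields of each row and then checks the result on the CPU. It is not part of the commit: the name `swap_fields!`, the variable `host`, and the choice of 256 threads per block are assumptions made for this example, and it assumes that `replace_storage(Array, de)` converts each GPU column back to a plain `Array`, mirroring the `replace_storage(CuArray, d)` call used above.

```julia
using CUDAnative, CuArrays, StructArrays

# Example data with the same layout as `d` above
d = StructArray(a = rand(100), b = rand(100))
dd = replace_storage(CuArray, d)   # move each column to the GPU
de = similar(dd)

# Hypothetical kernel: copy each row of `src` into `dest` with the fields swapped
function swap_fields!(dest, src)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(dest)
        row = src[i]                       # a NamedTuple with fields `a` and `b`
        dest[i] = (a = row.b, b = row.a)
    end
    return nothing
end

threads = 256                              # assumed block size for this sketch
blocks = cld(length(dd), threads)
@cuda threads=threads blocks=blocks swap_fields!(de, dd)

# Copy each column back to the CPU and compare against the original data
host = replace_storage(Array, de)
@assert host.a == d.b
@assert host.b == d.a
```

Checking against the CPU copy is a cheap way to confirm that a kernel indexes the `StructArray` as intended before moving on to more involved computations.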
