Replies: 1 comment 3 replies
-
These appear to both be native methods, so how are you calling them in C#?
The method call is documented as being async anyway, so the fact that you haven't had an issue yet is probably a minor miracle.
From the look of the library, it appears that the intended pattern is to submit a bunch of work to a stream, then poll/wait on events or other related streams. If this were a game engine, you'd probably be checking at the start of each frame or something, and freeing/reusing based on that.
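A rough sketch of that "pin, submit, poll, then free" pattern in C#, in case it helps. `PinnedTransferTracker` and the `isDone` delegate are made-up names for illustration, not part of any CUDA binding. The assumption is that in real code `isDone` would wrap something like a `cuEventQuery` call on an event recorded on the stream right after the async copy was queued.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical helper: keeps arrays pinned until a caller-supplied
// completion check reports that the GPU no longer needs them.
public sealed class PinnedTransferTracker
{
    private readonly List<(GCHandle Handle, Func<bool> IsDone)> _pending =
        new List<(GCHandle Handle, Func<bool> IsDone)>();

    // Pins the array and returns its address for the async copy call.
    // isDone is an assumption: it stands in for whatever completion
    // query the native library offers (an event query, for example).
    public IntPtr Pin(Array data, Func<bool> isDone)
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        _pending.Add((handle, isDone));
        return handle.AddrOfPinnedObject();
    }

    // Call periodically (for example once per frame). Frees every handle
    // whose transfer has completed and returns how many were released.
    public int ReleaseCompleted()
    {
        int released = 0;
        for (int i = _pending.Count - 1; i >= 0; i--)
        {
            var entry = _pending[i];
            if (entry.IsDone())
            {
                entry.Handle.Free();   // the GC may move the array again
                _pending.RemoveAt(i);
                released++;
            }
        }
        return released;
    }

    // Number of arrays that are still pinned.
    public int PendingCount => _pending.Count;
}
```

The idea would be to call `Pin` just before queuing the copy, pass the returned pointer to the native call, and call `ReleaseCompleted` wherever you already poll the stream. Whether the cost of `GCHandle.Alloc` matters here is something you'd have to measure.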
You can't use |
-
CUDA can use streams, which are essentially asynchronous pipelines. A synchronous method looks like this, for example:
CUresult cuMemcpy (CUdeviceptr dst, CUdeviceptr src, size_t ByteCount)
whereas the async variant is
CUresult cuMemcpyAsync (CUdeviceptr dst, CUdeviceptr src, size_t ByteCount, CUstream hStream)
which takes an additional stream parameter. The stream parameter (if used) then flows from one CUDA function call to the next. The stream can be seen as a wait handle: each function waits until the stream is signalled by the preceding method.
While this works well with unmanaged memory or value types, it becomes a problem with unpinned managed memory.
Normally, every time I have to send memory to the GPU, a pointer is involved, which forces me to pin the data (mostly arrays of integral data types). Once the method has finished, I can unpin it again. This works well for the synchronous case.
The async case is a bit trickier, since I don't know when the method has finished copying.
I assume, as the comment in the example above says, that the array is unpinned at the closing curly bracket, which means the GC is free to move or even collect the unused memory. While you can prevent the collection with
GC.KeepAlive()
the unpinned data is the thing that hurts more. So far I haven't experienced problems because I'm not working with streams. But without streams you are losing performance. There are also no callbacks that tell me when I could unpin, but even then I would have to work with
GCHandle.Alloc()
which I'm not sure about in such a performance-sensitive area. Is there any solution or opinion on how to solve this?