add basic support for SVM and USM capture and replay #419

Merged: 2 commits, May 22, 2025
5 changes: 4 additions & 1 deletion docs/capture_single_kernels.md
@@ -54,13 +54,16 @@ If the buffers don't agree, it will show a message in the terminal.
 * Device only buffers, i.e. those with `CL_MEM_HOST_NO_ACCESS`. When kernel capture is enabled, any device-only access flags are removed.
 * OpenCL Images
   * 2D and 3D images are supported.
+* OpenCL SVM and USM
+  * Pointers to the base of an SVM or USM allocation are supported.
 * OpenCL Samplers
 * OpenCL Kernels from source or IL
 * OpenCL Kernels from device binary

 ## Limitations (incomplete)

-* Does not work with OpenCL SVM or USM.
+* Does not work with pointers to the middle of an OpenCL SVM or USM allocation.
+* Does not work with SVM or USM indirect access, where the SVM or USM allocation is not set as a kernel argument.
 * Does not work with OpenCL pipes.
 * Untested for out-of-order queues.
 * Sub-buffers are not dealt with explicitly; this may affect the results for both debugging and performance.
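
The difference between a supported base pointer and an unsupported pointer into the middle of an allocation can be sketched as follows. This is a minimal illustration, not the layer's implementation: `classify_pointer` and the `allocations` mapping (base address to size in bytes) are hypothetical names, and addresses are modeled as plain integers.

```python
def classify_pointer(arg, allocations):
    """Classify a kernel-argument address against tracked allocations.

    allocations: hypothetical dict mapping base address -> size in bytes.
    Returns "base", "interior", or "unknown".
    """
    for base, size in allocations.items():
        if arg == base:
            return "base"      # supported for capture and replay
        if base < arg < base + size:
            return "interior"  # middle of an allocation: not supported
    return "unknown"           # not a tracked SVM or USM allocation


# Two made-up allocations: 256 bytes at 0x1000 and 64 bytes at 0x4000.
allocations = {0x1000: 256, 0x4000: 64}
for address in (0x1000, 0x1010, 0x9000):
    print(hex(address), classify_pointer(address, allocations))
```

Only the "base" case corresponds to what the documentation lists as supported; the other two cases fall under the limitations above.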
2 changes: 2 additions & 0 deletions intercept/scripts/run.py
@@ -162,8 +162,10 @@ def sampler_from_string(ctx, sampler_descr):
     try:
         prg = cl.Program(ctx, [device], [binaries[idx]]).build(options)
         getattr(prg, kernel_name)
+        print(f"Successfully loaded kernel device binary file: {binary_files[idx]}")
         break
     except Exception as e:
+        print(f"Failed to load kernel device binary file: {binary_files[idx]}")
         pass

 kernel = getattr(prg, kernel_name)
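
The hunk above tries each candidate device binary in turn and now reports which one loaded. The same try-in-order pattern can be sketched without `pyopencl`. Everything here is an assumption for illustration: `load_first_working`, `loader`, and `fake_loader` are hypothetical names, and the final `raise` when no candidate loads is an addition that the original snippet does not show.

```python
def load_first_working(candidates, loader):
    """Return (name, program) for the first candidate that loads.

    candidates: list of names standing in for device binary files.
    loader: hypothetical callable that raises an exception on failure.
    """
    for name in candidates:
        try:
            program = loader(name)
        except Exception as e:
            print(f"Failed to load {name}: {e}")
            continue
        print(f"Successfully loaded {name}")
        return name, program
    # Assumption: fail loudly instead of continuing with no program.
    raise RuntimeError("no candidate could be loaded")


def fake_loader(name):
    # Stand-in for building a program: only names ending in "_ok" succeed.
    if not name.endswith("_ok"):
        raise ValueError("incompatible binary")
    return f"program({name})"


result = load_first_working(["a_bad", "b_ok", "c_ok"], fake_loader)
```

Returning from inside the loop plays the role of `break` in the original code, and including the exception text in the failure message makes it easier to see why a given binary was rejected.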
20 changes: 20 additions & 0 deletions intercept/src/intercept.cpp
@@ -7706,6 +7706,16 @@ void CLIntercept::setKernelArgSVMPointer(
         CArgMemMap& argMemMap = m_KernelArgMemMap[ kernel ];
         argMemMap[ arg_index ] = startPtr;
     }
+
+    // Currently, only pointers to the start of an SVM allocation are supported for
+    // capture and replay.
+    if( arg == startPtr )
+    {
+        CArgDataMap& argDataMap = m_KernelArgDataMap[kernel];
+        const uint8_t* pRawArgData = reinterpret_cast<const uint8_t*>(&arg);
+        argDataMap[ arg_index ] = std::vector<uint8_t>(
+            pRawArgData, pRawArgData + sizeof(void*) );
+    }
 }

///////////////////////////////////////////////////////////////////////////////
@@ -7736,6 +7746,16 @@ void CLIntercept::setKernelArgUSMPointer(
         CArgMemMap& argMemMap = m_KernelArgMemMap[ kernel ];
         argMemMap[ arg_index ] = startPtr;
     }
+
+    // Currently, only pointers to the start of a USM allocation are supported for
+    // capture and replay.
+    if( arg == startPtr )
+    {
+        CArgDataMap& argDataMap = m_KernelArgDataMap[kernel];
+        const uint8_t* pRawArgData = reinterpret_cast<const uint8_t*>(&arg);
+        argDataMap[ arg_index ] = std::vector<uint8_t>(
+            pRawArgData, pRawArgData + sizeof(void*) );
+    }
 }

///////////////////////////////////////////////////////////////////////////////
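
Both added blocks record the raw bytes of the pointer value itself, and only when the argument equals the start of its allocation. A rough Python analogue of that bookkeeping is sketched below. `record_pointer_arg` and `arg_data` are hypothetical names, addresses are plain integers, and `struct.pack("P", ...)` stands in for copying `sizeof(void*)` bytes.

```python
import struct


def record_pointer_arg(arg_data, arg_index, arg, start_ptr):
    """Store the pointer's raw bytes only if it is the allocation base.

    arg_data: hypothetical dict mapping argument index -> raw bytes.
    Returns True if the argument was recorded, False otherwise.
    """
    if arg != start_ptr:
        return False  # interior pointer: nothing is recorded
    # "P" packs a native pointer-sized integer in native byte order.
    arg_data[arg_index] = struct.pack("P", arg)
    return True


arg_data = {}
recorded_base = record_pointer_arg(arg_data, 0, 0x1000, 0x1000)
recorded_mid = record_pointer_arg(arg_data, 1, 0x1010, 0x1000)

# The stored bytes round-trip back to the original address.
restored = struct.unpack("P", arg_data[0])[0]
```

Note that only the pointer value is stored here, not the contents of the allocation.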