This repository was archived by the owner on Apr 28, 2023. It is now read-only.

Commit c200a4e

gpu.h: add queryRegistersPerBlock
This platform-neutral function to query the number of registers will be used in an upcoming commit.
1 parent 1d9d6e3 commit c200a4e

1 file changed: +11, -0 lines changed

tc/core/gpu.h

Lines changed: 11 additions & 0 deletions
@@ -36,4 +36,15 @@ inline size_t querySharedMemorySize() {
 #endif
 }
 
+/// Get the maximum number of registers per block provided by the GPU device
+/// active in the current thread. The call is forwarded to the GPU driver.
+/// If the thread has no associated GPU, return 0.
+inline size_t queryRegistersPerBlock() {
+#if TC_WITH_CUDA && !defined(NO_CUDA_SDK)
+  return CudaGPUInfo::GPUInfo().RegistersPerBlock();
+#else
+  return 0;
+#endif
+}
+
 } // namespace tc
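
The commit message says the new function will be used in an upcoming commit but does not show a caller. The sketch below is one guess at how a caller might treat the documented return value of 0 as "no GPU, limit unknown". It is a standalone illustration, not code from this repository: the namespace `tc_sketch` and the helper `maxRegistersPerThread` are hypothetical, and the query function is stubbed to the non-CUDA branch so the example compiles without the CUDA SDK or `CudaGPUInfo`.

```cpp
#include <cassert>
#include <cstddef>

namespace tc_sketch {

// Stand-in for the function added in this commit. Only the fallback branch
// (no CUDA support compiled in) is reproduced, so it always returns 0.
inline std::size_t queryRegistersPerBlock() {
  return 0;
}

// Hypothetical helper, not part of the commit: derive an upper bound on the
// registers available to each thread of a block with `threadsPerBlock`
// threads. A result of 0 means the limit is unknown (no GPU was found).
inline std::size_t maxRegistersPerThread(std::size_t registersPerBlock,
                                         std::size_t threadsPerBlock) {
  assert(threadsPerBlock > 0);
  if (registersPerBlock == 0) {
    return 0;  // No associated GPU: propagate "unknown" instead of a bound.
  }
  return registersPerBlock / threadsPerBlock;
}

} // namespace tc_sketch
```

A caller would pass the result of `queryRegistersPerBlock()` as the first argument and skip any register-based pruning when the helper returns 0. Whether the upcoming commit uses the value this way is not stated in the source.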
