Open
Labels
OCL CPU Experimental RT (Issues in Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support), bug (Something isn't working), confirmed
Description
Hi all,
the new cl_khr_integer_dot_product addition in the 2025.1 release is broken.
- On Windows (2025.19.3.0.17_230222), the __opencl_c_integer_dot_product_input_4x8bit and __opencl_c_integer_dot_product_input_4x8bit_packed feature macros are now present, but the dot(char4, char4) / dot_acc_sat(char4, char4, int) functions fail to compile with the errors "instructions in function CompilerException Failed to lookup symbol add_kernel JIT session error: Symbols not found: [ _Z3dotDv4_cS_ ]" / "[ _Z11dot_acc_satDv4_cS_i ]".
- On Linux (2025.19.3.0.17_230222), both

      int dp4a(const char4 a, const char4 b, const int c) {
          return c+dot(a, b); // 0.020 TIOPs/s
      }

  and

      int dp4a(const char4 a, const char4 b, const int c) {
          return dot_acc_sat(a, b, c); // 0.015 TIOPs/s
      }

  perform much slower than the emulation variant

      int dp4a(const char4 a, const char4 b, const int c) {
          return c+a.x*b.x+a.y*b.y+a.z*b.z+a.w*b.w; // 0.064 TIOPs/s
      }

  as measured on my i7-8700K CPU with my OpenCL-Benchmark. The full dp4a function implementation is here. The performance behavior is the same on AMD Ryzen 9 7950X.
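For anyone checking a workaround, the three dp4a variants above should agree whenever the accumulation stays within int range, and differ only in that dot_acc_sat clamps instead of wrapping. A minimal host-side reference in plain C could be sketched as below. It is an illustration and not taken from the runtime or from OpenCL-Benchmark: the char4_ref type and the dot_ref, dp4a_ref, dp4a_sat_ref and dp4a_ref_self_check names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL C's char4 (four signed 8-bit lanes). */
typedef struct { int8_t x, y, z, w; } char4_ref;

/* 4x8-bit signed dot product, written like the emulation variant. */
static int dot_ref(char4_ref a, char4_ref b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Models c + dot(a, b). Assumption: the caller keeps the sum within int
 * range, because signed overflow is undefined behavior in host C. */
static int dp4a_ref(char4_ref a, char4_ref b, int c) {
    return c + dot_ref(a, b);
}

/* Models dot_acc_sat(a, b, c): accumulate in 64 bits, then clamp to int. */
static int dp4a_sat_ref(char4_ref a, char4_ref b, int c) {
    int64_t sum = (int64_t)c + dot_ref(a, b);
    if (sum > INT_MAX) return INT_MAX;
    if (sum < INT_MIN) return INT_MIN;
    return (int)sum;
}

/* Small self-check: both variants agree without overflow, and the
 * saturating variant clamps at INT_MAX. Returns 1 if all asserts pass. */
static int dp4a_ref_self_check(void) {
    char4_ref a = { 1, -2, 3, -4 };
    char4_ref b = { 5, 6, -7, 8 };
    char4_ref m = { 127, 127, 127, 127 };
    assert(dp4a_ref(a, b, 100) == 40);      /* 5 - 12 - 21 - 32 + 100 */
    assert(dp4a_sat_ref(a, b, 100) == 40);
    assert(dp4a_sat_ref(m, m, INT_MAX) == INT_MAX);
    return 1;
}
```

Comparing kernel output against such a reference on a few inputs only confirms correctness of the emulation variant; it says nothing about the performance gap reported above.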
Kind regards,
Moritz