[SYCL][NATIVECPU][PERF] Reduce thread local usage (#17822)
This PR stops generating thread_local pointers for kernels that call
other kernels, as happens for example in the work_item loop. Instead
of storing the state struct pointer in a thread_local, the pointer is
passed directly to the called kernel function, which is duplicated with
an additional state struct pointer parameter if it does not already
have one.
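
As a rough illustration of the change described above, the sketch below
contrasts the two calling schemes in plain C++. It is not the code the
NativeCPU passes emit: `State`, `kernel_tls`, `kernel_with_state` and the
two loop functions are hypothetical names chosen for this example only.

```cpp
#include <cstddef>

// Hypothetical stand-in for the NativeCPU state struct.
struct State {
  std::size_t global_id;
};

// Before: the callee finds its state through a thread_local pointer.
thread_local State *tls_state = nullptr;

std::size_t kernel_tls() { return tls_state->global_id * 2; }

std::size_t wi_loop_tls(State *s, std::size_t n) {
  tls_state = s; // publish the state pointer for the callee
  std::size_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    s->global_id = i;
    sum += kernel_tls(); // every call goes through a thread_local load
  }
  return sum;
}

// After: the callee is duplicated with an extra state pointer parameter,
// so no thread_local access is needed.
std::size_t kernel_with_state(State *s) { return s->global_id * 2; }

std::size_t wi_loop_with_state(State *s, std::size_t n) {
  std::size_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    s->global_id = i;
    sum += kernel_with_state(s); // state is passed directly
  }
  return sum;
}
```

Both loops compute the same result; the second form simply keeps the state
pointer in an ordinary argument, which is typically easier for the
optimizer to reason about than a thread_local access.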
The state getter functions (native_cpu state and the corresponding mux
and nativecpu spirv functions) are now marked __attribute((pure)). This
enables more optimizations, including removal of unused calls to such
builtins, before the NativeCPU passes run.
Pointer parameters of the native_cpu getter functions now point to
constant data.
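
The effect of the pure attribute and the const-qualified pointer
parameters can be sketched as follows. This is a minimal example under
assumptions, not the actual builtin declarations: `State`, `get_global_id`
and `kernel_body` are hypothetical names, and `__attribute__((pure))` is
the GCC/Clang extension spelling.

```cpp
#include <cstddef>

// Hypothetical stand-in for the NativeCPU state struct.
struct State {
  std::size_t global_id[3];
};

// A pure function has no side effects; its result depends only on its
// arguments and on memory it reads. The pointer-to-const parameter
// documents that the getter never writes to the state.
__attribute__((pure)) std::size_t get_global_id(unsigned dim,
                                                const State *s) {
  return s->global_id[dim];
}

std::size_t kernel_body(const State *s) {
  // The result is never used for anything observable, so the compiler
  // is allowed to delete this call because the callee is pure.
  std::size_t unused = get_global_id(1, s);
  (void)unused;

  // Two identical calls with no intervening store may be merged into one.
  return get_global_id(0, s) + get_global_id(0, s);
}
```

Whether a given compiler actually removes or merges these calls depends on
the optimization level; the attribute only gives it permission to do so.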