DPC++ daily 2022-08-16

Pre-release
bb-sycl released this 16 Aug 16:21
· 129035 commits to sycl since this release
3323da6
[SYCL] Improve range reduction performance on CPU (#6164)

The performance improvement is the result of two complementary changes:

Using an alternative heuristic to select work-group size on the CPU.
Keeping work-groups small simplifies combination of partial results
and reduces the number of temporary variables.

Adjusting the mapping of the range to an ND-range.
Breaking the range into contiguous chunks, one assigned to each
work-group, results in streaming patterns that are better suited to
prefetching hardware.
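
The chunked mapping described above can be sketched in plain C++. This is
an illustration only, not the DPC++ runtime code: the helper names
chunked_indices and chunked_sum are hypothetical, and the sketch assumes
that each work-group owns one contiguous chunk, that work-items stride
through their group's chunk, and that the last group takes any remainder.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: the indices one work-item visits when a 1-D range of
// n elements is split into one contiguous chunk per work-group.
std::vector<std::size_t> chunked_indices(std::size_t n, std::size_t num_groups,
                                         std::size_t group_size,
                                         std::size_t group_id,
                                         std::size_t local_id) {
  assert(num_groups > 0 && group_size > 0);
  const std::size_t per_group = n / num_groups;
  const std::size_t begin = group_id * per_group;
  // Assumption: the last group also covers the remainder of the range.
  const std::size_t end = (group_id + 1 == num_groups) ? n : begin + per_group;
  std::vector<std::size_t> indices;
  // Work-items in a group interleave, so the group as a whole streams
  // through its chunk from begin to end.
  for (std::size_t i = begin + local_id; i < end; i += group_size)
    indices.push_back(i);
  return indices;
}

// Hypothetical helper: reduce data to a sum using one partial result per
// work-group, then combine the partial results.
long chunked_sum(const std::vector<int>& data, std::size_t num_groups,
                 std::size_t group_size) {
  std::vector<long> partial(num_groups, 0);
  for (std::size_t g = 0; g < num_groups; ++g)
    for (std::size_t l = 0; l < group_size; ++l)
      for (std::size_t i :
           chunked_indices(data.size(), num_groups, group_size, g, l))
        partial[g] += data[i];
  return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

With 10 elements, 2 groups, and 2 work-items per group, group 0 touches
only indices 0 to 4 and group 1 only indices 5 to 9, so each group reads a
single contiguous stretch of memory.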

Signed-off-by: John Pennycook <john.pennycook@intel.com>