[SYCLCompat] Optimize/(fix?) permute_sub_group_by_xor if logical_sub_group_size == 32 (#16646)
`syclcompat::permute_sub_group_by_xor` was reported to fail
intermittently on Level Zero (L0). Closer inspection revealed that the
implementation of `permute_sub_group_by_xor` is incorrect for cases
where `logical_sub_group_size != 32`, which is one of the test cases.
This implies that the test itself is wrong.
In this PR we first optimize the one part of the implementation that is
valid, assuming the Intel SPIR-V builtins are correct: the case
`logical_sub_group_size == 32`, which is also the only case a user will
realistically program. The aims are to:
- Ensure the only useful case is working via the correct optimized
route.
- Check that this improvement doesn't break the suspicious test.
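For readers unfamiliar with the operation, the semantics of an XOR permute in the `logical_sub_group_size == 32` case can be modelled on the host. This is only a sketch of the expected behaviour, not the SYCLcompat implementation: `permute_by_xor_model`, `Lanes` and `kSubGroupSize` are hypothetical names introduced here, and the sub-group is assumed to have exactly 32 lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical host-side model of a 32-lane sub-group (assumption: the
// physical and logical sub-group sizes are both 32). lanes[i] is the value
// held by work-item i. None of these names come from SYCLcompat.
constexpr std::size_t kSubGroupSize = 32;
using Lanes = std::array<int, kSubGroupSize>;

// Models an XOR permute over the whole sub-group: lane i receives the
// value held by lane (i ^ mask).
inline Lanes permute_by_xor_model(const Lanes &in, unsigned mask) {
  // For i < 32 and mask < 32, (i ^ mask) is also < 32, so every lane
  // reads from a valid partner.
  assert(mask < kSubGroupSize);
  Lanes out{};
  for (std::size_t i = 0; i < kSubGroupSize; ++i)
    out[i] = in[i ^ mask];
  return out;
}
```

With `mask == 1`, neighbouring lanes swap values (lane 0 reads lane 1 and vice versa), and applying the same mask twice returns the original values, since `(i ^ mask) ^ mask == i`.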
A follow-up PR can fix the other cases, where `logical_sub_group_size !=
32`. It is better to do this later, since:
- The only use case I know of for this is to implement non-uniform group
algorithms that we already have implemented (e.g. see
#9671), and any user is advised to use
such algorithms instead of reimplementing them themselves.
- I think this would require a complete reworking of the test, which
would otherwise delay the more important change here.
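To illustrate the use case mentioned in the first point, the following host-side sketch shows how XOR permutes with a logical sub-group size smaller than 32 are typically combined into a per-group sum (a butterfly reduction). It is an assumption-laden model, not code from this PR: `butterfly_sum_model`, `LaneValues` and `kLanes` are hypothetical names, the physical sub-group is assumed to have 32 lanes, and the logical size is assumed to be a power of two.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical host-side model: a 32-lane physical sub-group split into
// aligned logical groups of `logical_size` lanes (assumed power of two).
constexpr std::size_t kLanes = 32;
using LaneValues = std::array<int, kLanes>;

// Butterfly reduction: after log2(logical_size) XOR-permute steps, every
// lane holds the sum of the values in its own logical group.
inline LaneValues butterfly_sum_model(LaneValues v, std::size_t logical_size) {
  assert(logical_size > 0 && logical_size <= kLanes);
  assert((logical_size & (logical_size - 1)) == 0); // power of two
  for (std::size_t mask = logical_size / 2; mask > 0; mask /= 2) {
    LaneValues next{};
    // mask < logical_size and groups are aligned, so (i ^ mask) never
    // leaves the logical group that lane i belongs to.
    for (std::size_t i = 0; i < kLanes; ++i)
      next[i] = v[i] + v[i ^ mask];
    v = next;
  }
  return v;
}
```

Because a library group algorithm already provides this kind of result, hand-rolling it on top of `permute_sub_group_by_xor` is rarely necessary, which is the argument made above for deferring the `logical_sub_group_size != 32` fix.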
---------
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>