v2.2.3 - SplitU and WorkGroupMapping

@guacamoleo guacamoleo released this 30 Mar 17:18
· 4964 commits to develop_deprecated since this release

SplitU
If you have a large summation but a small C tensor, you can create extra parallelism by splitting up the summation. This allows smaller C tensors to fill up larger GPUs.
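
The idea can be illustrated with a minimal pure-Python sketch. This is not Tensile's kernel code: the function name `matmul_split_u`, the strided way the summation index is divided, and the final reduction step are all assumptions made for illustration. Each of the `split_u` partial results is independent of the others, which is where the extra parallelism would come from.

```python
def matmul_split_u(a, b, split_u):
    """Compute C = A * B with the summation split into `split_u` parts.

    Hypothetical illustration: `a` is M x K, `b` is K x N (lists of lists).
    """
    m, k, n = len(a), len(b), len(b[0])

    # Each split handles every split_u-th summation index (assumed layout).
    # The splits do not depend on each other, so they could run in parallel.
    partials = []
    for s in range(split_u):
        part = [
            [sum(a[i][kk] * b[kk][j] for kk in range(s, k, split_u))
             for j in range(n)]
            for i in range(m)
        ]
        partials.append(part)

    # Reduce the partial results into the final C tensor.
    return [
        [sum(part[i][j] for part in partials) for j in range(n)]
        for i in range(m)
    ]
```

With `split_u = 1` this reduces to an ordinary matrix multiply; larger values give more independent pieces of work for the same small C, at the cost of the extra reduction.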

WorkGroupMapping
Changes which work-groups operate on which tiles of tensor C. This can help performance by improving cache reuse among work-groups that run at the same time.
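
As a rough sketch of why the mapping matters, the following Python maps a linear work-group id to a tile of C in bands of a chosen width. The function names, the parameter `wgm`, and the banding formula are hypothetical and are not taken from Tensile; the sketch also assumes that work-groups with consecutive ids run concurrently and that each tile of C reads one panel of A and one panel of B.

```python
def map_work_group(wg_id, tiles0, tiles1, wgm):
    """Map a linear work-group id to a (tile0, tile1) coordinate of C.

    Hypothetical scheme: tiles are visited in bands that are `wgm` tiles
    wide along dimension 0, walking the full extent of dimension 1.
    """
    full_band = wgm * tiles1            # tiles in a full-width band
    band = wg_id // full_band           # which band this id falls in
    first0 = band * wgm                 # first tile0 index of the band
    width = min(wgm, tiles0 - first0)   # the last band may be narrower
    local = wg_id - band * full_band    # position inside the band
    return first0 + local % width, local // width


def panels_touched(num_concurrent, tiles0, tiles1, wgm):
    """Count distinct A and B panels read by the first `num_concurrent` ids."""
    coords = [map_work_group(i, tiles0, tiles1, wgm)
              for i in range(num_concurrent)]
    return len({t0 for t0, _ in coords}) + len({t1 for _, t1 in coords})
```

Under these assumptions, a more compact footprint for concurrently running work-groups means fewer distinct panels of A and B compete for cache. Comparing `panels_touched(8, 8, 8, 8)` with `panels_touched(8, 8, 8, 2)` shows the effect on an 8 x 8 grid of tiles.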