This repository was archived by the owner on Apr 28, 2023. It is now read-only.
promotionImprovesCoalescing: use partial schedule instead of full
The check whether promotion to shared memory improves coalescing is
performed by looking at the schedule dimension that is mapped to CUDA
thread x. The existing implementation relies on a so-called "full
schedule" that contains all schedule dimensions. In practice, the
partial schedule up to the dimension mapped to thread x is sufficient.
Compute this partial schedule inside of promotionImprovesCoalescing
instead of precomputing the "full schedule" externally.
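
As a rough illustration of the idea only, and not the code from this commit, the sketch below models a schedule as a flat list of dimensions and extracts the prefix that ends at the dimension mapped to thread x. The `ScheduleDim` struct and the `partialScheduleUpToThreadX` helper are hypothetical names introduced for this example; the actual implementation operates on a schedule tree rather than on a plain vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one schedule dimension (not a type from the
// repository): a name plus a flag telling whether it is mapped to thread x.
struct ScheduleDim {
  std::string name;
  bool mappedToThreadX;
};

// Return the prefix of `dims` up to and including the first dimension
// mapped to thread x. Return an empty vector if no dimension is mapped.
std::vector<ScheduleDim> partialScheduleUpToThreadX(
    const std::vector<ScheduleDim>& dims) {
  for (std::size_t i = 0; i < dims.size(); ++i) {
    if (dims[i].mappedToThreadX) {
      std::vector<ScheduleDim> prefix(
          dims.begin(),
          dims.begin() + static_cast<std::ptrdiff_t>(i) + 1);
      // The last dimension of the prefix is the one a coalescing check
      // would inspect; dimensions after it are never needed.
      assert(prefix.back().mappedToThreadX);
      return prefix;
    }
  }
  return {};
}
```

In this toy model, a caller only needs to hand over the list of dimensions; the dimensions that follow the thread x dimension are never materialized, which mirrors the motivation for dropping the externally precomputed "full schedule".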