Summary:
Pull Request resolved: pytorch#4416
X-link: facebookresearch/FBGEMM#1488
Hoist memory loads out of the outer loop.
The intention is to prevent these loads from evicting cache lines that may contain matrix data.
Similarly, the loads are likely to incur cache misses after the first iteration, because executing the inner loop will probably fill the cache with matrix data.
Benchmarks repeatedly show a throughput improvement of around 1%.
before:
P1854747253
after:
P1854747141
Reviewed By: YifanYuan3
Differential Revision: D77459967
fbshipit-source-id: 01eb4fc004ba055823551d843f7bf7728caa74a8