This repository was archived by the owner on Apr 28, 2023. It is now read-only.
Add a prologue profile and uncheckedRun to benchmark checks
This allows properly timing and testing a layer written as multiple TCs.
In particular this will be used to implement the group normalization benchmark
in the following commit.
Group normalization currently performs better as two successive kernels.
The first kernel computes the moments and is tuned separately via
tc/benchmarks/moments.cc.
The best options object from that tuning is then injected into the prologue
functions so that group normalization can be compared properly.
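
To illustrate the idea of timing a layer split into a prologue kernel and a main kernel, here is a minimal, self-contained C++ sketch. It does not use the TC API: `timeWithPrologue`, `StageTimes`, `Moments`, `moments`, and `normalize` are hypothetical names introduced only for this example, and the kernels are plain CPU loops standing in for the two tuned kernels described above.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical result type: average wall-clock time per stage, in microseconds.
struct StageTimes {
  double prologueUs;
  double kernelUs;
};

// Hypothetical helper (not part of TC): runs `prologue` before every call of
// `kernel` and accumulates the time of each stage separately.
StageTimes timeWithPrologue(const std::function<void()>& prologue,
                            const std::function<void()>& kernel,
                            int iterations) {
  assert(iterations > 0);
  using Clock = std::chrono::steady_clock;
  std::chrono::duration<double, std::micro> p{0}, k{0};
  for (int i = 0; i < iterations; ++i) {
    auto t0 = Clock::now();
    prologue();
    auto t1 = Clock::now();
    kernel();
    auto t2 = Clock::now();
    p += t1 - t0;
    k += t2 - t1;
  }
  return {p.count() / iterations, k.count() / iterations};
}

struct Moments {
  float mean;
  float var;
};

// Stage 1 (the prologue): per-group mean and variance.
void moments(const std::vector<float>& x, std::size_t groupSize,
             std::vector<Moments>& out) {
  assert(groupSize > 0 && x.size() % groupSize == 0);
  out.assign(x.size() / groupSize, Moments{0.f, 0.f});
  const float n = static_cast<float>(groupSize);
  for (std::size_t g = 0; g < out.size(); ++g) {
    float sum = 0.f, sumSq = 0.f;
    for (std::size_t i = 0; i < groupSize; ++i) {
      const float v = x[g * groupSize + i];
      sum += v;
      sumSq += v * v;
    }
    const float mean = sum / n;
    out[g] = Moments{mean, sumSq / n - mean * mean};
  }
}

// Stage 2: normalize each element with the moments of its group.
void normalize(const std::vector<float>& x, const std::vector<Moments>& m,
               std::size_t groupSize, float eps, std::vector<float>& y) {
  assert(groupSize > 0 && m.size() * groupSize == x.size());
  y.resize(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const Moments& g = m[i / groupSize];
    y[i] = (x[i] - g.mean) / std::sqrt(g.var + eps);
  }
}
```

A caller would pass the moments computation as `prologue` and the normalization as `kernel`, then read both averages from the returned `StageTimes` to judge the cost of the layer as a whole.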