I'm not sure how bad the compiler behavior is on non-amd64 targets (I don't have access to any), and this only affects targets other than amd64 and arm64 (which have dedicated assembly), but golang/go#29571 is (also) costing you a good amount of performance.
Unfortunately, having giant walls of math/bits calls is less readable than the wrapper type, so depending on how you want to balance "readability" vs "going fast", this might be ok.
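To make the trade-off concrete, here is a minimal sketch of the two styles. The `uint128` type and the `mul64`, `addMul64`, `sumWrapped`, and `sumInline` names are hypothetical and chosen for illustration; they are not necessarily what the package uses. Both functions compute `a*b + c*d` as a 128-bit value, once through a small wrapper struct and once with the `math/bits` calls written out directly.

```go
package main

import (
	"fmt"
	"math/bits"
)

// uint128 is a hypothetical wrapper type holding a 128-bit value
// as two 64-bit halves. It is an assumption made for this sketch.
type uint128 struct {
	lo, hi uint64
}

// mul64 returns a * b as a 128-bit value.
func mul64(a, b uint64) uint128 {
	hi, lo := bits.Mul64(a, b)
	return uint128{lo, hi}
}

// addMul64 returns v + a*b, discarding any carry out of the high half.
func addMul64(v uint128, a, b uint64) uint128 {
	hi, lo := bits.Mul64(a, b)
	lo, c := bits.Add64(lo, v.lo, 0)
	hi, _ = bits.Add64(hi, v.hi, c)
	return uint128{lo, hi}
}

// sumWrapped computes a*b + c*d using the wrapper type.
func sumWrapped(a, b, c, d uint64) (lo, hi uint64) {
	r := mul64(a, b)
	r = addMul64(r, c, d)
	return r.lo, r.hi
}

// sumInline computes the same value with the math/bits calls
// written out directly, avoiding the intermediate struct.
func sumInline(a, b, c, d uint64) (lo, hi uint64) {
	hi, lo = bits.Mul64(a, b)
	h1, l1 := bits.Mul64(c, d)
	var carry uint64
	lo, carry = bits.Add64(lo, l1, 0)
	hi, _ = bits.Add64(hi, h1, carry)
	return lo, hi
}

func main() {
	lo1, hi1 := sumWrapped(1<<63, 4, 3, 5)
	lo2, hi2 := sumInline(1<<63, 4, 3, 5)
	fmt.Printf("wrapped: hi=%d lo=%d\n", hi1, lo1)
	fmt.Printf("inline:  hi=%d lo=%d\n", hi2, lo2)
}
```

The two versions return the same result. The wrapper version reads better once there are many terms, while the inline version is the "wall of math/bits calls" style whose timings are reported here.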
name        old time/op  new time/op  delta
Add-4       24.1ns ± 0%  24.1ns ± 0%  +0.17%
Multiply-4   135ns ± 0%   127ns ± 0%  -6.13%
Mult32-4    26.1ns ± 0%  26.1ns ± 0%  +0.11%
Square-4     102ns ± 0%    96ns ± 0%  -5.84%
Invert-4    27.3µs ± 0%  25.8µs ± 0%  -5.71%

name                           old time/op  new time/op  delta
MultiScalarMultSize8-4         1.23ms ± 0%  1.17ms ± 0%  -4.26%
ScalarBaseMult-4                101µs ± 0%    96µs ± 0%  -4.45%
ScalarMult-4                    360µs ± 0%   342µs ± 0%  -4.90%
VarTimeDoubleScalarBaseMult-4   351µs ± 0%   332µs ± 0%  -5.46%
NB: I only did one iteration, on an amd64 target with Go 1.17beta1 + purego, so there's some noise in the comparison, but the roughly 4-6% improvement on the multiplication-heavy benchmarks is consistent and noticeable.