Open
Description
Both `Invert` and `Pow22523` repeatedly square in a loop. The overhead of repeatedly calling `Square` (and having to shuffle data in and out of registers) adds up to a decent chunk of execution time.

Doing something like `func (v *Element) pow2k(x *Element, k uint)`, with the precondition that `k >= 1`, dramatically improves the performance of both operations:
name        old time/op  new time/op  delta
Invert-4    11.3µs ± 0%  7.1µs ± 0%   -37.33%
Pow22523-4  11.1µs ± 0%  7.0µs ± 0%   -37.32%
Numbers taken with `purego`, but the amd64 assembly implementation will also benefit (and can be written without having to spill to the stack at all).
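
For illustration, here is a minimal sketch of the shape such a helper could take. It is not the library's code: to stay short it uses a hypothetical single-limb `Element` over the small Mersenne prime 2^61 - 1 instead of the real multi-limb representation over 2^255 - 19, and the `reduce` helper is an assumption made for this toy field. It only shows the structure (the working value stays in a local across all `k` squarings and is stored once at the end); it makes no attempt to reproduce the timings above.

```go
package main

import (
	"fmt"
	"math/bits"
)

// p is the Mersenne prime 2^61 - 1. It is only a small stand-in for the
// real field prime 2^255 - 19 (assumption made to keep the sketch short).
const p = 1<<61 - 1

// Element is a hypothetical single-limb field element, kept reduced below p.
type Element struct{ v uint64 }

// reduce maps the 128-bit value hi:lo (a product of two values below 2^61)
// into [0, p), using 2^61 = 1 (mod p) and therefore 2^64 = 8 (mod p).
func reduce(hi, lo uint64) uint64 {
	r := (lo & p) + (lo >> 61) + (hi << 3) // below 2^62, no overflow
	r = (r & p) + (r >> 61)                // at most 2^61
	if r >= p {
		r -= p
	}
	return r
}

// Square sets v = x^2 and returns v. Each call loads and stores the element.
func (v *Element) Square(x *Element) *Element {
	hi, lo := bits.Mul64(x.v, x.v)
	v.v = reduce(hi, lo)
	return v
}

// pow2k sets v = x^(2^k) and returns v. Precondition: k >= 1.
// The intermediate value lives in the local t for the whole loop, so
// there is one load before the loop and one store after it.
func (v *Element) pow2k(x *Element, k uint) *Element {
	t := x.v
	for ; k > 0; k-- {
		hi, lo := bits.Mul64(t, t)
		t = reduce(hi, lo)
	}
	v.v = t
	return v
}

func main() {
	x := Element{v: 123456789}

	// Reference: ten separate Square calls.
	want := x
	for i := 0; i < 10; i++ {
		want.Square(&want)
	}

	// Same computation as a single pow2k call.
	var got Element
	got.pow2k(&x, 10)

	fmt.Println(got == want)
}
```

With a helper like this, each run of consecutive squarings in the addition chains used by `Invert` and `Pow22523` would collapse into a single `pow2k` call.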
AlexanderYastrebov