Skip to content

Commit 09b1d46

Browse files
Merge #979: Native jacobi symbol algorithm
ce3cfc7 doc: Describe Jacobi calculation in safegcd_implementation.md (Elliott Jin) 6be0103 Add secp256k1_fe_is_square_var function (Pieter Wuille) 1de2a01 Native jacobi symbol algorithm (Pieter Wuille) 04c6c1b Make secp256k1_modinv64_det_check_pow2 support abs val (Pieter Wuille) 5fffb2c Make secp256k1_i128_check_pow2 support -(2^n) (Pieter Wuille) Pull request description: This introduces variants of the vartime divsteps-based GCD algorithm used for modular inverses to compute Jacobi symbols. Changes compared to the normal vartime divsteps: * Only positive matrices are used, guaranteeing that f and g remain positive. * An additional jac variable is updated to track sign changes during matrix computation. * There is (so far) no proof that this algorithm terminates within reasonable amount of time for every input, but experimentally it appears to almost always need less than 900 iterations. To account for that, only a bounded number of iterations is performed (1500), after which failure is returned. The field logic then falls back to using square roots to determining the result. * The algorithm converges to f=g=gcd(f0,g0) rather than g=0. To keep this test simple, the end condition is f=1, which won't be reached if started with g=0. That case is dealt with specially. This code is currently unused, except for tests. I don't aim for it to be merged until there is a need for it, but this demonstrates its feasibility. In terms of performance: ``` field_inverse: min 1.76us / avg 1.76us / max 1.78us field_inverse_var: min 0.991us / avg 0.993us / max 0.996us field_jacobi_var: min 1.31us / avg 1.31us / max 1.31us field_sqrt: min 4.36us / avg 4.37us / max 4.40us ``` while with the older (f24e122) libgmp based Jacobi code on the same system: ``` num_jacobi: min 1.53us / avg 1.54us / max 1.55us ``` ACKs for top commit: jonasnick: ACK ce3cfc7 real-or-random: reACK ce3cfc7 diff and writeup is good and I tested every commit Tree-SHA512: 8a6204a7a108d8802d942a54faca39917f90ea5923130683bbd870f9025f4ec8ef256ffa1d939a793f0b32d4cdfcdcd1d3f8ae5ed74a0193be7ad98362ce027e
2 parents cbd2555 + ce3cfc7 commit 09b1d46

13 files changed

+553
-40
lines changed

doc/safegcd_implementation.md

Lines changed: 50 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
# The safegcd implementation in libsecp256k1 explained
22

3-
This document explains the modular inverse implementation in the `src/modinv*.h` files. It is based
4-
on the paper
3+
This document explains the modular inverse and Jacobi symbol implementations in the `src/modinv*.h` files.
4+
It is based on the paper
55
["Fast constant-time gcd computation and modular inversion"](https://gcd.cr.yp.to/papers.html#safegcd)
66
by Daniel J. Bernstein and Bo-Yin Yang. The references below are for the Date: 2019.04.13 version.
77

@@ -769,3 +769,51 @@ def modinv_var(M, Mi, x):
769769
d, e = update_de(d, e, t, M, Mi)
770770
return normalize(f, d, Mi)
771771
```
772+
773+
## 8. From GCDs to Jacobi symbol
774+
775+
We can also use a similar approach to calculate Jacobi symbol *(x | M)* by keeping track of an
776+
extra variable *j*, for which at every step *(x | M) = j (g | f)*. As we update *f* and *g*, we
777+
make corresponding updates to *j* using
778+
[properties of the Jacobi symbol](https://en.wikipedia.org/wiki/Jacobi_symbol#Properties):
779+
* *((g/2) | f)* is either *(g | f)* or *-(g | f)*, depending on the value of *f mod 8* (negating if it's *3* or *5*).
780+
* *(f | g)* is either *(g | f)* or *-(g | f)*, depending on *f mod 4* and *g mod 4* (negating if both are *3*).
781+
782+
These updates depend only on the values of *f* and *g* modulo *4* or *8*, and can thus be applied
783+
very quickly, as long as we keep track of a few additional bits of *f* and *g*. Overall, this
784+
calculation is slightly simpler than the one for the modular inverse because we no longer need to
785+
keep track of *d* and *e*.
786+
787+
However, one difficulty of this approach is that the Jacobi symbol *(a | n)* is only defined for
788+
positive odd integers *n*, whereas in the original safegcd algorithm, *f, g* can take negative
789+
values. We resolve this by using the following modified steps:
790+
791+
```python
792+
# Before
793+
if delta > 0 and g & 1:
794+
delta, f, g = 1 - delta, g, (g - f) // 2
795+
796+
# After
797+
if delta > 0 and g & 1:
798+
delta, f, g = 1 - delta, g, (g + f) // 2
799+
```
800+
801+
The algorithm is still correct, since the changed divstep, called a "posdivstep" (see section 8.4
802+
and E.5 in the paper) preserves *gcd(f, g)*. However, there's no proof that the modified algorithm
803+
will converge. The justification for posdivsteps is completely empirical: in practice, it appears
804+
that the vast majority of nonzero inputs converge to *f=g=gcd(f<sub>0</sub>, g<sub>0</sub>)* in a
805+
number of steps proportional to their logarithm.
806+
807+
Note that:
808+
- We require inputs to satisfy *gcd(x, M) = 1*, as otherwise *f=1* is not reached.
809+
- We require inputs *x &neq; 0*, because applying posdivstep with *g=0* has no effect.
810+
- We need to update the termination condition from *g=0* to *f=1*.
811+
812+
We account for the possibility of nonconvergence by only performing a bounded number of
813+
posdivsteps, and then falling back to square-root based Jacobi calculation if a solution has not
814+
yet been found.
815+
816+
The optimizations in sections 3-7 above are described in the context of the original divsteps, but
817+
in the C implementation we also adapt most of them (not including "avoiding modulus operations",
818+
since it's not necessary to track *d, e*, and "constant-time operation", since we never calculate
819+
Jacobi symbols for secret data) to the posdivsteps version.

src/bench_internal.c

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -219,6 +219,19 @@ static void bench_field_sqrt(void* arg, int iters) {
219219
CHECK(j <= iters);
220220
}
221221

222+
static void bench_field_is_square_var(void* arg, int iters) {
223+
int i, j = 0;
224+
bench_inv *data = (bench_inv*)arg;
225+
secp256k1_fe t = data->fe[0];
226+
227+
for (i = 0; i < iters; i++) {
228+
j += secp256k1_fe_is_square_var(&t);
229+
secp256k1_fe_add(&t, &data->fe[1]);
230+
secp256k1_fe_normalize_var(&t);
231+
}
232+
CHECK(j <= iters);
233+
}
234+
222235
static void bench_group_double_var(void* arg, int iters) {
223236
int i;
224237
bench_inv *data = (bench_inv*)arg;
@@ -371,6 +384,7 @@ int main(int argc, char **argv) {
371384
if (d || have_flag(argc, argv, "field") || have_flag(argc, argv, "mul")) run_benchmark("field_mul", bench_field_mul, bench_setup, NULL, &data, 10, iters*10);
372385
if (d || have_flag(argc, argv, "field") || have_flag(argc, argv, "inverse")) run_benchmark("field_inverse", bench_field_inverse, bench_setup, NULL, &data, 10, iters);
373386
if (d || have_flag(argc, argv, "field") || have_flag(argc, argv, "inverse")) run_benchmark("field_inverse_var", bench_field_inverse_var, bench_setup, NULL, &data, 10, iters);
387+
if (d || have_flag(argc, argv, "field") || have_flag(argc, argv, "issquare")) run_benchmark("field_is_square_var", bench_field_is_square_var, bench_setup, NULL, &data, 10, iters);
374388
if (d || have_flag(argc, argv, "field") || have_flag(argc, argv, "sqrt")) run_benchmark("field_sqrt", bench_field_sqrt, bench_setup, NULL, &data, 10, iters);
375389

376390
if (d || have_flag(argc, argv, "group") || have_flag(argc, argv, "double")) run_benchmark("group_double_var", bench_group_double_var, bench_setup, NULL, &data, 10, iters*10);

src/field.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -135,4 +135,7 @@ static void secp256k1_fe_half(secp256k1_fe *r);
135135
* magnitude set to 'm' and is normalized if (and only if) 'm' is zero. */
136136
static void secp256k1_fe_get_bounds(secp256k1_fe *r, int m);
137137

138+
/** Determine whether a is a square (modulo p). */
139+
static int secp256k1_fe_is_square_var(const secp256k1_fe *a);
140+
138141
#endif /* SECP256K1_FIELD_H */

src/field_10x26_impl.h

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1365,4 +1365,31 @@ static void secp256k1_fe_inv_var(secp256k1_fe *r, const secp256k1_fe *x) {
13651365
VERIFY_CHECK(secp256k1_fe_normalizes_to_zero(r) == secp256k1_fe_normalizes_to_zero(&tmp));
13661366
}
13671367

1368+
static int secp256k1_fe_is_square_var(const secp256k1_fe *x) {
1369+
secp256k1_fe tmp;
1370+
secp256k1_modinv32_signed30 s;
1371+
int jac, ret;
1372+
1373+
tmp = *x;
1374+
secp256k1_fe_normalize_var(&tmp);
1375+
/* secp256k1_jacobi32_maybe_var cannot deal with input 0. */
1376+
if (secp256k1_fe_is_zero(&tmp)) return 1;
1377+
secp256k1_fe_to_signed30(&s, &tmp);
1378+
jac = secp256k1_jacobi32_maybe_var(&s, &secp256k1_const_modinfo_fe);
1379+
if (jac == 0) {
1380+
/* secp256k1_jacobi32_maybe_var failed to compute the Jacobi symbol. Fall back
1381+
* to computing a square root. This should be extremely rare with random
1382+
* input (except in VERIFY mode, where a lower iteration count is used). */
1383+
secp256k1_fe dummy;
1384+
ret = secp256k1_fe_sqrt(&dummy, &tmp);
1385+
} else {
1386+
#ifdef VERIFY
1387+
secp256k1_fe dummy;
1388+
VERIFY_CHECK(jac == 2*secp256k1_fe_sqrt(&dummy, &tmp) - 1);
1389+
#endif
1390+
ret = jac >= 0;
1391+
}
1392+
return ret;
1393+
}
1394+
13681395
#endif /* SECP256K1_FIELD_REPR_IMPL_H */

src/field_5x52_impl.h

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -664,4 +664,31 @@ static void secp256k1_fe_inv_var(secp256k1_fe *r, const secp256k1_fe *x) {
664664
#endif
665665
}
666666

667+
static int secp256k1_fe_is_square_var(const secp256k1_fe *x) {
668+
secp256k1_fe tmp;
669+
secp256k1_modinv64_signed62 s;
670+
int jac, ret;
671+
672+
tmp = *x;
673+
secp256k1_fe_normalize_var(&tmp);
674+
/* secp256k1_jacobi64_maybe_var cannot deal with input 0. */
675+
if (secp256k1_fe_is_zero(&tmp)) return 1;
676+
secp256k1_fe_to_signed62(&s, &tmp);
677+
jac = secp256k1_jacobi64_maybe_var(&s, &secp256k1_const_modinfo_fe);
678+
if (jac == 0) {
679+
/* secp256k1_jacobi64_maybe_var failed to compute the Jacobi symbol. Fall back
680+
* to computing a square root. This should be extremely rare with random
681+
* input (except in VERIFY mode, where a lower iteration count is used). */
682+
secp256k1_fe dummy;
683+
ret = secp256k1_fe_sqrt(&dummy, &tmp);
684+
} else {
685+
#ifdef VERIFY
686+
secp256k1_fe dummy;
687+
VERIFY_CHECK(jac == 2*secp256k1_fe_sqrt(&dummy, &tmp) - 1);
688+
#endif
689+
ret = jac >= 0;
690+
}
691+
return ret;
692+
}
693+
667694
#endif /* SECP256K1_FIELD_REPR_IMPL_H */

src/int128.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -80,10 +80,10 @@ static SECP256K1_INLINE void secp256k1_i128_from_i64(secp256k1_int128 *r, int64_
8080
/* Compare two 128-bit values for equality. */
8181
static SECP256K1_INLINE int secp256k1_i128_eq_var(const secp256k1_int128 *a, const secp256k1_int128 *b);
8282

83-
/* Tests if r is equal to 2^n.
83+
/* Tests if r is equal to sign*2^n (sign must be 1 or -1).
8484
* n must be strictly less than 127.
8585
*/
86-
static SECP256K1_INLINE int secp256k1_i128_check_pow2(const secp256k1_int128 *r, unsigned int n);
86+
static SECP256K1_INLINE int secp256k1_i128_check_pow2(const secp256k1_int128 *r, unsigned int n, int sign);
8787

8888
#endif
8989

src/int128_native_impl.h

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -84,9 +84,10 @@ static SECP256K1_INLINE int secp256k1_i128_eq_var(const secp256k1_int128 *a, con
8484
return *a == *b;
8585
}
8686

87-
static SECP256K1_INLINE int secp256k1_i128_check_pow2(const secp256k1_int128 *r, unsigned int n) {
87+
static SECP256K1_INLINE int secp256k1_i128_check_pow2(const secp256k1_int128 *r, unsigned int n, int sign) {
8888
VERIFY_CHECK(n < 127);
89-
return (*r == (int128_t)1 << n);
89+
VERIFY_CHECK(sign == 1 || sign == -1);
90+
return (*r == (int128_t)((uint128_t)sign << n));
9091
}
9192

9293
#endif

src/int128_struct_impl.h

Lines changed: 5 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -189,10 +189,11 @@ static SECP256K1_INLINE int secp256k1_i128_eq_var(const secp256k1_int128 *a, con
189189
return a->hi == b->hi && a->lo == b->lo;
190190
}
191191

192-
static SECP256K1_INLINE int secp256k1_i128_check_pow2(const secp256k1_int128 *r, unsigned int n) {
193-
VERIFY_CHECK(n < 127);
194-
return n >= 64 ? r->hi == (uint64_t)1 << (n - 64) && r->lo == 0
195-
: r->hi == 0 && r->lo == (uint64_t)1 << n;
192+
static SECP256K1_INLINE int secp256k1_i128_check_pow2(const secp256k1_int128 *r, unsigned int n, int sign) {
193+
VERIFY_CHECK(n < 127);
194+
VERIFY_CHECK(sign == 1 || sign == -1);
195+
return n >= 64 ? r->hi == (uint64_t)sign << (n - 64) && r->lo == 0
196+
: r->hi == (uint64_t)((sign - 1) >> 1) && r->lo == (uint64_t)sign << n;
196197
}
197198

198199
#endif

src/modinv32.h

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,4 +35,9 @@ static void secp256k1_modinv32_var(secp256k1_modinv32_signed30 *x, const secp256
3535
/* Same as secp256k1_modinv32_var, but constant time in x (not in the modulus). */
3636
static void secp256k1_modinv32(secp256k1_modinv32_signed30 *x, const secp256k1_modinv32_modinfo *modinfo);
3737

38+
/* Compute the Jacobi symbol for (x | modinfo->modulus). x must be coprime with modulus (and thus
39+
* cannot be 0, as modulus >= 3). All limbs of x must be non-negative. Returns 0 if the result
40+
* cannot be computed. */
41+
static int secp256k1_jacobi32_maybe_var(const secp256k1_modinv32_signed30 *x, const secp256k1_modinv32_modinfo *modinfo);
42+
3843
#endif /* SECP256K1_MODINV32_H */

0 commit comments

Comments
 (0)