-
Notifications
You must be signed in to change notification settings - Fork 0
Generate linear and quadratic constraints from padded sumcheck proof #39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/sumcheck/constraints.rs
Outdated
| ); | ||
|
|
||
| // Transcripts should have received the same sequence of writes. | ||
| assert!(proof_transcript.compare_state(&constraint_transcript)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion is failing, so I must be missing a write to the transcript somewhere in the constraint generator. Or perhaps I'm wrong to expect the two transcripts to be in the same state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see why this assertion fails: on the sumcheck prover side, we write all inputs (public and private) to the transcript after doing the 3.1.3 transcript stuff. If I remove that write from Proof::new, then the transcript states sync up as expected.
So this means we have to resolve the inconsistency between sumcheck/prover.h and zk/zk_prover.h so that the proof test vector won't incorporate private inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I refactored the sumcheck prover so we can toggle whether it runs in zk_test.cc compat mode, enabling both tests to pass.
| for (layer_index, (circuit_layer, proof_layer)) in | ||
| circuit.layers.iter().zip(proof.layers.iter()).enumerate() | ||
| { | ||
| // Choose alpha and beta for this layer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bunch of this is exactly the same as in Proof::new, and could be factored out into something akin go longfellow-zk's begin_layer and end_layer methods. But then again it's nice to surface everything that's going on, and enable readers to compare it to the specification.
d6c6f7f to
6defa7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/sumcheck/constraints.rs
Outdated
| for term in &mut constraints.linear_constraint_lhs { | ||
| if term.constraint_number == layer_index { | ||
| term.constant_factor *= | ||
| FE::lagrange_basis_polynomial_i(FE::ONE, challenge[0]); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think mutating constraints after adding their constituent parts to the data structure makes this harder to follow and line up with the spec. What if we accumulated all the coefficients we need in a temporary data structure, and then only added the constraint after we had looped through all sumcheck rounds, so that we have computed all the known parts?
The per-layer constraint is claim = Q * VL * VR, where claim depends on all the polynomial evaluation pad values for this layer, plus the VL and VR pad values from the last layer, if applicable. Likewise, VL and VR depend on the VL and VR pad values from this layer, since they are the sum of the respective fields of the proof and pad. So, we could build up a representation of the symbolic claim expression as a dense list of 4*logw coefficients for polynomial evaluation pad values, plus 2 coefficients for VL/VR pad values, plus a constant term, to track the claim through the two nested loops. Then, we could combine that, along with a few more expressions for the VL and VR of the current layer, and add the whole layer constraint in one go. Note that we'd only need to compute and use the witness index values at the end as well, when adding to constraints. Does this sound like a useful refactoring?
src/sumcheck/symbolic.rs
Outdated
| #[derive(Debug, Clone, PartialEq, Eq)] | ||
| pub struct SymbolicExpression<FieldElement> { | ||
| constraint_number: usize, | ||
| terms: Vec<SymbolicTerm<FieldElement>>, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, a potential footgun: the derived equality comparisons would say that two SymbolicExpression values are different, even if the mathematical expressions they represent are equal, if they distribute the constant term differently between the known fields of different SymbolicTerms. I assume we're only using the PartialEq implementation in tests, and they're passing, so this may not matter in practice. (Putting my optimization hat on, having only one FieldElement for the constant term in an expression would save memory)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we could say the same about how PartialEq is sensitive to the ordering of symbolic terms too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I only derived PartialEq, Eq here out of reflex, because I generally find that I want all four of these traits on every struct I define. But nothing breaks if we remove PartialEq, Eq: the tests check the result of the lhs_terms() and known() methods. So I removed the derived traits to avoid the semantic ambiguity you pointed out.
Putting my optimization hat on, having only one FieldElement for the constant term in an expression would save memory
Good call. I was able to achieve this by refactoring a bit more. SymbolicExpression now sums known parts into a single field element, and represents the symbolic parts using a new Symbolic struct, private to the symbolic module, so there's no change to the users.
src/sumcheck/witness.rs
Outdated
| /// - polynomials at each layer (2 * 2 * logw elements per circuit layer) | ||
| /// - vl, vr and vl * vr for each layer of the circuit (three elements per circuit layer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarification:
| /// - polynomials at each layer (2 * 2 * logw elements per circuit layer) | |
| /// - vl, vr and vl * vr for each layer of the circuit (three elements per circuit layer) | |
| /// - one-time-pad for polynomials at each layer (2 * 2 * logw elements per circuit layer) | |
| /// - one-time-pad for vl, vr and vl * vr for each layer of the circuit (three elements per circuit layer) |
src/sumcheck/constraints.rs
Outdated
| final_claim += SymbolicTerm::from_known( | ||
| claims[0] + gamma * claims[1] | ||
| - public_inputs | ||
| .iter() | ||
| .zip(eq2.iter()) | ||
| .fold(FE::ZERO, |sum, (public_input_i, eq2_i)| { | ||
| sum + *public_input_i * eq2_i | ||
| }), | ||
| ); | ||
|
|
||
| constraints | ||
| .linear_constraint_lhs | ||
| .extend(final_claim.lhs_terms()); | ||
| constraints.linear_constraint_rhs.push(final_claim.known()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could be misleading to put this field element in the final_claim known part, then put it in the constraint RHS. While the end result is correct, the semantics of final_claim just before we destructure it don't match up with what the protocol is actually doing. The SymbolicExpression represents some final_claim.known() before adding it to the RHS, or we could just track the RHS separately, giving it its own name.
| final_claim += SymbolicTerm::from_known( | |
| claims[0] + gamma * claims[1] | |
| - public_inputs | |
| .iter() | |
| .zip(eq2.iter()) | |
| .fold(FE::ZERO, |sum, (public_input_i, eq2_i)| { | |
| sum + *public_input_i * eq2_i | |
| }), | |
| ); | |
| constraints | |
| .linear_constraint_lhs | |
| .extend(final_claim.lhs_terms()); | |
| constraints.linear_constraint_rhs.push(final_claim.known()); | |
| let rhs = ( | |
| claims[0] + gamma * claims[1] | |
| - public_inputs | |
| .iter() | |
| .zip(eq2.iter()) | |
| .fold(FE::ZERO, |sum, (public_input_i, eq2_i)| { | |
| sum + *public_input_i * eq2_i | |
| }), | |
| ); | |
| constraints | |
| .linear_constraint_lhs | |
| .extend(final_claim.lhs_terms()); | |
| constraints.linear_constraint_rhs.push(rhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is because here we're computing the final claim:
SUM_{i} (eq2[i + npub] * sym_private_inputs[i])
- sym_layer_pad.vl
- gamma * sym_layer_pad.vr
=
- SUM_{i} (eq2[i] * public_inputs[i])
+ claims[0]
+ gamma * claims[1]
...and the spec has already done the work for us of rearranging everything into an LHS of purely symbolic parts and an RHS of purely known parts. I think it'd fit more neatly into the SymbolicExpression abstraction if it were expressed as
SUM_{i} (eq2[i + npub] * sym_private_inputs[i]) - SUM_{i} (eq2[i] * public_inputs[i])
+ claims[0] - sym_layer_pad.vl
+ gamma * claims[1] - gamma * sym_layer_pad.vr
= 0
...but then it'd be harder to line up our implementation with the specification, and I think that has a lot of value. It's possible that we could change the spec's descriptions of the per-layer and final constraints so that there's consistency across how they're specified and how we implement them. For now I'll go with your suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SymbolicExpression refactoring is great, well done. Feels way more maintainable, too.
The new outstanding issues look fine.
I think this is is good to go. Thanks for letting us tear into it so thoroughly, I'm pleased with the result!
f2c1da5 to
4be42ea
Compare
Implements generation of linear and quadratic constraints from a padded sumcheck proof, specified in 6.6 [1] and 4.4 [2]. - `fields`: for efficient computation of Lagrange basis polynomials, we add trait `LagrangePolynomialFieldElement`, which defines methods for the basis polynomial denomimators and methods for evaluating basis polynomials 0, 1, and 2. - `sumcheck/witness`: computes the layout of the witness vector, needed for symbolic manipulation of quantities in constraints and eventually for construction of Ligero commitment and proof. - `sumcheck/symbolic`: allows computing symbolic expressions consisting of terms of the form `a + bx`, where `a` is a known quantity, `x` is some unknown element of the witness vector and `b` is a constant scale factor. - `sumcheck/bind`: implements the `bindeq` function of 6.2 [3]. - `sumcheck/mod`: sumcheck prover now emits the witness vector and Fiat-Shamir transcript so they may be used to validate constraints. - `sumcheck/constraints`: implements the constraint generation of 6.6. In order to valiate the correctness of all this, we also modify `longfellow-zk` to emit a Ligero commitment and constraints when running `zk_test.cc` [4]. More work on test vector generation from C++ is needed. [1]: https://datatracker.ietf.org/doc/html/draft-google-cfrg-libzk-01#section-4.4 [2]: https://datatracker.ietf.org/doc/html/draft-google-cfrg-libzk-01#section-6.6 [3]: https://datatracker.ietf.org/doc/html/draft-google-cfrg-libzk-01#section-6.2 [4]: https://github.com/tgeoghegan/longfellow-zk/tree/constraint-test-vector
4be42ea to
61bb982
Compare

We can't be certain that this works yet, because we don't yet have the Ligero prover set up which would produce a serialized proof we could compare against a test vector. Still, we can start code review. In parallel, I'm going to work on:
sumcheck::Proof::newalso emit the actual witness vector, so that we can extend the test onsumcheck::constraints::ProofConstraintsto check that the constraints it generated are consistent with witness values(c, j, k)triples for linear LHS, the b vector for linear RHS and a vector of(x, y, z)triples for the quad constraints) so we can test against thatI also suspect that in the near future, we'll want to move some modules or items out of the
sumcheckmodule, because things like constraints and the witness layout will also be used by the Ligero prover and verifier.longfellow-zkhas some of this stuff in azk-commonmodule, which could work for us.