Tracking issue for naive implementation: #1851
Feature description
Many zkVMs introduce custom AIRs to efficiently handle operations that would otherwise consume many cycles in the existing general-purpose virtual machine. These are commonly referred to as precompiles, accelerators, or builtins. In this issue, we’ll use the term precompiles, as it is currently the most widely recognized.
This issue outlines the motivation and design considerations for supporting precompiles in the Miden VM.
Motivating Use Cases
The primary motivation for precompiles is to enable efficient support for cryptographic primitives, particularly hashing and signature verification, which are expensive to express directly in general-purpose VM instructions. These operations are essential for compatibility with existing chains and protocols.
Case study: SP1 precompile architecture
As an example, SP1 implements a number of precompiles, each as a dedicated chip with a fixed trace width.
Overall, the widths of these chips are a few orders of magnitude larger than the current Miden VM design. Combined with a low degree bound/blowup factor, this leads to very large proofs and an increased cost for a recursive verifier.
One important feature of the SP1 design that we should consider, and perhaps take further, is that precompiles should focus on small operations that can be called from the VM. This helps keep proofs small, since increasing the trace length adds far less verification cost than widening the trace.
In order to better model the costs of these approaches, we would need to compare the relative costs of lengthening the trace against widening it. Depending on the workload, either can dominate the overall proving and verification cost.
Signature verification
The main focus would be support for ECDSA signatures, as these are widely used across existing chains.
Elliptic curve-based signature schemes require support for group operations, which themselves operate on non-native 256/384-bit fields. We would in any case require a precompile for the latter, which would be implemented by decomposing these bignums into 8/16-bit limbs and allowing operations modulo the prime characteristic.
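To make the limb decomposition concrete, here is a minimal Rust sketch assuming 16-bit limbs for a 256-bit field element. The `Fp256` type and `add_raw` helper are hypothetical names for illustration (not an existing Miden API), and the final reduction modulo the prime is omitted.

```rust
// Hypothetical layout for a 256-bit non-native field element, decomposed
// into 16-bit limbs (16 limbs * 16 bits = 256 bits). In the AIR, each limb
// fits comfortably within the native field and carries are range-checked.
const NUM_LIMBS: usize = 16;

#[derive(Clone, Copy)]
struct Fp256 {
    limbs: [u16; NUM_LIMBS], // little-endian, base-2^16
}

/// Schoolbook addition with carry propagation. A real precompile would also
/// reduce the result modulo the prime characteristic (omitted here) and
/// constrain every limb and carry with range checks.
fn add_raw(a: &Fp256, b: &Fp256) -> (Fp256, u16) {
    let mut out = [0u16; NUM_LIMBS];
    let mut carry = 0u32;
    for i in 0..NUM_LIMBS {
        let sum = a.limbs[i] as u32 + b.limbs[i] as u32 + carry;
        out[i] = (sum & 0xffff) as u16; // keep the low 16 bits in the limb
        carry = sum >> 16;              // at most 1 for an addition
    }
    (Fp256 { limbs: out }, carry as u16)
}

fn main() {
    let a = Fp256 { limbs: [0xffff; NUM_LIMBS] };
    let (_, carry) = add_raw(&a, &a);
    assert_eq!(carry, 1); // overflow past 256 bits surfaces in the carry
}
```

Multiplication works the same way but produces cross-limb products whose carries span several limbs, which is where most of the constraint cost lies.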
There are two ways to implement the curve operations: the VM itself can compute the group operation, issuing a precompile call for each non-native field operation, or a dedicated chip can compute the entire group operation natively.
The tradeoff between the two techniques affects trace sizes. In the former, we would reduce the trace width devoted to precompiles, at the expense of a longer VM trace due to the extra cycles required for each non-native field operation. The opposite happens in the second case.
One potential reason for the huge widths of the EC chips in SP1 is that they store all intermediate field elements in the chip. If these were instead stored in memory, the add/double chips could be narrower, at the expense of a longer field-operation chip.
In the setting where the VM computes the group operation, we would still be able to apply non-deterministic tricks to efficiently evaluate curve operations. Moreover, this would avoid having to specify a chip for each curve, relying only on a generic non-native field chip. Once implemented, it would likely be easy to add support for EdDSA and BLS by specifying different moduli.
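For instance, modular inversion, which dominates the cost of computing the slope in affine point addition, never needs to be computed in-circuit: the prover supplies the inverse as a hint through the advice provider, and the VM only checks a single multiplication. A minimal sketch over a toy prime field (the field and names are illustrative, not Miden APIs):

```rust
// Toy prime (2^31 - 1), a stand-in for the real 256-bit curve field that
// the non-native field precompile would handle.
const P: u128 = 2_147_483_647;

fn mul_mod(a: u128, b: u128) -> u128 { (a * b) % P }

/// Prover side: compute the hint (here via Fermat's little theorem,
/// i.e. d^(p-2) mod p using square-and-multiply).
fn inverse_hint(d: u128) -> u128 {
    let (mut base, mut exp, mut acc) = (d % P, P - 2, 1u128);
    while exp > 0 {
        if exp & 1 == 1 { acc = mul_mod(acc, base); }
        base = mul_mod(base, base);
        exp >>= 1;
    }
    acc
}

/// VM side: one multiplication and one comparison replace an in-circuit
/// inversion, e.g. when computing the slope (y2 - y1) / (x2 - x1) during
/// point addition.
fn check_inverse(d: u128, d_inv: u128) -> bool {
    mul_mod(d, d_inv) == 1
}

fn main() {
    let d = 123_456_789u128;
    let hint = inverse_hint(d);      // supplied via the advice provider
    assert!(check_inverse(d, hint)); // the only work done in-VM
}
```

The same pattern applies to square roots, divisions, and any other operation whose verification is cheaper than its computation.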
Hash functions
Implementing both Keccak and SHA2 enables compatibility with a wide range of applications.
SHA2 is used in Bitcoin, in standardized signature schemes, and by SP1 to commit to the public inputs of their proofs. Including precompiles for both `extend` and `compress` requires the VM to perform book-keeping of the hash state, reducing the precompile to a small repeated operation that would be expensive to compute manually (a sketch of the `extend` step follows below).
Keccak is used in many places in Ethereum, but it is also much more complex to arithmetize in an AIR due to the large number of bit decompositions it requires. There may however be ways to express the circuit more succinctly, by allowing a higher AIR constraint degree or finding ways to lay the computation out over multiple columns. Specializing the constraints to Goldilocks could also improve the trace width.
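To give a sense of the granularity, here is the standard SHA-256 message-schedule `extend` step (per FIPS 180-4); this is ordinary SHA-256, shown only to illustrate the kind of small repeated operation the precompile would prove while the VM keeps the surrounding state.

```rust
/// SHA-256 message schedule: expand 16 input words into 64 round words.
/// Each iteration is a handful of rotations, shifts, XORs, and additions,
/// which is cheap for a dedicated chip but costly as generic VM cycles.
fn sha256_extend(w: &mut [u32; 64]) {
    for i in 16..64 {
        let s0 = w[i - 15].rotate_right(7) ^ w[i - 15].rotate_right(18) ^ (w[i - 15] >> 3);
        let s1 = w[i - 2].rotate_right(17) ^ w[i - 2].rotate_right(19) ^ (w[i - 2] >> 10);
        w[i] = w[i - 16]
            .wrapping_add(s0)
            .wrapping_add(w[i - 7])
            .wrapping_add(s1);
    }
}
```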
VM Interface
This section focuses on the interaction between the existing Miden VM and a precompile-specific VM. We assume for now that they are separate VMs and that we would create separate proofs for each of them, making use of recursive verification to obtain a single proof at the end.
Depending on the context, we may want to verify each precompile proof separately or verify a single aggregated precompile proof. For simplicity, we will focus on the latter case, as this simplifies the recursion discussion by limiting it to a single recursive verifier call.
In order to call a precompile, we would introduce an opcode for the VM which could be called from any context. The idea is that the VM produces a commitment to all the precompile calls made over the course of the program execution, and passes this commitment as a public input to the precompile VM. The precompile VM would have to unpack the commitment and verify each call using the specialized chiplets. This requires both VMs to hash the same data, though we will explore an idea that avoids this by introducing a mechanism for efficient data sharing between the two proofs.
To implement the above, the kernel would expose a procedure which would be called with the precompile ID and its inputs. We'll refer to this pair as an instruction that can be verified by the precompile VM. The kernel maintains in memory a sponge from which we squeeze a commitment to all instructions requested by the VM. At every precompile call, the kernel absorbs the instruction into the sponge. In the epilogue of a VM program execution, the squeezed hash, along with the proof loaded non-deterministically from the advice provider, is recursively verified. If the verification succeeds, it means that all precompile instructions were valid.
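A minimal sketch of this bookkeeping, with illustrative names (`PrecompileLog`, `record`, and `finalize` are not the actual Miden kernel API) and a std hasher standing in for the RPO sponge so the snippet runs standalone:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct PrecompileLog {
    sponge: DefaultHasher, // stand-in for the VM's RPO sponge
}

impl PrecompileLog {
    fn new() -> Self {
        Self { sponge: DefaultHasher::new() }
    }

    /// Called by the kernel at every precompile invocation: absorb the
    /// instruction (precompile ID + inputs) into the running sponge.
    fn record(&mut self, precompile_id: u32, inputs: &[u64]) {
        precompile_id.hash(&mut self.sponge);
        inputs.hash(&mut self.sponge);
    }

    /// Called in the program epilogue: squeeze the commitment that is
    /// passed as a public input to the precompile VM (or attached to the
    /// wrapped proof, as discussed below).
    fn finalize(&self) -> u64 {
        self.sponge.finish()
    }
}

fn main() {
    let mut log = PrecompileLog::new();
    log.record(1, &[0xdead, 0xbeef]); // e.g. a SHA2 compress call
    log.record(2, &[42; 8]);          // e.g. a non-native field operation
    println!("instructions commitment: {:#x}", log.finalize());
}
```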
The above requires that the kernel knows the verification key for the precompile VM, which describes what kinds of instructions (and corresponding IDs) it can verify. The easiest path, at least initially, would be a monolithic precompile VM which supports all possible instructions. This is practical because it simplifies the recursive verification step by avoiding having to consider precompiles individually, at the expense of a more expensive verification procedure. There are likely ways to optimize this, where we consider the set of constraints for each precompile separately, so that the cost of verification is proportional to the number of precompiles actually invoked.
As mentioned earlier, the precompile VM receives a hash of all instructions as a public input. Its trace must include the RPO digest (or the output of whatever sponge the VM used to construct the commitment), unpack the instructions one by one, and send them to the relevant chips using a bus. Since the inputs to each precompile may have different sizes, the precompile chips are also responsible for calling the hash chip to unpack the inputs before verifying their correctness.
The advantage of letting the VM generate a hash of all the precompile instructions is that it allows us to support this mechanism without having to actually implement the precompile VM. The idea is as follows: in the epilogue of the VM, the kernel simply outputs the commitment to the precompile instructions as one of the public inputs of the VM proof. This proof is now incomplete, as it can only be verified when supplied with the instructions commitment. Therefore, we can wrap the STARK proof with a list of all the precompile inputs, from which the commitment can be derived. This enhanced proof can then be sent to a verifier who will, before verifying the STARK proof, iterate over all precompile inputs, verify them, and absorb them into a sponge in order to derive the instructions commitment. While this does increase the proof size linearly in the number of instructions, it allows us to expose precompiles to users much faster, and only requires changes at the VM kernel level.
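A sketch of the verifier-side flow for such an enhanced proof; every type and function here is a placeholder for illustration (`verify_instruction`, `verify_stark`, and the stand-in hasher are not miden-verifier APIs):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct Instruction { precompile_id: u32, inputs: Vec<u64> }

struct EnhancedProof {
    stark_proof: Vec<u8>,           // the (incomplete) VM proof
    instructions: Vec<Instruction>, // all deferred precompile calls
}

// Placeholder: natively re-check one precompile instruction
// (e.g. recompute the SHA2 digest, or check the ECDSA signature).
fn verify_instruction(_instr: &Instruction) -> bool { true }

// Placeholder: verify the STARK proof with the derived commitment
// supplied as the missing public input.
fn verify_stark(_proof: &[u8], _instructions_commitment: u64) -> bool { true }

fn verify_enhanced(proof: &EnhancedProof) -> bool {
    // Mirror the kernel's sponge to re-derive the instructions commitment.
    let mut sponge = DefaultHasher::new(); // stand-in for RPO
    for instr in &proof.instructions {
        if !verify_instruction(instr) {
            return false;
        }
        instr.precompile_id.hash(&mut sponge);
        instr.inputs.hash(&mut sponge);
    }
    // Only now check the STARK proof against the re-derived commitment.
    verify_stark(&proof.stark_proof, sponge.finish())
}

fn main() {
    let proof = EnhancedProof {
        stark_proof: vec![],
        instructions: vec![Instruction { precompile_id: 1, inputs: vec![42] }],
    };
    assert!(verify_enhanced(&proof));
}
```

The key point is that the verifier, not the prover, closes the gap between the incomplete STARK proof and the deferred precompile inputs.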
In an ideal world though, the VM would be able to call precompiles without having to hash the inputs, and would instead share the inputs directly with the precompile VM. For this, we consider an approach implemented at the prover level. In the same way as we can share instructions between both VMs using a commitment, we can equivalently commit to all instructions as a separate trace which is accessible by both VMs. The Miden VM would treat it as a read-only table, and a call to the precompile procedure would use a bus request to ensure the instruction is included in the table. On the precompile VM side, the constraints over this table would ensure that a bus request is made from each row to the relevant chip, so that the instruction table can be considered correct once all instructions have been verified by the relevant chiplets.
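One common way to realize such a bus is a LogUp-style argument; the sketch below shows the balance check over a toy prime field (an illustrative choice, and the concrete bus construction in Miden may differ). Giving each table row a multiplicity also covers the batching case discussed next, where some table entries go unused by an individual execution.

```rust
// Toy prime field (2^31 - 1); inputs are assumed reduced mod P.
const P: u128 = 2_147_483_647;

fn mul_mod(a: u128, b: u128) -> u128 { (a * b) % P }

fn inv_mod(x: u128) -> u128 {
    // x^(p-2) mod p via square-and-multiply (Fermat's little theorem)
    let (mut base, mut exp, mut acc) = (x % P, P - 2, 1u128);
    while exp > 0 {
        if exp & 1 == 1 { acc = mul_mod(acc, base); }
        base = mul_mod(base, base);
        exp >>= 1;
    }
    acc
}

/// `requests`: one entry per precompile call in the VM trace.
/// `table`: (instruction, multiplicity) pairs, one per shared-table row;
/// unused instructions simply carry multiplicity 0. The bus is balanced
/// iff sum 1/(alpha - v) over requests equals sum m/(alpha - v) over rows,
/// for a random challenge `alpha`.
fn bus_balanced(requests: &[u128], table: &[(u128, u128)], alpha: u128) -> bool {
    let req = requests.iter()
        .fold(0u128, |acc, &v| (acc + inv_mod((alpha + P - v) % P)) % P);
    let res = table.iter()
        .fold(0u128, |acc, &(v, m)| (acc + mul_mod(m, inv_mod((alpha + P - v) % P))) % P);
    req == res
}

fn main() {
    let alpha = 987_654_321;
    let requests = [7, 7, 9];              // the VM made three calls
    let table = [(7, 2), (9, 1), (13, 0)]; // 13 is in the table but unused
    assert!(bus_balanced(&requests, &table, alpha));
}
```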
While including an additional trace would usually require an extra Merkle tree opening for the verifier when verifying FRI queries, there are ways to avoid this extra opening.
Moreover, this technique can also be used in the context of a batched precompile proof covering multiple VM executions. The instruction table may contain instructions that are unused by an individual execution, but as long as all instructions in the table have been verified, we are sure that all deferred instructions across the different VM proofs are valid.
The main use case for deferred proving is to reduce the amount of proving performed by clients, instead letting a network prover prove the expensive computations. However, there are many different scenarios that enable this, and we will need to evaluate them against each other.
Roadmap
Why is this feature needed?
Efficient verification of expensive operations (signature verification and hashes)