diff --git a/yellow-paper/docs/cryptography/_category_.json b/yellow-paper/docs/cryptography/_category_.json
new file mode 100644
index 00000000000..ee3fba3fe8d
--- /dev/null
+++ b/yellow-paper/docs/cryptography/_category_.json
@@ -0,0 +1,8 @@
{
  "label": "Cryptography",
  "position": 10,
  "link": {
    "type": "generated-index",
    "description": "Aztec cryptography tech stack"
  }
}
diff --git a/yellow-paper/docs/cryptography/images/proof-system-components.png b/yellow-paper/docs/cryptography/images/proof-system-components.png
new file mode 100644
index 00000000000..b2fcee21f4e
Binary files /dev/null and b/yellow-paper/docs/cryptography/images/proof-system-components.png differ
diff --git a/yellow-paper/docs/cryptography/performance-targets.md b/yellow-paper/docs/cryptography/performance-targets.md
new file mode 100644
index 00000000000..4213b51486f
--- /dev/null
+++ b/yellow-paper/docs/cryptography/performance-targets.md
@@ -0,0 +1,193 @@
# Honk targets and win conditions

## Introduction & context

Aztec's cryptography tech stack and its associated implementation form an open-ended project with potential for many enhancements, optimisations and scope-creep.

This document is designed to definitively answer the following questions:

1. What are the metrics we care about when measuring our cryptography components?
1. What are minimum satisfiable values for these metrics?
1. What are the aspirational values for these metrics?

## Important Metrics

The following is a list of the relevant properties that affect the performance of the Aztec network:

* Size of a user transaction (in kb)
* Time to generate a user transaction proof
* Memory required to generate a user transaction proof
* Time to generate an Aztec Virtual Machine proof
* Memory required to generate an Aztec Virtual Machine proof
* Time to compute a 2-to-1 rollup proof
* Memory required to compute a 2-to-1 rollup proof

"MVP" = the minimum standards we can go to main-net with.

Note: gb = gigabytes (not gigabits, gibibits or gibibytes)

| metric | how to measure | MVP (10tps) | ideal (100tps) |
| --- | --- | --- | --- |
| proof size | total size of a user tx incl. goblin plonk proofs | 80kb | 8kb |
| prover time | a baseline "medium complexity" transaction (in a web browser); full description further down | 1 min | 10 seconds |
| verifier time | how long it takes the verifier to check a proof (incl. grumpkin IPA MSMs) | 20ms | 1ms |
| client memory consumption | fold 2^19 circuits into an accumulator an arbitrary number of times | 4gb | 1gb |
| size of the kernel circuit | number of gates | 2^17 | 2^15 |
| Aztec Virtual Machine prover time | 10,000 step VM circuit | 15 seconds | 1.5 seconds |
| Aztec Virtual Machine memory consumption | 1 million VM step circuit | 128gb | 16gb |
| 2-to-1 rollup proving time | 1 2-to-1 rollup proof | 7.4 seconds | 0.74 seconds |
| 2-to-1 rollup memory consumption | 1 2-to-1 rollup proof | 128gb | 16gb |

To come up with the above estimates, we are targeting 10 transactions per second for the MVP and 100 tps for the "ideal" case. We are assuming both block producers and rollup Provers have access to 128-core machines with 128gb of RAM. Additionally, we assume that the various processes required to produce a block consume the following:

| process | percent of block production time allocated to process |
| --- | --- |
| transaction validation | 10% |
| block building (tx simulation) | 20% |
| public VM proof construction time | 20% |
| rollup prover time | 40% |
| UltraPlonk proof compression time | 10% |

These are very rough estimates that could use further evaluation and validation!

### Proof size

The MVP targets a throughput of 10 transactions per second.

Each Aztec node (not a sequencer/prover, just a regular node that is sending transactions) needs to download `10*proof_size` bytes of data per second to keep track of the mempool. However, this is the *best case* scenario.

More practically, the data throughput of a p2p network will be less than the bandwidth of participants due to network coordination costs. As a rough heuristic, we assume that network bandwidth will be 10% of p2p user bandwidth. NOTE: can we find some high-quality information about p2p network throughput relative to the data consumed by p2p node operators?

As a result, the MVP data throughput could scale up to `100 * proof_size` bytes of data per second.

For an MVP we wish to target a maximum user bandwidth of 8MB per second (i.e. a good broadband connection). This gives us a network bandwidth of 0.8MB/s.

This sets the proof size limit to 819.2 kb per second across 10 transactions => roughly 82 kilobytes of data per transaction.

As a rough estimate, we can assume the non-proof tx data will be irrelevant compared to 82kb, so we target a proof size of $80$ kilobytes for the MVP.

To support 100 transactions per second we would require a proof size of $8$ kilobytes.
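This arithmetic can be reproduced with a short back-of-envelope calculation. The sketch below is illustrative only; the bandwidth and efficiency figures are the assumptions stated above.

```python
# Proof-size budget derived from the bandwidth assumptions above.
USER_BANDWIDTH = 8 * 1024 * 1024  # 8MB/s: a good broadband connection
P2P_EFFICIENCY = 0.10             # heuristic: network throughput ~= 10% of user bandwidth

network_bandwidth = USER_BANDWIDTH * P2P_EFFICIENCY  # 0.8MB/s

for tps in (10, 100):
    budget_kb = network_bandwidth / tps / 1024
    print(f"{tps} tps -> {budget_kb:.2f} kb per transaction")
# 10 tps -> 81.92 kb per transaction (target: 80kb, leaving room for non-proof tx data)
# 100 tps -> 8.19 kb per transaction (target: 8kb)
```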
### Prover time

The critical UX factor. To measure prover time for a transaction, we must first define a baseline transaction we wish to measure, along with the execution environment of the Prover.

As we build and refine our MVP, we want to avoid optimising for the best-case scenario (i.e. the most basic tx type, a token transfer). Instead, we want to ensure that transactions of "moderate" complexity are possible with consumer hardware.

As a north star, we consider a private swap, transposed into an Aztec contract.

To perform a private swap, the following must occur:

1. Validate the user's account contract (1 kernel call)
2. Call a swap contract (1 kernel call)
3. The swap contract initiates `transfer` calls on two token contracts (2 kernel calls)
4. A fee must be paid via our fee abstraction spec (1 kernel call)
5. A final "cleanup" proof is generated that evaluates state reads and processes the queues constructed by previous kernel circuits (1 kernel call + 1 function call; the cleanup proof)

In total we have 6 kernel calls and 6 function calls.

We can further abstract the above by making the following assumptions:

1. The kernel circuit is $2^{17}$ constraints
2. The average number of constraints per function call is $2^{17}$ constraints, but the first function called has $2^{19}$ constraints

Defining the first function to cost $2^{19}$ constraints is a conservative assumption, because the kernel circuit can support functions with a maximum of $2^{19}$ constraints. We want to ensure that our benchmarks (and possible optimisations) capture the "heavy function" case, and that we don't just optimise for lightweight functions.

#### Summary of what we are measuring to capture Prover time

1. A mock kernel circuit has a size of $2^{17}$ constraints and folds *two* Honk instances into an accumulator (the previous kernel and the function being called)
2. The Prover must prove 5 mock function circuit proofs of size $2^{17}$ and one mock function proof of size $2^{19}$
3. The Prover must iteratively prove 6 mock kernel circuit proofs
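As a sanity check on the workload this benchmark implies, the total constraints folded across the baseline transaction can be tallied as follows. This is an illustrative sketch using the mock circuit sizes assumed above.

```python
# Constraint workload of the baseline "private swap" benchmark.
KERNEL = 2**17          # mock kernel circuit size
FUNCTION = 2**17        # average mock function circuit size
HEAVY_FUNCTION = 2**19  # first (heaviest) function circuit size

kernel_constraints = 6 * KERNEL                       # 6 iterative kernel proofs
function_constraints = 5 * FUNCTION + HEAVY_FUNCTION  # 5 average + 1 heavy function proof

total = kernel_constraints + function_constraints
print(f"total constraints proven: {total:,}")  # 1,966,080 (~1.9M constraints)
```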
#### Execution environment

For the MVP we can assume the user has reasonable hardware. For this purpose we use a 2-year-old MacBook with 16gb of RAM. The proof must be generated in a web browser.

#### Performance targets

For an MVP, we target a 1 minute proof generation time. This is a substantial amount of time to ask a user to wait, and we are measuring on good hardware.

In an ideal world, a 10 second proof generation time would be much better for UX.

### Verifier time

This matters because verifying a transaction is effectively free work performed by sequencers and by the network nodes that propagate txns to the mempool. If verification time becomes too large, it opens up potential DDoS attacks.

If we reserve 10% of block production time for verifying user proofs, at 10 transactions per second this gives us 0.01s per transaction, i.e. 10ms per proof.

If the block producer has access to more than one physical machine that they can use to parallelise verification, we can extend the maximum tolerable verification time. For an MVP that requires 20ms to verify each proof, each block producer would require at least 2 physical machines to successfully build blocks.

100tps with one physical machine would require a verification time of 1ms per proof.

### Memory consumption

This is *critical*. Users can tolerate slow proofs, but if Honk consumes too much memory, a user cannot make a proof at all.

Safari on iPhone will purge tabs that consume more than 1gb of RAM. The WASM memory cap is 4gb, which defines the upper limit for an MVP.

### Kernel circuit size

Not a critical metric, but the prover time and prover memory metrics are predicated on a kernel circuit costing about $2^{17}$ constraints!

### AVM Prover time

Our goal is to hit main-net with a network that can support 10 transactions per second. We need to estimate how many VM computation steps will be needed per transaction to determine the required speed of the VM Prover. The following uses very conservative estimations due to the difficulty of estimating this.

An Ethereum block consists of approximately 1,000 transactions, with a block gas limit of roughly 10 million gas. Basic computational steps in the Ethereum Virtual Machine consume 3 gas. If the entire block gas limit were consumed by basic computation steps (not true, but let's assume so for a moment), this implies that 1,000 transactions consume 3.33 million computation steps, i.e. 10 transactions per second would require roughly 33,000 steps per second and 3,330 steps per transaction.

As a conservative estimate, let us assume that every tx in a block will consume 10,000 AVM steps.

Our AVM model is currently to evaluate a transaction's public function calls within a single AVM circuit. This means that a block of `n` transactions will require `n` public kernel proofs and `n` AVM proofs to be generated (assuming all txns have a public component).

If public VM proof construction consumes 20% of block time, we must generate 10 AVM proofs and 10 public kernel proofs in 2 seconds.

When measuring 2-to-1 rollup prover time, we assume we have access to a Prover network with 500 physical devices available for computation.

i.e. 10 proofs in 2 seconds across 500 devices => 1 AVM + public kernel proof in 25 seconds per physical device.

If we assume that ~10 seconds is budgeted to the public kernel proof, this would give a 15 second prover time target for a 10,000 step AVM circuit.

100 tps requires 1.5 seconds per proof.
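The step-count estimate can be reproduced with a short calculation. This sketch is illustrative only; the Ethereum figures are the rough assumptions stated above.

```python
# Rough AVM step-count estimate derived from Ethereum throughput figures.
BLOCK_GAS_LIMIT = 10_000_000  # rough Ethereum block gas limit
GAS_PER_STEP = 3              # gas cost of a basic EVM computation step
TXS_PER_BLOCK = 1_000         # approximate transactions per Ethereum block

steps_per_tx = BLOCK_GAS_LIMIT / GAS_PER_STEP / TXS_PER_BLOCK
print(f"~{steps_per_tx:,.0f} steps per transaction")
# ~3,333 steps per tx; we conservatively round this up to 10,000 AVM steps
```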
### AVM Memory consumption

A large AWS instance can provide 128gb of memory, which puts an upper limit on AVM RAM consumption. Ideally, consumer-grade hardware can be used to generate AVM proofs, i.e. 16gb.

### 2-to-1 rollup proving time

For a rollup block containing $2^d$ transactions, we need to compute 2-to-1 rollup proofs across $d$ layers (i.e. $2^{d-1}$ 2-to-1 proofs, followed by $2^{d-2}$ proofs, etc., down to requiring 1 final 2-to-1 proof). To hit 10tps, we must produce 1 block in $\frac{2^d}{10}$ seconds.

Note: this excludes network coordination costs, latency costs, block construction costs, public VM proof construction costs (which must be computed before the 2-to-1 rollup proofs), and the cost to compute the final UltraPlonk proof.

To accommodate the above costs, we assume that we can budget 40% of block production time towards making proofs. Given these constraints, the following table describes maximum allowable proof construction times for a selection of block sizes.

| block size | number of successive 2-to-1 rollup proofs | number of parallel Prover machines required for base layer proofs | time required to construct a rollup proof |
| --- | --- | --- | --- |
| $1,024$ | $10$ | $512$ | 4.1s |
| $2,048$ | $11$ | $1,024$ | 7.4s |
| $4,096$ | $12$ | $2,048$ | 13.6s |
| $8,192$ | $13$ | $4,096$ | 25.2s |
| $16,384$ | $14$ | $8,192$ | 46.8s |

We must also define the maximum number of physical machines we can reasonably expect to be constructing proofs across the Prover network. If we assume that $1,024$ machines are available, this caps the MVP proof construction time at 7.4 seconds.

Supporting a proof construction time of 4.1s would enable us to reduce the minimum hardware requirements for the Prover network to 512 physical machines.
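The timing column in the table above follows directly from the block-time and proving-budget assumptions. The sketch below is illustrative:

```python
# 2-to-1 rollup proof time budget: a block of 2^d transactions must be
# produced in 2^d / 10 seconds (10tps), with 40% of that time budgeted
# for rollup proving, split across d successive layers of 2-to-1 proofs.
TPS = 10
PROVING_BUDGET = 0.4

for d in range(10, 15):
    block_size = 2**d
    block_time = block_size / TPS                # seconds to produce one block
    per_layer = block_time * PROVING_BUDGET / d  # time budget per rollup layer
    print(f"block size {block_size:>6}: {d} layers, {per_layer:.1f}s per 2-to-1 proof")
# block size   1024: 10 layers, 4.1s per 2-to-1 proof
# ...
# block size  16384: 14 layers, 46.8s per 2-to-1 proof
```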
### 2-to-1 rollup memory consumption

Same rationale as for public VM (AVM) memory consumption.

diff --git a/yellow-paper/docs/cryptography/protocol-overview.md b/yellow-paper/docs/cryptography/protocol-overview.md
new file mode 100644
index 00000000000..c2681cc9b10
--- /dev/null
+++ b/yellow-paper/docs/cryptography/protocol-overview.md
@@ -0,0 +1,125 @@
# Proving System Components

# Interactive Proving Systems

## Ultra Plonk

UltraPlonk is a variant of the [PLONK](https://eprint.iacr.org/2019/953) protocol - a zkSNARK with a universal trusted setup.

UltraPlonk utilizes the "Ultra" circuit arithmetisation: a configuration with four wires per gate and the following set of gate types:

- arithmetic gate
- elliptic curve point addition/doubling gate
- range-check gate
- plookup table gate
- memory-checking gates
- non-native field arithmetic gates

## Honk

Honk is a variant of the PLONK protocol. Plonk performs polynomial testing by checking that a polynomial relation is zero modulo the vanishing polynomial of a multiplicative subgroup. Honk performs the polynomial testing by checking, using a sumcheck protocol, that a relation over multilinear polynomials vanishes when summed over the boolean hypercube.
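Schematically, the two polynomial tests can be contrasted as follows. This is a simplified sketch: $F$ denotes the full circuit relation, $H$ a multiplicative subgroup of size $n$, and $\tilde{F}$ the multilinear analogue of the relation over $d = \log_2 n$ variables. Plonk checks that there exists a quotient polynomial $Q(X)$ such that

$$F(X) = Q(X) \cdot Z_H(X), \qquad Z_H(X) = X^n - 1,$$

whereas Honk checks, over $d$ rounds of the sumcheck protocol, that

$$\sum_{\vec{x} \in \{0,1\}^d} \tilde{F}(\vec{x}) = 0.$$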
The first protocol to combine Plonk and the sumcheck protocol was [HyperPlonk](https://eprint.iacr.org/2022/1355).

Honk uses a custom arithmetisation that extends the Ultra circuit arithmetisation (not yet finalized, but it includes efficient Poseidon2 hashing).

# Incrementally Verifiable Computation Subprotocols

An Incrementally Verifiable Computation (IVC) scheme describes a protocol that defines some concept of persistent state and enables multiple successive proofs to evolve that state over time.

IVC schemes are used by Aztec in two capacities:

1. to compute a client-side proof of one transaction's execution
2. to compute a proof of a "rollup" circuit that updates rollup state based on a block of user transactions

Both use IVC schemes. Client-side, each function call in a transaction is a "step" in the IVC scheme. Rollup-side, aggregating two transaction proofs is a "step" in the IVC scheme.

The client-side IVC scheme is substantially more complex than the rollup-side scheme due to performance requirements.

Rollup-side, each "step" in the IVC scheme is a Honk proof, and these proofs are recursively verified. As a result, no protocols other than Honk are required to execute rollup-side IVC.

We perform one layer of ["proof-system compression"](https://medium.com/aztec-protocol/proof-compression-a318f478d575) in the rollup. The final proof of block-correctness is constructed as a Honk proof. An UltraPlonk circuit is used to verify the correctness of the Honk proof, so that the proof verified on-chain is an UltraPlonk proof. Verification gas costs are lower for UltraPlonk vs Honk due to the following factors:

1. Fewer precomputed selector polynomials, reducing Verifier G1 scalar multiplications
2. UltraPlonk does not use multilinear polynomials, which removes 1 pairing from the Verifier, as well as $O(\log n)$ G1 scalar multiplications

The following sections list the protocol components required to implement client-side IVC. We make heavy use of folding schemes to build an IVC scheme. A folding scheme enables instances of a relation to be folded into a single instance of the original relation, but in a "relaxed" form. Depending on the scheme, restrictions may be placed on the instances that can be folded.

The two main families of folding schemes are derived from the [Nova](https://eprint.iacr.org/2021/370) protocol and the [Protostar](https://eprint.iacr.org/2023/620) protocol respectively.

## Protogalaxy

The [Protogalaxy](https://eprint.iacr.org/2023/1106) protocol efficiently supports folding multiple Honk instances (describing different circuits) into the same accumulator. By contrast, the Nova/Supernova/Hypernova family of folding schemes assumes that a single circuit is being repeatedly folded (each Aztec function circuit is a distinct circuit, which breaks this assumption).

Protogalaxy is a variant of [Protostar](https://eprint.iacr.org/2023/620). Unlike Protostar, Protogalaxy enables multiple instances to be efficiently folded into the same accumulator instance.

The Protogalaxy protocol is split into two subprotocols, each modelled as an interactive protocol between a Prover and a Verifier.

#### Protogalaxy Fold

The "Fold" Prover/Verifier validates that `k` instances of a defined relation (in our case the Honk relation) have been correctly folded into an accumulator instance.

#### Protogalaxy Decider

The "Decider" Prover/Verifier validates whether an accumulator instance correctly satisfies the accumulator relation. The accumulator being satisfiable inductively shows that all instances that have been folded into it were satisfied as well. (Additional protocol checks are required to reason about *which* instances have been folded into the accumulator; see the [IVC specification](https://hackmd.io/h0yTcOHiQWeeTXnxTQhTNQ?view) for more information. Note to Zac: put this in the yellow paper!)

## Goblin Plonk

[Goblin Plonk](https://hackmd.io/@aztec-network/BkGNaHUJn/%2FGfNR5SE5ShyXXmLxNCsg3g) is a computation delegation scheme that improves Prover performance when evaluating complex algorithms.

In the context of an IVC scheme, Goblin Plonk enables a Prover to defer the non-native group operations required by a Verifier algorithm, across multiple recursive proofs, to a single step evaluated at the conclusion of the IVC Prover algorithm.

Goblin Plonk is composed of three subcomponents:

#### Transcript Aggregation Subprotocol

This subprotocol aggregates the deferred computations from two independent instances into a single instance.

#### Elliptic Curve Virtual Machine (ECCVM) Subprotocol

The ECCVM is a Honk circuit with a custom circuit arithmetisation, designed to optimally evaluate elliptic curve arithmetic computations that have been deferred. It is defined over the Grumpkin elliptic curve.

#### Translator Subprotocol

The Translator is a Honk circuit, defined over BN254, with a custom circuit arithmetisation, designed to validate that the input commitments of an ECCVM circuit align with the delegated computations described by a Goblin Plonk transcript commitment.

## Plonk Data Bus

When passing data between successive IVC steps, the canonical method is to do so via public inputs. This adds significant costs to an IVC folding verifier (or a recursive verifier, when not using a folding scheme): public inputs must be hashed prior to generating Fiat-Shamir challenges, and when this is performed in-circuit it adds a cost linear in the number of public inputs (with unpleasant constants: ~30 constraints per field element).

The Data Bus protocol eliminates this cost by representing cross-step data via succinct commitments instead of raw field elements.

The [Plonk Data Bus](https://aztecprotocol.slack.com/files/U8Q1VAX6Y/F05G2B971FY/plonk_bus.pdf) protocol enables efficient data transfer between two Honk instances within a larger IVC protocol.

# Polynomial Commitment Schemes

The UltraPlonk, Honk, Goblin Plonk and Plonk Data Bus protocols utilize Polynomial Interactive Oracle Proofs as a core component, and thus require polynomial commitment schemes (PCS).

Honk utilizes a multilinear PCS, while UltraPlonk utilizes a univariate PCS. The Plonk Data Bus and Goblin Plonk also utilize univariate PCS.

For multilinear polynomial commitment schemes, we use the [ZeroMorph](https://eprint.iacr.org/2023/917) protocol, which itself uses a univariate PCS as a core component.

Depending on context, we use the following two univariate schemes within our cryptography stack.

## KZG Commitments

The [KZG](https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf) polynomial commitment scheme requires a universal setup and is instantiated over a pairing-friendly elliptic curve.

Computing an opening proof of a degree-$n$ polynomial requires $n$ scalar multiplications, with a constant proof size and a constant verifier time.
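For reference, the standard KZG commitment and opening relations are sketched below (textbook notation; a simplified description rather than a normative spec of our implementation). Given a structured reference string $([1]_1, [\tau]_1, \ldots, [\tau^n]_1, [1]_2, [\tau]_2)$ for a secret evaluation point $\tau$, a polynomial $p(X)$ is committed to as $C = [p(\tau)]_1$. To open the commitment at a point $z$, the Prover computes the quotient $q(X) = \frac{p(X) - p(z)}{X - z}$ and sends $\pi = [q(\tau)]_1$; the Verifier accepts if

$$e(C - [p(z)]_1,\ [1]_2) = e(\pi,\ [\tau]_2 - [z]_2).$$

The single group element $\pi$ and the single pairing check give the constant proof size and constant verifier time noted above.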
## Inner Product Argument

The [IPA](https://eprint.iacr.org/2019/1177.pdf) PCS has worse asymptotics than KZG but can be instantiated over non-pairing-friendly curves.

We utilize the Grumpkin elliptic curve as part of the Goblin Plonk protocol, exploiting the curve cycle formed between BN254 and Grumpkin to translate expensive non-native BN254 group operations in a BN254 circuit into native group operations in a Grumpkin circuit.

Computing an opening proof of a degree-$n$ polynomial requires $2n$ scalar multiplications, with an $O(\log n)$ proof size and an $O(n)$ verifier time.

To batch-verify multiple opening proofs, we use the technique articulated in the [Halo](https://eprint.iacr.org/2019/1021) protocol. To compute a proof of a single rollup block, only one linear-time PCS opening proof is verified, despite multiple IPA proofs being generated as part of constructing the rollup proof.

# Combined IVC + Proving System Protocol

The following block diagrams describe the components used by the client-side and server-side Provers when computing client proofs and rollup proofs respectively.

![proof-system-components](../cryptography/images/proof-system-components.png)