Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ADVZ PayloadProver support requests that span multiple polynomial #438

Merged
merged 10 commits into from
Dec 5, 2023

Conversation

ggutoski
Copy link
Contributor

@ggutoski ggutoski commented Nov 29, 2023

Description

closes: #424

Naive implementation:

  • SmallRangeProof: concatenate individual KZG proofs from multiple polynomials, do not aggregate
  • LargeRangeProof: rebuild and verify multiple KZG commitments. There's not much more we could do to improve here. But verification inside a smart contract is now more complex because it might need to compute multiple KZG commitments instead of one.

Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (main)
  • Linked to GitHub issue with discussion and accepted design OR have an explanation in the PR that describes this work.
  • Wrote unit tests
  • Updated relevant documentation in the code
  • Added a relevant changelog entry to the Pending section in CHANGELOG.md
  • Re-reviewed Files changed in the GitHub PR explorer

@chancharles92
Copy link
Contributor

  • LargeRangeProof: rebuild and verify multiple KZG commitments. There's not much more we could do to improve here. But verification inside a smart contract is now more complex because it might need to compute multiple KZG commitments instead of one.

Can the smart contract just generate a random scalar and randomly combine the field vectors? Finally, it only needs to compute a single KZG commitment and compare it with the random linear combinations of the original commitments.

@ggutoski
Copy link
Contributor Author

ggutoski commented Dec 4, 2023

Can the smart contract just generate a random scalar and randomly combine the field vectors? Finally, it only needs to compute a single KZG commitment and compare it with the random linear combinations of the original commitments.

Interesting idea. Basically it replaces subsequent MSMs with a faster hash of the field vectors.

In any case I don't want to do it in this PR. 😉

primitives/src/vid/advz/payload_prover.rs Show resolved Hide resolved
primitives/src/vid/advz/payload_prover.rs Outdated Show resolved Hide resolved
);
let range_poly_byte = self.range_poly_to_byte_clamped(&range_poly, payload.len());
let offset_elem = self.offset_poly_to_elem(range_poly.start, range_elem.start);
let final_points_range_end =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the case where the range goes across n >= 3 polys, are there any advantages of using SmallRangeProof?
My argument is that if the range covers at least n-2 >= 1 full polys, compared to LargeRangeProof, the proof size and the verification time of SmallRangeProof won't be much more efficient even if we use KZG multiproof in the future. This is because the proof size is at least the number of evaluations (even if the KZG proof is O(1)-sized), and the verification is more than O(n*|poly_deg|) G-ops. (Recall that the KZG multi-verification still needs O(|eval_points|) group operations.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...the proof size is at least the number of evaluations (even if the KZG proof is O(1)-sized)

Why is that? A SmallRangeProof with constant-size KZG proof is constant-sized. The remaining fields are all constant-size. (The length of prefix_bytes and suffix_bytes is always less than 32.)

...the verification is more than O(n*|poly_deg|) G-ops.

Yes but I wonder whether the n-factor could be eliminated by aggregating multiproofs across multiple polynomials as described here. But that's a cutting edge optimization that will need to wait until later.

In any case, I doubt there is any advantage to the use of SmallRangeProof for large ranges--that's what LargeRangeProof is for. If your range spans multiple polynomials then LargeRangeProof is probably a better choice.

Is there a terminology misunderstanding? "Small" here refers to the size of the range of payload bytes that will be proven, not the size of the proof itself. (ie. [small range] proof vs. small [range proof] 😛 ) We tried to choose names that generalize beyond the narrow tx-namespace application. The intention is that Small should be used for tx proofs (typically only a small number of bytes) in a setting where a pairing is feasible (such as a off-chain light client). By contrast, Large should be used for namespace proofs (typically a large number of bytes) in a setting where a pairing is impractical (such as on-chain light client). Perhaps we could change these names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a terminology misunderstanding?

Oh the understanding of terminology is correct. My thought was: if range in SmallRangeProof::payload_proof() is not small (e.g., if it goes across >= 3 polynomials), the API should directly return an error and ask the user to use LargeRangeProof::payload_proof() instead (because there is no advantage of using SmallRangeProof::payload_proof() in this case). While in the current code, it is still dealing with this case and returns a valid but inefficient proof.

Why is that? A SmallRangeProof with constant-size KZG proof is constant-sized. The remaining fields are all constant-size. (The length of prefix_bytes and suffix_bytes is always less than 32.)

Because anyway you need to send the txs content to the verifier (my understanding of the so-called proof is data + auxiliary_proof). E.g., if the tx data is exactly a polynomial, in the LargeRangeProof case, you don't need to send any extra info (as the tx itself is already a proof); while here you need to additionally send an O(1)-sized KZG multiproof.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying.

My thought was: if range in SmallRangeProof::payload_proof() is not small (e.g., if it goes across >= 3 polynomials), the API should directly return an error and ask the user to use LargeRangeProof::payload_proof() instead (because there is no advantage of using SmallRangeProof::payload_proof() in this case). While in the current code, it is still dealing with this case and returns a valid but inefficient proof.

An alternative is to add another impl of PayloadProver that intelligently selects the proof type based on the user's input. The return type would be an enum with Small and Large variants. The existing impls for Small and Large remain as-is so that the user retains the ability to enforce a preference for Small or Large. (Example: perhaps the user always wants Large because it's pairing-free. I can't think of a reason to enforce Small at this time but you never know.)

... in the LargeRangeProof case, you don't need to send any extra info (as the tx itself is already a proof)

A LargeRangeProof needs enough auxiliary data to complete a polynomial. This auxiliary data could be quite large if for example the user's range just barely crosses into another polynomial. But yes, if the data range coincides with a polynomial then there's near-zero overhead.

Proposal

I propose we merge as-is and punt this question to a new issue. This discussion is about performance but our current goal is feature-completeness. Any opinions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New issue #441

@chancharles92
Copy link
Contributor

chancharles92 commented Dec 4, 2023

Interesting idea. Basically it replaces subsequent MSMs with a faster hash of the field vectors.

In any case I don't want to do it in this PR. 😉

In the interactive smart contract setting, we can even generate the randomness via oracles without doing Fiat-Shamir. But yes, there's no need to implement that for now.

Copy link
Contributor

@chancharles92 chancharles92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will approve after addressing the comment in #438 (comment)

@ggutoski ggutoski merged commit 080ff32 into main Dec 5, 2023
3 checks passed
@ggutoski ggutoski deleted the gg/424 branch December 5, 2023 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ADVZ PayloadProver support requests that span multiple polynomial
2 participants