Test vectors & benchmark for PCS #222
Conversation
* refactor: removed ROConstantsTrait
  - trait refactored to Default in associated type,
  - obviates the need for a separate trait,
  - refactored call sites and only implementation
* fix: remove unneeded Sized bounds
* fix: rename VanillaRO -> NativeRO
Force-pushed from 2ca3654 to fb570c5.
I think that I've got the bulk of the logic down. A sample of what the benches look like can be found here: https://phnmlt.csb.app/
Now I would like to iterate on what we actually display in the benchmark and how it should be formatted. Currently, the results are divided into two groups: proving and verifying. They are displayed per the power of two used to generate the polynomial we are testing over. I'm not sure this fully corresponds to what we wanted.
I started to look into the Throughput measurements to capture proof size, but I'm not sure how to capture a proof size in a generic manner.
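One possible direction for the proof-size question, sketched under the assumption that the proof type implements serde::Serialize so it can be measured generically; the helper name and the use of bincode are mine, not part of this PR, and note criterion would then report the size as bytes/second rather than a raw size.

```rust
use criterion::{measurement::WallTime, BenchmarkGroup, Throughput};
use serde::Serialize;

// Hedged sketch: report the serialized proof size as the group's throughput so it
// shows up alongside the timings. `bincode` is only used here to count bytes.
fn record_proof_size<P: Serialize>(group: &mut BenchmarkGroup<'_, WallTime>, proof: &P) {
    let proof_bytes = bincode::serialize(proof)
        .expect("proof should serialize")
        .len();
    group.throughput(Throughput::Bytes(proof_bytes as u64));
}
```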
Force-pushed from 7e1862b to 9fea337.
This mostly looks great; I left some inline comments for simplification.
A rebase would be great, as it would allow us to see improvements from #216.
src/spartan/polys/multilinear.rs
Outdated
let poly = Self::new(
  std::iter::from_fn(|| Some(Scalar::random(&mut rng)))
    .take(1 << num_vars)
    .collect(),
);
This should use `random` to avoid code duplication.
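A minimal sketch of the suggested reuse, assuming the existing `random` constructor takes the number of variables and an RNG (the exact signature may differ):

```rust
// Hypothetical rewrite of the snippet above: reuse the existing `random`
// constructor instead of rebuilding the evaluation vector inline.
let poly = Self::random(num_vars, &mut rng);
```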
benches/pcs.rs
Outdated
fn deterministic_assets<E: Engine, EE: EvaluationEngineTrait<E>, R: CryptoRng + RngCore>(
The name throws me off: if you have an `&mut rng` as an argument, are you creating something deterministic?
(I notice you are using this with a seeded RNG, but the determinism isn't an invariant upheld by your function, rather by your usage of it.)
You're right that the naming here is off. I went with `deterministic...` because it was a pre-determined polynomial size used to generate the assets. I've moved the implementation to be over `BenchAssets` and renamed the method to `from_num_vars`; it should be more explicit this way.
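A rough sketch of what the reworked helper could look like; the struct fields and trait bounds are assumptions based on this thread, not the PR's actual code:

```rust
// Hypothetical shape: the shared benchmark assets live in a struct, and the
// constructor takes the number of variables plus a caller-provided RNG, so any
// determinism comes from the caller's seed rather than from this function.
struct BenchAssets<E: Engine, EE: EvaluationEngineTrait<E>> {
    poly: MultilinearPolynomial<E::Scalar>,
    point: Vec<E::Scalar>,
    eval: E::Scalar,
    prover_key: EE::ProverKey,
    verifier_key: EE::VerifierKey,
}

impl<E: Engine, EE: EvaluationEngineTrait<E>> BenchAssets<E, EE> {
    fn from_num_vars<R: CryptoRng + RngCore>(num_vars: usize, rng: &mut R) -> Self {
        // Generate a random polynomial and point, evaluate it, and set up the
        // evaluation engine keys; the body is elided in this sketch.
        unimplemented!()
    }
}
```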
src/spartan/polys/multilinear.rs
Outdated
@@ -73,6 +77,27 @@ impl<Scalar: PrimeField> MultilinearPolynomial<Scalar> {
  )
}

/// Returns a random polynomial and a point, and computes its evaluation.
pub fn random_with_eval<R: RngCore + CryptoRng>(
The use case for generating a random polynomial, a random point, and the corresponding evaluation is pretty narrow. So it may be quite confusing to see this in the public API.
Do we need this somewhere not in benchmark/testing code, or can we just move it to benchmark/testing code where it's needed?
If we need this both in benchmarks and testing, can we just inline this short function?
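For context, a hedged sketch of what inlining could look like at a test or bench call site; the `random` signature is assumed, while `evaluate_with` and `evaluations` appear elsewhere in this PR:

```rust
// Hypothetical inlined version: build the random polynomial and point directly
// and evaluate, instead of exposing a public `random_with_eval` helper.
let poly = MultilinearPolynomial::random(num_vars, &mut rng);
let point = (0..num_vars)
    .map(|_| <E as Engine>::Scalar::random(&mut rng))
    .collect::<Vec<_>>();
let eval = MultilinearPolynomial::evaluate_with(poly.evaluations(), &point);
```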
This was actually an update to my code based on a discussion with @adr1anh, who pointed out that if we had a `random` we could have a `random_with_eval`.
What you're saying about it being a test function is true, and the same holds for `random`. I could remove both from the public API and instead inline their code where we have some usage.
Would that be alright with you?
The use case for `random` is a bit larger than `random_with_eval`, so I'd tend to leave `random` and move `random_with_eval`. I think my overall point is that as long as we don't add things to the public API of Arecibo, any decision we take is a 2-way door.
Moved `random_with_eval` only to eval :)
benches/pcs.rs
Outdated
// Macro to generate benchmark code for multiple engine and evaluation engine types
macro_rules! benchmark_engines {
  ($ell:expr, $rng:expr, $group:expr, $internal_fn:expr, $( ($engine:ty, $eval_engine:ty, $engine_name:expr) ),*) => {
You can use `type_name!` to avoid the need for the last parameter.
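Presumably this refers to something like `std::any::type_name`; a rough sketch of how the extra name parameter could be dropped inside the macro (assumed, not the PR's actual change):

```rust
// Hypothetical macro fragment: derive a display label from the engine type via
// std::any::type_name, so callers no longer pass a separate $engine_name.
macro_rules! benchmark_engines {
    ($ell:expr, $rng:expr, $group:expr, $internal_fn:expr, $( ($engine:ty, $eval_engine:ty) ),*) => {
        $(
            let engine_name = std::any::type_name::<$engine>();
            // ...use `engine_name` when building the criterion benchmark ids...
        )*
    };
}
```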
benches/pcs.rs
Outdated
macro_rules! benchmark_engines {
  ($ell:expr, $rng:expr, $group:expr, $internal_fn:expr, $( ($engine:ty, $eval_engine:ty, $engine_name:expr) ),*) => {
    $(
      let mut assets = deterministic_assets::<$engine, $eval_engine, StdRng>($ell, &mut $rng);
Once you populate the `eval_engine`, type inference should populate the first parameter. IOW, this should work and remove the need for the `engine` param:
deterministic_assets::<_, $eval_engine, StdRng>($ell, &mut $rng);
src/provider/util/mod.rs
Outdated
/// Generates a random polynomial and point from a seed to test a proving/verifying flow of one
/// of our EvaluationEngine over a given Engine.
pub(crate) fn prove_verify_from_ell<E: Engine, EE: EvaluationEngineTrait<E>>(ell: usize) {
This function looks great, though I'm not sure about its name: `ell` isn't standard nomenclature.
Here are things that would help me grok what this function is for out of context, and that you could use to edit the comment, the name of this function, or more likely both (a rough sketch follows after this list):
- make it clear we're testing the prove and verify flow of Multilinear Polynomial Commitment Schemes (PCS), which in Nova are represented by `EvaluationEngineTrait` implementations for proving,
- replace `ell` with the more evocative `num_vars`, which becomes clear once the first bullet point has been stated clearly,
- avoid the mention of an `Engine`, which is implied and selected by a choice of `EvaluationEngineTrait` implementation.
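A hedged sketch of a rename along those lines; the doc wording and the elided body are placeholders, not the final code:

```rust
/// Tests the prove/verify flow of a multilinear polynomial commitment scheme
/// (represented by an `EvaluationEngineTrait` implementation) over a random
/// polynomial in `num_vars` variables generated from a fixed seed.
pub(crate) fn prove_verify_from_num_vars<E: Engine, EE: EvaluationEngineTrait<E>>(num_vars: usize) {
    // Same body as the current `prove_verify_from_ell`, with `ell` renamed to `num_vars`.
    unimplemented!()
}
```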
  },
)
.unzip();
.zip_eq(offsets_of_x)
Weird that `cargo fmt` didn't complain with the original indentation!?
Yes, pretty strange. I kept the update as it seemed OK.
One issue that only pops off the page now: at large values of ell, the generation of a proof is expensive, as you've no doubt noticed. Nonetheless, the output is completely deterministic.
One bad thing that's obscured by the `benchmark_engines` macro is that you're generating assets for a specific size (and EE choice), benching the proving, then re-generating those assets for the verification, and finally benching the verification. The assets could and should be shared between proving and verification, I think.
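A hedged sketch of the sharing idea; the helper names (`prove_with`, `verify_with`), the seed constant, and the group layout are placeholders, and in the real benchmark the engine types would come from the macro:

```rust
// Hypothetical reorganization: build the assets once per (size, evaluation engine)
// and reuse the same instance for both the proving and the verifying benchmark,
// instead of regenerating them inside each group.
let mut rng = StdRng::seed_from_u64(BENCH_SEED); // BENCH_SEED: assumed constant
let assets = BenchAssets::<E, EE>::from_num_vars(num_vars, &mut rng);

{
    let mut proving_group = c.benchmark_group("PCS-Proving");
    proving_group.bench_function(format!("prove/{num_vars}"), |b| b.iter(|| prove_with(&assets)));
    proving_group.finish();
}
{
    let mut verifying_group = c.benchmark_group("PCS-Verifying");
    verifying_group.bench_function(format!("verify/{num_vars}"), |b| b.iter(|| verify_with(&assets)));
    verifying_group.finish();
}
```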
benches/pcs.rs
Outdated
criterion_main!(pcs);

const TEST_ELL: [usize; 11] = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20];
I'm not sure we need all points in the [10-20] interval. [10, 12, 14, 16, 18, 20] seems amply sufficient to me.
/cc @adr1anh
Given that the protocols we are benchmarking are linear in `ell`, I think it's fair to sample fewer iterations, as we should be able to interpolate the results to the odd sizes.
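For instance, the swept sizes could shrink to even values only (a sketch, not necessarily the PR's final constant):

```rust
// Hypothetical reduced sweep: even sizes only, relying on the linearity in `ell`
// to interpolate the intermediate points.
const TEST_ELL: [usize; 6] = [10, 12, 14, 16, 18, 20];
```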
if evaluate_bad_proof {
  // Generate another point to verify proof. Also produce eval.
  let altered_verifier_point = point
    .iter()
    .map(|s| s.add(<E as Engine>::Scalar::ONE))
    .collect::<Vec<_>>();
  let altered_verifier_eval =
    MultilinearPolynomial::evaluate_with(poly.evaluations(), &altered_verifier_point);
Instead of running the verification twice, you can do something like:

```rust
let (point, eval) = if evaluate_bad_proof {
  (altered_verifier_point, altered_verifier_eval)
} else {
  (point, eval)
};

// Verify the proof; it should fail exactly when `evaluate_bad_proof` is set.
let mut verifier_transcript = E::TE::new(b"TestEval");
let res = EE::verify(
  &verifier_key,
  &mut verifier_transcript,
  commitment,
  &point,
  &eval,
  &proof,
);
assert_eq!(res.is_err(), evaluate_bad_proof);
```
That's true; however, this would also change how the method behaves. Currently, if `evaluate_bad_proof` is set to `true`, the method tests both a valid and an invalid proof, while your proposal means I only test one of the two cases. The downside is that I would then have to call this helper method twice for unit tests, doubling the number of `prove` calls instead.
Even after this comment, if you still think I should go with your proposal I'll be happy to do it :)
LGTM!
Obviously, it is hard to make statements about performance without running benchmarks on a variety of `ell` values, but I would maybe consider specifying the `ell` as a single value via the command line, which can be useful when running a single instance of the benchmark on a developer's machine in order to get very rough information on whether performance has increased / decreased / stayed unchanged after some PR is merged.
benches/pcs.rs
Outdated
{
  let mut proving_group = c.benchmark_group("PCS-Proving");
  proving_group
    .sampling_mode(SamplingMode::Flat)
As far as I understand, `Flat` is only suitable for very long-running benchmarks. Probably `Auto` can be more appropriate for the short-running ones, where `ell` is say 10 / 11 / 12. Does it make sense to split the benchmarks into short / long running categories?
Funny enough: in general the `Flat` sampling mode is fine for most cases ... except I looked at the sample distribution for this PR, and the standard deviation for ZM verification at ell = 10 doesn't look great. You're probably right @storojs72, good catch!
Though I would note one very simple way of addressing this is noting the bench runtimes, which are reasonable (<20s per ell setting, see the bench log on my M1 Mac at https://gist.github.com/huitseeker/015f07534f410e94d615684ecd86876f), re-running them without setting the sampling mode (defaulting to `Auto`), and seeing if the runtimes are still reasonable.
As we have removed all odd sizes, we are only left with 10 and 12 from your example. I'm not sure there is a need to split the code into two cases.
I would either make it `Auto` for all (might not be great for the higher-end numbers of variables) or `Flat` for all (losing a few iterations on smaller numbers of variables, but we should still get enough to have convincing results).
I'm explicitly suggesting Auto for all, and then revisiting the question if the overall bench runtime is larger than say 15 min.
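For reference, a minimal sketch of a group left on the default (`Auto`) sampling mode; the numbers are illustrative, not the PR's settings:

```rust
// Hypothetical group setup: leave the sampling mode at its default (Auto) and
// only bound the sample count and measurement time.
let mut proving_group = c.benchmark_group("PCS-Proving");
proving_group
    .sample_size(10)
    .measurement_time(std::time::Duration::from_secs(10));
```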
I don't know why I couldn't see your comment when I put mine up. I'll update to `Auto`.
benches/pcs.rs
Outdated
)
.unwrap();

BenchAssests {
BenchAssets?
Assets used during our benchmark that are shared between the proving and verification steps. Useful to avoid having lengthy method signatures.
A way to do this is to have the ELL array (currently a constant) be overridden by an environment variable. We can probably open an issue and tackle this after merging this PR.
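A hedged sketch of that follow-up; the variable name and the parsing are placeholders, not part of this PR:

```rust
// Hypothetical override: read a comma-separated list of sizes from PCS_BENCH_ELL,
// falling back to the compile-time TEST_ELL constant when the variable is unset.
fn bench_sizes() -> Vec<usize> {
    match std::env::var("PCS_BENCH_ELL") {
        Ok(list) => list
            .split(',')
            .map(|v| v.trim().parse().expect("PCS_BENCH_ELL entries must be integers"))
            .collect(),
        Err(_) => TEST_ELL.to_vec(),
    }
}
```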
Force-pushed from 61a25a5 to 1293d9b.
* test(ipa): proof verified
* test(ipa): switch from keccak to grumpkin transcript
* test(ipa): wip test IPA prove & verify
* chore: added Jetbrain config to gitignore
- Implemented test_fail_bad_proof, which generates a test case from a seed to try and verify an invalid proof.
- Implemented both test_fail_bad_proof & test_from_seed for IPA over Grumpkin, MLKZG over Bn256 and ZM over Bn256.
- Created a benchmark for proof generation and verification for IPA over Grumpkin, MLKZG over Bn and ZM over Bn.
- Made multilinear.rs public for benchmark purposes.
- Refactored generic test methods per @adr1anh's comment.
Force-pushed from 181cb54 to 452f95c.
I've re-run the benches on the same machine and the end-to-end runtime stands at 00:16:30, which I deem acceptable.
This looks great @tchataigner, thank you so much!
See #222, the commit_open_prove_verify test is subsumed by the subsequent unit test. @tchataigner was right from the start.
Goal of this PR
The goal of this PR is to implement test vectors & benchmark for PCS.
Current progress
Implemented two generic test functions that we can use over any `<Engine, EvaluationEngine<Engine>>` for both the correct prove/verify flow and a bad-proof verification flow.
Left TODO
Related issue
This is part of tackling issue #212.