Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added a UnivariateMonomial representation to reduce field ops in protogalaxy+sumcheck #10401

Merged
merged 34 commits into from
Dec 19, 2024

Conversation

zac-williamson
Copy link
Contributor

Summary:

client_ivc_bench.sh benchmark has been improved by approx 10% (26218ms vs 29306ms)

In both protogalaxy + sumcheck, the basic representation of the edge of the boolean hypercube is now a degree-1 monomial instead of a MAX_RELATION_DEGREE-degree monomial

The class UnivariateMonomial can efficiently evaluate low-degree monomial relations of up to degree-2. The relations in the relations directory have been reworked to perform initial low-degree algebraic computations using UnivariateMonomial, only converting to a full Monomial object once the UnivariateMonomial would otherwise exceed degree-2

Reason why we do all of this:

  1. for MegaFlavor, extend_edges was converting every flavour polynomial into a degree-11 Univariate. This was introducing 9 Fp additions * NUM_ALL_ENTITIES per row in the circuit. Given the sparse trace structure we are working with, this is a lot of computation that this PR makes redundant
  2. for each relation, we check if it can be skipped by typically calling is_zero on a selector. The selector poly is in Univariate form (MegaFlavor = degree-11) which is 11 Fp zero-checks. MegaFlavor has 9 skippable relations which is 99 Fp zero-checks. With the new degree-2 representation this is reduced to only 18 Fp zero-checks
  3. The number of raw Fp add and mul operations required to evaluate our relations is reduced. For example, in the permutation argument each */+ operation in the accumulate function was costing us 11 Fp muls/adds. It is cheaper to compute low-degree sub-terms in the coefficient representation before extend inginto point-evaluation representation

e.g. consider (in the protogalaxy case where challenges are degree-1 univariates) (w_i + \beta * S_i + \gamma) for i = 0,1,2,3. In coefficient representation this term can be computed with 8 Fp adds and 3 Fp muls. Extending into a degree-11 point evaluation form costs 18 Fp adds for a total of 26 Fp adds and 3 Fp muls.

In master branch, using Univariate<11> this computation costs us 20 Fp adds and 10 Fp muls. Assuming an add is 1/3 the cost of a mul, this makes the new approach cost 35 Fp add-equivalent operations vs 50 Fp add-equivalent

Overall in the new approach, the number of field operations to compute the permutation argument has reduced by 30%

Copy link
Contributor

github-actions bot commented Dec 4, 2024

Changes to circuit sizes

Generated at commit: 91ffc1a56dc870f358d9e252ab8bea60d25ba554, compared to commit: e23fd0dd95f193070e4a646ce5953d22e8a09476

🧾 Summary (100% most significant diffs)

Program ACIR opcodes (+/-) % Circuit size (+/-) %
rollup_base_public 0 ➖ 0.00% -14 ✅ -0.00%
rollup_base_private 0 ➖ 0.00% -14 ✅ -0.00%
rollup_block_root 0 ➖ 0.00% -42 ✅ -0.00%
rollup_block_merge 0 ➖ 0.00% -28 ✅ -0.00%
rollup_root 0 ➖ 0.00% -28 ✅ -0.00%
rollup_merge 0 ➖ 0.00% -28 ✅ -0.00%
parity_root 0 ➖ 0.00% -56 ✅ -0.00%
private_kernel_empty 0 ➖ 0.00% -14 ✅ -0.00%

Full diff report 👇
Program ACIR opcodes (+/-) % Circuit size (+/-) %
rollup_base_public 3,851,252 (0) 0.00% 12,617,703 (-14) -0.00%
rollup_base_private 3,619,773 (0) 0.00% 11,224,925 (-14) -0.00%
rollup_block_root 682,326 (0) 0.00% 4,293,357 (-42) -0.00%
rollup_block_merge 32,406 (0) 0.00% 1,900,773 (-28) -0.00%
rollup_root 32,390 (0) 0.00% 1,900,759 (-28) -0.00%
rollup_merge 1,791 (0) 0.00% 1,744,924 (-28) -0.00%
parity_root 4,290 (0) 0.00% 3,487,877 (-56) -0.00%
private_kernel_empty 967 (0) 0.00% 864,991 (-14) -0.00%

Copy link
Contributor

@jeanmon jeanmon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just looked at the AVM related changes.

@@ -121,6 +121,9 @@ class AvmFlavor {
using VerifierCommitmentKey = AvmFlavorSettings::VerifierCommitmentKey;
using RelationSeparator = AvmFlavorSettings::RelationSeparator;

// indicates when evaluating sumcheck, edges must be extended to be MAX_TOTAL_RELATION_LENGTH
static constexpr bool USE_SHORT_MONOMIALS = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please port this change in bb-pilcom/bb-pil-backend/templates/flavor.hpp.hbs
Alternatively, you can ping AVM team and we can do it.

Note that the recursive flavor .hpp counterpart is not generated (therefore no change there)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just updated flavor.hpp.hbs

@@ -25,6 +25,9 @@ template <typename BuilderType> class AvmRecursiveFlavor_ {

using Relations = AvmFlavor::Relations_<FF>;

// indicates when evaluating sumcheck, edges must be extended to be MAX_TOTAL_RELATION_LENGTH
static constexpr bool USE_SHORT_MONOMIALS = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to align it with the native flavor?
What I mean is to set it:
static constexpr bool USE_SHORT_MONOMIALS = NativeFlavor::USE_SHORT_MONOMIALS;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, updated

// Set the size corresponding to the number of rows in the execution trace
// Iterate over the prover polynomials' views corresponding to each proving key
for (size_t dpk_idx = 0; auto& get_all : prover_polynomials_views) {
// Iterate over all columns in the trace execution of an proving key and extract their value at row_idx.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Iterate over all columns in the trace execution of an proving key and extract their value at row_idx.
// Iterate over all columns in the trace execution of a proving key and extract their value at row_idx.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

typename Flavor::template ProverUnivariates<2> results;
// Set the size corresponding to the number of rows in the execution trace
// Iterate over the prover polynomials' views corresponding to each proving key
for (size_t dpk_idx = 0; auto& get_all : prover_polynomials_views) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_all can be made const I believe and made renamed view

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you clarify by what you mean by "renamed view"? I don't fully understand

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry "made renamed" was a typo, I meant rename to view and make const. But no strong preference to the name, was just thinking that you're iteration over something called views

@@ -115,7 +115,8 @@ transcript. These operations are taken care of by \ref bb::BaseTranscript "Trans
The Sumcheck output is specified by \ref bb::SumcheckOutput< Flavor >.
*/
template <typename Flavor> class SumcheckProver {

// PartiallyEvaluatedMultivariates OR ProverPolynomials
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move this comment above ProverPolynomials and PartiallyEvaluatedMultivariates

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@maramihali maramihali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of questions and requests for extra tests of interaction betweenUnivariate and UnivariateMonomial classes

@@ -54,6 +54,9 @@ template <typename BuilderType> class UltraRecursiveFlavor_ {
using NativeFlavor = UltraFlavor;
using NativeVerificationKey = NativeFlavor::VerificationKey;

// indicates when evaluating sumcheck, edges can be left as degree-1 monomials
static constexpr bool USE_SHORT_MONOMIALS = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static constexpr bool USE_SHORT_MONOMIALS = true;
static constexpr bool USE_SHORT_MONOMIALS = UltraFlavor::USE_SHORT_MONOMIALS;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -40,6 +40,8 @@ template <typename BuilderType> class MegaRecursiveFlavor_ {
using NativeFlavor = MegaFlavor;
using NativeVerificationKey = NativeFlavor::VerificationKey;

// indicates when evaluating sumcheck, edges can be left as degree-1 monomials
static constexpr bool USE_SHORT_MONOMIALS = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto here, use the value from the native flavor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call

// tmp *= scaling_factor;
// std::get<0>(evals) += tmp;

using MonomialAccumulator = typename Accumulator::MonomialAccumulator;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MonomialAccumulator can be used from above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call

@@ -83,46 +83,98 @@ template <typename FF_> class UltraArithmeticRelationImpl {
const FF& scaling_factor)
{
PROFILE_THIS_NAME("Arithmetic::accumulate");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me like everything has become a MonomialAccumulator so I think we could remove _m suffix from names, here and in all other relations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of the later terms in the relation will be of the Accumulator type (e.g. tmp0 in this file).

I figured as a general rule it might be easier to read what's going on by applying a heuristic that _m indicates the type is a MonomialAccumulator, given we use auto type declarations for frequently

@@ -40,6 +40,9 @@ class ECCVMFlavor {
using RelationSeparator = FF;
using MSM = bb::eccvm::MSM<CycleGroup>;

// indicates when evaluating sumcheck, edges must be extended to be MAX_TOTAL_RELATION_LENGTH
static constexpr bool USE_SHORT_MONOMIALS = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add to translator and eccvm recursive flavor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. I'm curious why the tests all passed without this. Do we not compile tests that use the sumcheck Prover for the translator and eccvm recursive flavours?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, now that I think about it, this flag is only useful for the prover, and given we only ever instantiate the verifiers with recursive flavors the flag is not necessary in them, my bad

@@ -391,42 +405,43 @@ template <class DeciderProvingKeys_> class ProtogalaxyProverInternal {
// doesn't skip computation), so we need to define types depending on the template instantiation
using ThreadAccumulators = TupleOfTuples;
using ExtendedUnivatiatesType =
std::conditional_t<skip_zero_computations, ExtendedUnivariates, ExtendedUnivariatesNoOptimisticSkipping>;

std::conditional_t<Flavor::USE_SHORT_MONOMIALS,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here as to whether we still do optimistic skipping

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we do still do optimistic skipping. I tried to trace where this type is used, and it seems to only be used when initialising the AllEntities data structure passed into the accumulate method for our relations.

In our relation classes, the Accumulator type is derived from the types in ContainerOverSubrelations. This in turn is the same as the type TupleOfTuplesOfUnivariates_ param in accumulate_relation_univariates from protogalaxy_prover_impl.hpp, which is the same as the TupleOfTuples param in compute_combiner which is, from what I can tell, the TupleOfTuplesOfUnivariates type defined in protogalaxy_prover.hpp and this type does support skipping

Phew. I might try to rationalise this a bit as I am not sure the template parameter propagation in the protogalaxy methods is required as the type used is defined in the class

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed a change which removes the template parameter types in compute_combiner and accumulate_relation_univariates. The only time we need to not optimitically skip seems to be in combiner.test.cpp (which is why we needed parametrisable types), so I created bespoke methods for combiner.test.cpp by defining a new class PGInternalTest that inherits from ProtogalaxyProverInternal and adds back in methods that enable not skipping.

While this is more lines of code overall, I think it makes ProtogalaxyProverInternal cleaner and easier to read, as there are fewer template parameter propagation chains to follow when examining the code. wdyt?

limb_subproduct += (w_1_shift_m * w_2_shift_m);
auto non_native_field_gate_1_m = limb_subproduct;
non_native_field_gate_1_m -= (w_3_m + w_4_m);
auto non_native_field_gate_1 = ShortAccumulator(non_native_field_gate_1_m) * ShortAccumulator(q_3_m);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will you add a brief comment as to why the ShortAccumulator transformation is required here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment

@@ -184,16 +185,18 @@ template <typename FF_> class DatabusLookupRelationImpl {
template <typename Accumulator, size_t bus_idx, typename AllEntities, typename Parameters>
static Accumulator compute_write_term(const AllEntities& in, const Parameters& params)
{
using MonomialAccumulator = typename Accumulator::MonomialAccumulator;
using View = typename Accumulator::View;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View and ParameterView seem to be not used anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

auto lagrange_ecc_op = View(in.lagrange_ecc_op);
using MonomialAccumulator = typename Accumulator::MonomialAccumulator;

auto w_1 = Accumulator(MonomialAccumulator(in.w_l));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we handle things differently here in comparison to other relations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the subrelation degree is very low (3) which makes it less efficient to use the coefficient basis, I think.

To do a degree-1 multiplication in the coefficient basis requires 3 Fp muls and 4 Fp adds (karatsuba multiplication). But a multiplication of a degree-3 Univariate only requires 3 Fp muls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a comment explaining this


auto q_pos_by_scaling_m = (q_poseidon2_internal_m * scaling_factor);
auto q_pos_by_scaling = Accumulator(q_pos_by_scaling_m);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to Accumulator is because we need extra values to properly perform subsequent computations right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

Moved methods in ProtogalaxyProverInternal that did not optimistically skip into a separate class `PGInternalTest` in `combiner.test.cpp`
renamed "MonomialAccumulator" to "CoefficientAccumulator"
Copy link
Contributor

@maramihali maramihali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates!

@@ -102,17 +102,17 @@ template <class Fr, size_t domain_end, size_t domain_start = 0, size_t skip_coun
evaluations[domain_end - 1] = prev;
}

// explicit operator UnivariateMonomial<Fr, 2, 0, 0>() const
// explicit operator UnivariateCoefficientBasis<Fr, 2, 0, 0>() const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stale?

@zac-williamson zac-williamson merged commit 15475f4 into master Dec 19, 2024
77 checks passed
@zac-williamson zac-williamson deleted the zw/monomial-accumulaator branch December 19, 2024 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants