[MPT] Misc refactoring #972

Brechtpd · 2022-12-07T18:27:02Z

Some refactoring which I believe decreases code duplication and increases code readability.

Some TODOs:

The code required to get the previous rlc/mult data or to do the inclusion-in-parent check is quite complicated because of the different node types. Instead, it may be better to store this data in a fixed location so a node can simply use it directly instead of having to figure out on its own where to find it. This is cleaner because each node can then decide on its own how these should be handled.
Currently RLP decoding is done using a couple of selectors that are inputs from the prover. There are some checks that these are correct, though the checks are not complete. I think it would be easier to reason about if we just used a lookup to directly verify that these selectors are set correctly, so we don't have to worry about edge cases or cases that are hard to constrain using custom gates.
There are currently many cases in the main state machine because each row type is its own state. However, there is not really any reuse of custom gates between these rows, except for branches. It'll likely be quite a bit simpler to just have a single state for account, storage and extension and just use multiple rows in those states.
May be a good idea to split up branches and extension nodes. (semi done)
A couple of circuit tools were added in this PR to make writing the MPT circuit more manageable. A lot of these tools can still be greatly improved.
The number of lookups has been reduced a lot, but there are still many optimization possibilities (I have not looked into reducing the expression degree, for example).
The circuit uses a fixed layout of around 100 columns (many of which need to be byte constrained, so it needs a lot of lookups as well). This makes the circuit quite a bit denser than probably required. A more flexible way to manage the required data, so that the width/height can be chosen, would be very useful I think.
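To make the lookup idea for the RLP selectors concrete, here is a minimal sketch of a fixed table keyed by the first RLP byte. The byte ranges follow the RLP encoding rules; the names `RlpKind`, `RlpRow`, `build_rlp_table` and `lookup` are hypothetical and are not taken from this PR, and the `lookup` function only stands in for a real lookup argument.

```rust
/// Classification of an RLP item by its first byte (standard RLP rules).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RlpKind {
    SingleByte,  // 0x00..=0x7f: the byte is its own encoding
    ShortString, // 0x80..=0xb7: string payload of 0..=55 bytes
    LongString,  // 0xb8..=0xbf: length-of-length header
    ShortList,   // 0xc0..=0xf7: list payload of 0..=55 bytes
    LongList,    // 0xf8..=0xff: length-of-length header
}

/// One row of the hypothetical fixed table.
/// For short items `len` is the payload length; for long items it is the
/// number of bytes that encode the payload length.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct RlpRow {
    byte: u8,
    kind: RlpKind,
    len: u8,
}

/// Build all 256 rows, one per possible first byte.
fn build_rlp_table() -> Vec<RlpRow> {
    (0u8..=255)
        .map(|byte| {
            let (kind, len) = match byte {
                0x00..=0x7f => (RlpKind::SingleByte, 1),
                0x80..=0xb7 => (RlpKind::ShortString, byte - 0x80),
                0xb8..=0xbf => (RlpKind::LongString, byte - 0xb7),
                0xc0..=0xf7 => (RlpKind::ShortList, byte - 0xc0),
                0xf8..=0xff => (RlpKind::LongList, byte - 0xf7),
            };
            RlpRow { byte, kind, len }
        })
        .collect()
}

/// Stand-in for a lookup argument: the prover's claimed selectors are
/// accepted only if the exact row exists in the fixed table.
fn lookup(table: &[RlpRow], claimed: RlpRow) -> bool {
    table.contains(&claimed)
}
```

With a table like this, a wrongly set selector (for example claiming a list for the byte 0xa0) simply has no matching row, so no edge-case constraints are needed in custom gates.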

miha-stopar · 2022-12-14T16:07:34Z

Wonderful! Thanks Brecht!

Didn't see at first the PR is still a draft - I removed the approval for now, will wait till it's finished. Nonetheless, I checked the first two commits and it really looks great!

miha-stopar · 2023-02-14T13:03:46Z

zkevm-circuits/src/mpt_circuit/branch/branch_key.rs

+                    // Currently, the extension node S and extension node C both have the same key RLC -
+                    // however, sometimes extension node can be replaced by a shorter extension node
+                    // (in terms of nibbles), this is still to be implemented.
+                    // TODO: extension nodes of different nibbles length


The TODOs related to extension nodes of different lengths can be removed. Some additional rows were needed to cover these cases, and the constraints are implemented in those rows, so no changes to the constraints in the extension node rows will be needed (the additional rows are implemented in #914).

miha-stopar · 2023-02-14T13:37:19Z

zkevm-circuits/src/mpt_circuit/branch/extension_node.rs

+            // - hashed branch has RLP_HASH_VALUE at c_rlp2 and hash in c_advices,
+            // - non-hashed branch has 0 at c_rlp2 and all the bytes in c_advices
+            // TODO(Brecht): why different layout for hashed values? If for hash detection
+            // just do == 32?


This way it was easier to check whether the branch is hashed or not (just using c_rlp2 * c160_inv). But I can change the witness generator quickly if you think it would be better to have the same layout for both cases (but more complex expression).
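The `c_rlp2 * c160_inv` trick can be sketched as follows. This is only an illustration: it assumes `RLP_HASH_VALUE` is 160 (0x80 + 32, the RLP prefix of a 32-byte string, as the name `c160_inv` suggests), and it uses a small prime modulus instead of the circuit's actual field. The helper names are hypothetical.

```rust
/// Small prime modulus (2^31 - 1) used only for illustration; the real
/// circuit works over a much larger field.
const P: u64 = 2_147_483_647;

/// Assumed value: RLP prefix of a 32-byte string, 0x80 + 32 = 160.
const RLP_HASH_VALUE: u64 = 160;

/// Modular exponentiation by squaring.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Field inverse via Fermat's little theorem: a^(P-2) mod P.
fn inv(a: u64) -> u64 {
    pow_mod(a, P - 2)
}

/// Because `c_rlp2` is either 160 (hashed) or 0 (non-hashed), multiplying
/// by the constant inverse of 160 yields a 1/0 selector of degree one.
fn is_hashed(c_rlp2: u64) -> u64 {
    c_rlp2 % P * inv(RLP_HASH_VALUE) % P
}
```

The result is only a valid boolean when `c_rlp2` is constrained to be exactly 0 or 160; any other value would produce a non-boolean field element.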

miha-stopar · 2023-03-01T13:14:51Z

zkevm-circuits/src/mpt_circuit/helpers.rs

+    ) -> Self {
+        // TODO(Brecht): strangely inconsistent between storage/account (see the need of
+        // for_placeholder_s). Something more similar to how the drifted key
+        // works (s and c cases separately makes more sense to me).


Yes, it should be the same for storage and account (both either using S or C). It works as it is, because in the non-existing proof both (S and C) proofs are the same. But we should decide on which one to use and then be consistent.

miha-stopar · 2023-03-01T13:26:03Z

zkevm-circuits/src/mpt_circuit/helpers.rs

                )
            };
+            // TODO(Brecht): somehow the start index doesn't dependend on the list
+            // is_short/is_long?


Yeah, I should mention this more in the code: is_long is not fully implemented. It's very unlikely that the number of extension node nibbles would be big enough to make the extension node longer than 55 bytes (with 32 bytes taken by the branch hash, that would mean around 40 nibbles, i.e. 20 bytes), so I decided not to continue with the is_long implementation. I didn't want to delete the parts that were already implemented, though. Maybe it would be better to do so?

Brechtpd · 2023-03-01

If it's possible we should implement it! It'd be great if there's a test case for this so it can be tested.

miha-stopar · 2023-03-02

Yes, agreed. It should be easier now with all the tooling you provided :). I will prepare a test and then try to prepare a PR with the long implementation.
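The "around 40 nibbles" estimate can be checked with a small length calculation. This is a sketch under stated assumptions: the extension node is the RLP list [key, branch hash], the branch is referenced by its 32-byte hash (33 bytes once RLP-encoded), and the key uses the usual hex-prefix (compact) encoding. The function names are hypothetical and are not part of the circuit code.

```rust
/// RLP length of a 32-byte hash: one prefix byte (0xa0) plus 32 bytes.
const HASHED_BRANCH_RLP_LEN: usize = 33;

/// Bytes in the hex-prefix (compact) encoding of `nibbles` key nibbles:
/// one flag byte (which also holds the first nibble when the count is odd)
/// plus the remaining nibbles packed two per byte.
fn compact_key_len(nibbles: usize) -> usize {
    nibbles / 2 + 1
}

/// RLP length of the compact key as a byte string. A one-byte key is its
/// own encoding here because the flag byte of an extension key is below
/// 0x80; longer keys (at most 33 bytes) get a one-byte prefix.
fn key_rlp_len(nibbles: usize) -> usize {
    let len = compact_key_len(nibbles);
    if len == 1 { 1 } else { 1 + len }
}

/// Payload length of the extension node list: key item plus hash item.
fn ext_payload_len(nibbles: usize) -> usize {
    key_rlp_len(nibbles) + HASHED_BRANCH_RLP_LEN
}

/// An RLP list needs the long form once its payload exceeds 55 bytes.
fn ext_is_long(nibbles: usize) -> bool {
    ext_payload_len(nibbles) > 55
}
```

Under these assumptions the long form is first needed at 42 nibbles, which is consistent with the rough estimate above and shows why a dedicated test case is needed to hit this path.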

miha-stopar · 2023-03-01T14:07:38Z

zkevm-circuits/src/mpt_circuit/branch.rs

                        }
+                        (ext_node_rlc.expr(), config.ext_rlp_key.num_bytes(), config.ext_is_not_hashed.expr())
+                    };
+                    // TODO(Brecht): why not if it's a placeholder?


A placeholder branch just fills the rows (mostly, except for the drifted_node RLC) to preserve the parallel layout. So in a proof with a placeholder branch, what needs to be checked is that the hash of the leaf is in the branch above the placeholder branch.

leolara · 2023-03-01T16:56:10Z

I would like to propose this way of querying the columns; I think it is more readable:

Brechtpd#8

miha-stopar · 2023-03-08T16:11:11Z

zkevm-circuits/src/mpt_circuit/helpers.rs

+            root_prev: values[3],
+            root: values[4],
+        })
+    }


I think I now understand the approach with store/load lookups. I very much like it! I would only perhaps change the names of the functions:

witness_store is intuitive - it stores the values in the array for later use

witness_load seems a bit less intuitive to me - what about witness_assign?

store adds an entry to the lookup table, so perhaps add_to_lookup_table?

load adds a lookup to be executed, so perhaps add_lookup?

But I trust your choices, you can leave it as it is if you disagree with the proposed names.

I have one question also - you added key to the table to enable adding lookups with offset different from 0, right? As it is now (offset always 0), the table could be without key?

leolara · 2023-03-08

halo2 uses the assign verb, so I would keep using that one, if it is about assigning a witness.

Brechtpd · 2023-03-08

I named it store/load because I wanted to make clear what it does, rather than how it does it (which is more of an implementation detail). If you use it, you don't really care how it works, just that you have something that behaves like some kind of memory. Ideally this can be used by people even if they don't know (and don't really have to care) how it works. And if only the latest data needs to be loaded, it's actually possible to implement the same thing without lookups, so it could easily be switched out for another implementation with the same behavior.

> I have one question also - you added key to the table to enable adding lookups with offset different from 0, right? As it is now (offset always 0), the table could be without key?

The key is still important for the current lookup-based implementation, even when always loading the last data, to make sure the latest data is actually loaded. For example:

| row | instruction | key | memory_value |
|---|---|---|---|
| 0 | store(a) | 0 | |
| 1 | | 1 | a |
| 2 | load(key.cur(), a) | 1 | |
| 3 | | 1 | |
| 4 | store(b) | 1 | |
| 5 | load(key.cur(), b) | 2 | b |

Without the key, on row 5 for example, you could do load(a), and because a is in the lookup table it would be a valid lookup, but it shouldn't be because the latest stored value wouldn't be loaded (which is not what you'd expect from the behavior of something similar to memory).
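A toy model of this behavior, outside any circuit, might look like the following. It is only a sketch: the `Memory` struct and its methods are hypothetical stand-ins for the store/load tooling, with a `Vec` of (key, value) pairs playing the role of the lookup table.

```rust
/// Toy model of the store/load memory: the lookup table holds
/// (key, value) pairs and `key` counts the stores made so far.
#[derive(Default)]
struct Memory {
    key: u64,
    table: Vec<(u64, u64)>,
}

impl Memory {
    /// Storing bumps the key and adds (key, value) to the table.
    fn store(&mut self, value: u64) {
        self.key += 1;
        self.table.push((self.key, value));
    }

    /// Keyed load: valid only if (current key, value) is in the table,
    /// which means `value` must be the latest stored value.
    fn load(&self, value: u64) -> bool {
        self.table.contains(&(self.key, value))
    }

    /// Keyless variant, for comparison: any value ever stored is accepted.
    fn load_without_key(&self, value: u64) -> bool {
        self.table.iter().any(|&(_, v)| v == value)
    }
}
```

After `store(a)` followed by `store(b)`, the keyed `load(a)` is rejected while `load_without_key(a)` is still accepted, which is exactly the stale read that the key column rules out.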

> halo2 uses the assign verb, so I would keep using that one, if it is about assigning a witness.

Normally I very much agree, but this one is a bit non-standard because the witness generation side implements the same kind of mechanism. So if you want to load data you just call load and it will fill in the correct data automatically, while for store you have to specify the values that need to be stored.

miha-stopar · 2023-03-09

Ok, I see. Thanks!

miha-stopar · 2023-04-05T11:24:39Z

Will merge it now - I haven't gone through every detail of the refactoring yet, but it's probably better to merge it now and enable sync with master. I will finish the review in parallel with rewriting the specs.

[MPT] Selectors refactor

136f5c4

github-actions bot added the crate-zkevm-circuits label (Issues related to the zkevm-circuits workspace member) Dec 7, 2022

miha-stopar approved these changes Dec 8, 2022

More refactoring (branch)

2db02ad

Brechtpd changed the title ~~[MPT] Selectors refactor~~ [MPT] Misc refactoring Dec 13, 2022

Refactoring extension node

132b924

miha-stopar self-requested a review December 14, 2022 16:05

Brechtpd added 22 commits December 15, 2022 02:55

More extension node refactoring

22e95d9

Continue extension node refactoring

d79e616

More refactoring

eef8076

Account leaf refactoring

59f657c

Storage leaf refactoring

2b21edc

Account leaf refactoring

e00e388

refactoring storage leaf/proof chain

e2196ae

Refactor lookups/Unify config building

e7e6779

Start zero check refactoring

f351912

Misc refactoring

49b1876

Lookup optimizations/misc refactoring

0c077f6

Misc small improvements

5571974

Branch/misc improvements

d7993b6

Branch/extension improvements

2b203e5

Extension node/key improvements

cfe559a

Key/misc improvements

0e771ef

Account leaf/lookup improvements

0233dcf

Account leaf improvements

73132c1

Storage leaf key improvements

740f187

Storage leaf improvements

3fd3d1f

Split off reusable circuit tools

bb6c996

Misc key/rlc improvements

89c2203

miha-stopar reviewed Feb 14, 2023

leolara removed their request for review February 14, 2023 15:17

Storage root improvements

63ce1a9

miha-stopar reviewed Feb 15, 2023

Brechtpd added 7 commits February 16, 2023 03:16

More storage leaf improvements

4ecbd0f

Start account leaf layout refactoring

4e6cc2b

Account leaf refactor

7896566

branch refactor

1108e1e

Misc improvements

00d1a89

Misc small improvements

fb94d83

Merge pull request #7 from Brechtpd/mpt-refactor-layout

7c89e6f

MPT layout refactor

miha-stopar reviewed Mar 1, 2023

Brechtpd added 5 commits March 2, 2023 03:30

Better proof init

fb2aef7

Proof type/consistency improvements

f3a8dcc

MPT lookup table data improvements

466121d

Drifted/non-existing improvements

38838c2

Some RLP layout changes

16bec41

miha-stopar reviewed Mar 8, 2023

Brechtpd added 7 commits March 9, 2023 02:53

Fix storage value/improve byte layout

5f65f65

Split branch and extension logic

05898c0

Reduce branch data and data per row (single RLP value)

4fd1aaa

Fix branch parent check + partial RLP decoding sharing

4deaa7a

Re-enable byte range/zero checking

b42b3d5

Misc small improvements/refactoring

67d68e8

Misc cleanup

c0adb10

miha-stopar marked this pull request as ready for review April 5, 2023 11:20

miha-stopar requested a review from a team as a code owner April 5, 2023 11:20

miha-stopar merged commit 3ac930f into privacy-scaling-explorations:mpt2 Apr 5, 2023
