-
Notifications
You must be signed in to change notification settings - Fork 856
[MPT] Misc refactoring #972
[MPT] Misc refactoring #972
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful! Thanks Brecht!
Didn't see at first the PR is still a draft - I removed the approval for now, will wait till it's finished. Nonetheless, I checked the first two commits and it really looks great! |
// Currently, the extension node S and extension node C both have the same key RLC - | ||
// however, sometimes extension node can be replaced by a shorter extension node | ||
// (in terms of nibbles), this is still to be implemented. | ||
// TODO: extension nodes of different nibbles length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODOs related to the extension nodes of different lengths can be removed - some additional rows were needed to cover these cases and the constraints are implemented in these rows, no changes of the constraints in the extension node rows will be needed (the additional rows implementation is in #914).
// - hashed branch has RLP_HASH_VALUE at c_rlp2 and hash in c_advices, | ||
// - non-hashed branch has 0 at c_rlp2 and all the bytes in c_advices | ||
// TODO(Brecht): why different layout for hashed values? If for hash detection | ||
// just do == 32? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way it was easier to check whether the branch is hashed or not (just using c_rlp2 * c160_inv
). But I can change the witness generator quickly if you think it would be better to have the same layout for both cases (but more complex expression).
) -> Self { | ||
// TODO(Brecht): strangely inconsistent between storage/account (see the need of | ||
// for_placeholder_s). Something more similar to how the drifted key | ||
// works (s and c cases separately makes more sense to me). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it should be the same for storage and account (both either using S or C). It works as it is, because in the non-existing proof both (S and C) proofs are the same. But we should decide on which one to use and then be consistent.
) | ||
}; | ||
// TODO(Brecht): somehow the start index doesn't dependend on the list | ||
// is_short/is_long? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I should mention this more in the code - is_long
is not fully implemented - it's very unlikely that the number of extension node nibbles would be big enough to make the extension node longer than 55 bytes (having 32 bytes for a branch hash, that would mean around 40 nibbles (20 bytes)), so I decided not to continue with is_long
implementation, but didn't want to delete the parts that were already implemented - maybe it would be better to do so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's possible we should implement it! It'd be great if there's a test case for this so it can be tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed. It should be easier now with all the tooling you provided :). I will prepare a test and then try to prepare a PR with long
implementation.
} | ||
(ext_node_rlc.expr(), config.ext_rlp_key.num_bytes(), config.ext_is_not_hashed.expr()) | ||
}; | ||
// TODO(Brecht): why not if it's a placeholder? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Placeholder branch just fills the rows (mostly, except for the drifted_node
RLC) to preserve the parallel layout, so the hash that is needed to be checked in the proof with the placeholder branch is the hash of a leaf to be in the branch above the placeholder branch.
I would like to propose this way of querying the columns, I think it is more readable |
root_prev: values[3], | ||
root: values[4], | ||
}) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I now understand the approach with store/load
lookups. I very much like it! I would only perhaps change the names of the functions:
witness_store
is intuitive - it stores the values in the array for later usewitness_load
seems a bit less intuitive to me - what aboutwitness_assign
?store
adds an entry to the lookup table, so perhapsadd_to_lookup_table
?load
adds a lookup to be executed, so perhapsadd_lookup
?
But I trust your choices, you can leave it as it is if you disagree with the proposed names.
I have one question also - you added key
to the table to enable adding lookups with offset
different from 0, right? As it is now (offset
always 0), the table could be without key
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
halo2 uses assign verb, so I would keep using that one, if it is about assigning a witness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I named it store
/load
is because I wanted to make it clear what it does, instead of how it does it (which is more of an implementation detail). Because if you use it you don't really care how it does, just that you have something that behaves like some kind of memory. Ideally this can be used by people even if they don't even know (and don't really have to care) how it works. And if only the latest data needs to be loaded it's actually possible to implement the same thing without lookups so easily switched out with another implementation with the same behavior then.
I have one question also - you added key to the table to enable adding lookups with offset different from 0, right? As it is now (offset always 0), the table could be without key?
The key is still important for the current lookup based implementation even when always loading the last data to make sure the latest data is actually loaded. For example:
row | instruction | key | memory_value |
---|---|---|---|
0 | store(a) | 0 | |
1 | 1 | a | |
2 | load(key.cur(), a) | 1 | |
3 | 1 | ||
4 | store(b) | 1 | |
5 | load(key.cur(), b) | 2 | b |
Without the key, on row 5 for example, you could do load(a)
, and because a
is in the lookup table it would be a valid lookup, but it shouldn't be because the latest stored value wouldn't be loaded (which is not what you'd expect from the behavior of something similar to memory).
halo2 uses assign verb, so I would keep using that one, if it is about assigning a witness.
Normally very much agree, but this one is a bit non-standard because on the witness generation it implements the same kind of mechanism. So if you want to load data you actually just call load and it will fill in the correct data automatically while for store you have to specify the values that need to be stored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see. Thanks!
Will merge it now - I haven't go through every detail of the refactoring yet, but it's probably better to merge it now and enable sync with master. I will finish the review in parallel with rewriting the specs. |
Some refactoring which I believe decreases code duplication and increases code readability.
Some TODOs: