feat: implement .era2 (state snapshots) import/export from Trin Execution #1344
Conversation
I know this is still in progress and maybe you already had this in mind, but this PR is huge! Please split it up into smaller parts before sending it for review. I assume some work is already done, even if not final, and can already be reviewed, for example new structure/type definitions and helper structures/functions around them (without introducing much new logic into the existing code, if not needed).
It is hard to take this comment seriously (#1344 (comment)) when, if it should be said at all, it should be said on PR #1337. Realistically this work can be broken up into 3 PRs, and it is already in 2 (#1344 and #1337); the 3rd would be a PR for fixing sender recovery. I am not going to open a PR for review until I make a dump of state at block 15 million.
I am sure not everyone was aware this PR was already split up into 2 PRs. These PRs are drafts for a reason: they are not ready to be reviewed. They will be ready for review in a week, after that is done.
My comment wasn't specifically tied to this PR, and definitely not to this PR at its current stage (still in progress), but in general. I apologize if you were planning on doing this anyway and my feedback came at an undesired time (the PR not yet being sent for review), but I wanted to provide early feedback and avoid delays later on (I find it easier to split work while I'm working on it, not when everything is done, but that's just my personal workflow). I commented on this PR because you updated it recently. I acknowledge that you already split it up before, but I believe they should be split into even smaller PRs. And again, I apologize if this is something that you were planning on doing anyway.
I don't think it is logical to try to merge .era2 without import or export; how do we know .era2 is even useful without that? Making a PR for unused types is redundant, and I don't find value in that. I think there is a lot of value in being able to export an .era2 file and import it in the same PR that adds .era2.
The main delay in my opening these PRs for review is that I am currently executing blocks using this code. I am not really able to test .era2 until I am done executing; I have brought this up in meetings.
I feel that @morph-dev has a reasonable point here that should be highlighted. There is an often-cited study showing that code review effectiveness drops off significantly after 400 lines of code and is most effective when kept under 200 lines. These shouldn't be strict rules, of course, which is probably why the suggestion was made to extract any pieces of this (like the types) into a standalone PR. You are correct that the types aren't much use on their own. However, the act of splitting them into their own PR has value on its own, because that PR is easy to review AND it reduces the size of the final PR, which means the other members of the team will both have an easier time reviewing this code and give a higher-quality review.
@morph-dev
I am fine to split the .era2 types and import/export into 2 different PRs, but I can't do that in good faith until it is battle tested, around 4-5 days from now. The 3 PRs I made above don't make any changes to the database code, so they are fine to be reviewed right now.
I will send this back from "in review" to "in progress" since it has some conflicts to resolve.
@morph-dev this PR is ready for review
I think that leaf iteration would not work on storage tries. Did you test this code, and does it work (e.g. on block 1_000_000)?
I also think that you can utilize some functionality from `ethportal-api/src/types/state_trie/trie_traversal.rs`. It would be even better if we can move this traversal there as well (or at least partially). I have an idea on how it can be done nicely, but I have to try it out to see if it works.
trin-execution/src/cli.rs (Outdated)

```rust
}

#[derive(Args, Debug, Default, Clone, PartialEq)]
pub struct ImportState {
```
nit: Maybe `ImportStateConfig`? At least for me, `State` implies that it actually contains some state. Same below.
trin-execution/src/main.rs (Outdated)

```rust
// insert the last 256 block hashes into the database
let first_block_hash_to_add = max(
    0,
    state_importer.trin_execution.next_block_number() - BLOCKHASH_SERVE_WINDOW,
```
this would panic if `state_importer.trin_execution.next_block_number()` is less than `BLOCKHASH_SERVE_WINDOW`
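The concern can be sketched with `saturating_sub`, which clamps at zero instead of panicking on unsigned underflow. The constant value and the function name below are illustrative stand-ins, not the actual trin-execution code:

```rust
// Illustrative stand-ins for the real constant and call site.
const BLOCKHASH_SERVE_WINDOW: u64 = 256;

fn first_block_hash_to_add(next_block_number: u64) -> u64 {
    // `max(0, a - b)` does not help for unsigned integers: `a - b` itself
    // panics (in debug builds) or wraps (in release) when b > a.
    // saturating_sub clamps the result at 0 instead.
    next_block_number.saturating_sub(BLOCKHASH_SERVE_WINDOW)
}

fn main() {
    // next_block_number below the window: would underflow with plain subtraction
    assert_eq!(first_block_hash_to_add(100), 0);
    // normal case: 1000 - 256
    assert_eq!(first_block_hash_to_add(1000), 744);
    println!("ok");
}
```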
trin-execution/src/main.rs (Outdated)

```rust
let mut state_importer = StateImporter::new(trin_execution, import_state);
state_importer.import_state()?;

// insert the last 256 block hashes into the database
```
nit: I think that maybe this entire logic (of inserting 256 block headers) should live somewhere in the subcommands folder.
```rust
};

let mut storage: Vec<StorageItem> = vec![];
if account_state.storage_root != keccak256([EMPTY_STRING_CODE]) {
```
nit: shouldn't we use the `EMPTY_ROOT_HASH` constant?
```rust
}

// Get the rounded up storage count
let storage_count = storage.len() / MAX_STORAGE_ITEMS
```
you probably want to use `div_ceil`:

```rust
let storage_count = storage.len().div_ceil(MAX_STORAGE_ITEMS);
```
```rust
    storage_count: storage_count as u32,
}))?;

for _ in 0..storage_count {
```
you probably want to use `Vec::chunks` or `itertools::chunks`.
I would prefer the standard one, but I'm not sure if it can avoid copying elements; the itertools one can.
Your drain approach, while it avoids making copies, would still cause a lot of objects to move around in memory (as you are draining from the beginning of a Vec).
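A small sketch of the `chunks` suggestion, assuming an illustrative item type and chunk size. `chunks` hands out borrowed fixed-size slices without shifting the remaining elements, which is what draining from the front of a `Vec` forces on every call:

```rust
// Illustrative chunk size; the real code uses MAX_STORAGE_ITEMS from trin-execution.
const MAX_STORAGE_ITEMS: usize = 4;

// chunks() yields borrowed slices of at most MAX_STORAGE_ITEMS elements,
// with the final chunk holding whatever remains.
fn chunk_sizes(storage: &[u32]) -> Vec<usize> {
    storage.chunks(MAX_STORAGE_ITEMS).map(|chunk| chunk.len()).collect()
}

fn main() {
    let storage: Vec<u32> = (0..10).collect();
    // div_ceil gives the rounded-up chunk count, matching chunks()'s behavior
    assert_eq!(storage.len().div_ceil(MAX_STORAGE_ITEMS), 3);
    assert_eq!(chunk_sizes(&storage), vec![4, 4, 2]);
    println!("ok");
}
```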
```rust
)))?;
}

if accounts_processed % 10000 == 0 {
```
nit: maybe do this printing after you increase the counter. Or, if you do it before the increment because you want to see `0` being printed, do it at the start of the loop (but that's redundant, as you already have "Trie leaf iterator initiated" printed at the start).
After spending quite some time implementing trie traversal myself (pr/1344 branch in my trin fork), I realized that
I both exported and imported .era2 files I generated at block 1 million and 15 million, so I can factually say this code properly handles storage tries.
I believe that it works for your cases. After a bit of thinking, I now think that it's likely going to work on all blocks, but there is a possible case where it might not. And while the chances of such a case happening are very low, it's theoretically possible. Now that we are switching to the iteration done by EthTrie, this becomes irrelevant, but I still want to document the problem, as it's an important concept to understand regarding how the MPT works in general. The issue is the following: if the encoding of a node is less than 32 bytes, its parent will inline that node directly (instead of using its hash). For account tries, this can't happen, because the value in an account leaf is the RLP encoding of the account, which contains two 32-byte hashes (storage root and code hash), so the encoded leaf is always at least 32 bytes.
For storage tries, this is different, as the values that are stored can be very short (only 1 byte). But since leaf nodes include the path from the parent node, in the majority of cases the encoding of the entire leaf node will be at least 32 bytes (there are some extra bytes used by the RLP encoding itself). But if the tree is deep, this path can be small enough, resulting in the leaf node being inlined directly into its parent (instead of referenced by hash); and if the encoding of that parent is small, it can be inlined into its own parent, etc. Your code handles the case where the child of a branch node is an inlined leaf node (so I assume you encountered this in practice), but it's technically possible for the child of a branch or extension node to be another inlined branch or extension node. The chances of this happening are very, very small (because each additional node you inline adds some, potentially quite a few, extra bytes of encoding), and we might not encounter it in practice. But I would say it's an important corner case that should be covered so that we are resilient regardless of the size (depth) of the trie.
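The inlining rule described above can be sketched as follows; `NodeRef` and `node_ref` are hypothetical names for illustration, not the EthTrie API:

```rust
// Hypothetical model of the MPT node-reference rule: a parent embeds a
// child's RLP encoding inline when it is shorter than 32 bytes; otherwise
// it stores the child's 32-byte keccak hash and the node lives in the db.
enum NodeRef {
    Inline(Vec<u8>), // encoded child embedded directly inside the parent
    Hash([u8; 32]),  // 32-byte hash referencing a separately stored node
}

fn node_ref(encoded_child: Vec<u8>, child_hash: [u8; 32]) -> NodeRef {
    if encoded_child.len() < 32 {
        NodeRef::Inline(encoded_child)
    } else {
        NodeRef::Hash(child_hash)
    }
}

fn main() {
    // a short storage leaf (1-byte value, short remaining path) gets inlined
    assert!(matches!(node_ref(vec![0u8; 10], [0u8; 32]), NodeRef::Inline(_)));
    // account leaves encode two 32-byte hashes, so they are always referenced by hash
    assert!(matches!(node_ref(vec![0u8; 70], [0u8; 32]), NodeRef::Hash(_)));
    println!("ok");
}
```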
The original question was whether my code worked when I ran it, which I said it did. I didn't comment on 1-byte data being inlined because it wasn't asked about and didn't seem probable for our use case, but as you said
@morph-dev ready for another review, I believe I resolved all the PR's concerns. I also tested the changes by
Good work
@@ -24,6 +24,17 @@ impl Default for Account {

```rust
    }
}

impl From<AccountStateInfo> for Account {
```
nit: not related to this PR, but is there any particular reason why we need the `Account` type? Can't we just use `AccountState` from `ethportal_api`?
The only additional functionality that it has is `from_account_info`, which I think we don't want to add to `ethportal_api` (we don't want to add `revm_primitives` as a dependency), but considering that it's used only in one place, we can just inline it (or create a utility function).
since they are already in their own module and have non-trivial logic, I would extract the importer and exporter into separate files.
I don't particularly like that import and export are part of the main executable. It doesn't fit with the rest of the functionality in `main.rs` (I think they don't even share any command line flags).
Why not make them two separate binaries (e.g. `src/bin/import.rs` and `src/bin/export.rs`), each with their own configuration?
If you think this is a good idea but want to get this PR in asap, we can do it in a separate PR (I have some free cycles, so I can do it). What do you think?
The team has discussed this in the past, and the conclusion was to include utility sub-commands in the main binary, since then you always know the utilities work with your binary version. It is also less tedious to use them. Most execution clients I have read also follow this practice. Multiple binaries feel like a lot of work from both a user's and a developer's perspective.
```rust
if let Some(command) = trin_execution_config.command {
    match command {
        TrinExecutionSubCommands::ImportState(import_state) => {
            let mut state_importer = StateImporter::new(trin_execution, import_state);
```
it's weird to pass `trin_execution` to it. I understand that we have some useful objects there (db), but it just looks weird. If anything, it should return a `TrinExecution` object (but that doesn't make sense if we just finish after it is done).
```rust
let mut era2 = Era2::create(self.exporter_config.path_to_era2.clone(), header)?;
info!("Era2 initiated");
info!("Trie leaf iterator initiated");
let mut accounts_processed = 0;
```
nit: maybe `accounts_exported`?
```rust
    vec![]
};

let mut storage: Vec<StorageItem> = vec![];
```
nit: I don't know if this is a big problem, but I don't like that we are storing all storage items in memory instead of exporting them to the file. I don't know how big storage can get and whether this is an issue (at least on weaker machines).
I know that this is needed in order to write `storage_count` in the `AccountEntry` first. The possible solutions would be:

1. Iterate the entire trie once and count the number of items, write the `AccountEntry`, then iterate a second time and write as you go. Considering that reading from the db is most likely the slowest part, this would make export twice as slow.
2. Start processing as we do now:
   - if we finish processing the entire storage trie before reaching `MAX_STORAGE_ITEMS`, do as we do now (in which case there will be only one `StorageEntry`)
   - if we process `MAX_STORAGE_ITEMS`, stop, and do some variation of 1. from above

   This approach gives us the best performance in the majority of cases (a single `StorageEntry`) while still handling big storage tries, but it's not elegant, and it's questionable how much it would actually slow things down.

Overall, I think the current implementation is good for now, and if we notice issues with later (bigger) blocks, we can evaluate whether an improvement is needed. So no action is expected here (except if you want to create an issue to document this and provide your opinion).
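Option 2 could be sketched roughly like this; the names, the item type, and the cap are illustrative only, not the trin-execution API:

```rust
// Illustrative cap; the real code uses MAX_STORAGE_ITEMS.
const MAX_STORAGE_ITEMS: usize = 4;

enum ExportPlan {
    // the whole trie fit within the cap: emit a single StorageEntry-like chunk
    Single(Vec<u32>),
    // too big: the caller should re-iterate, now knowing the total up front
    NeedsSecondPass { total_items: usize },
}

fn plan_export<I: Iterator<Item = u32>>(mut items: I) -> ExportPlan {
    let mut buffer = Vec::new();
    while let Some(item) = items.next() {
        buffer.push(item);
        if buffer.len() > MAX_STORAGE_ITEMS {
            // stop buffering; count the rest so the total can be written first
            let total_items = buffer.len() + items.count();
            return ExportPlan::NeedsSecondPass { total_items };
        }
    }
    ExportPlan::Single(buffer)
}

fn main() {
    // small trie: fits in one pass, nothing re-read from the db
    assert!(matches!(plan_export(0..3u32), ExportPlan::Single(_)));
    // big trie: fall back to counting, then a second writing pass
    match plan_export(0..10u32) {
        ExportPlan::NeedsSecondPass { total_items } => assert_eq!(total_items, 10),
        _ => panic!("expected second pass"),
    }
    println!("ok");
}
```

The buffered items from the first pass could also be reused so only the tail of the trie is read twice.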
If slower machines can't even run an execution client, I am not sure why they should be able to generate a .era2 file. They should have a minimum of 16GB of RAM, and if that is the case there should be no problem; if they need more, they can add a swap file.
This isn't an issue; creating .era2 files is already

- expensive
- slow
- niche

99% of people won't be generating these files; they will be using the files we generated. So I don't think we should optimize for the lowest case: they can use a .era2 file we generated or use a functional machine.
```rust
// Build storage trie
let account_db = AccountDB::new(address_hash, self.trin_execution.database.db.clone());
let mut account_trie = EthTrie::new(Arc::new(account_db));
```
nit: this should be `storage_trie`
```rust
{
    account_trie
        .insert(storage_index_hash.as_slice(), &alloy_rlp::encode(value))?;
}
```
Should we call `account_trie.root_hash()?` once per `StorageEntry`, in order to commit changes to the db and lower RAM usage?
sure
```rust
// Insert contract if available
if !bytecode.is_empty() && account_state.code_hash != KECCAK_EMPTY {
    ensure!(
```
nit: With this check being inside the `if`, it's possible that we will accept an invalid era2 file, for example one where the bytecode provided with this account is empty but the code_hash is not (a serious attack vector if that code_hash is the hash of some other account's bytecode).
Therefore, I think it's better to move this check before the `if` statement. And with this check present, we don't need both conditions in the `if` statement (`!bytecode.is_empty()` is enough).
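A sketch of hoisting the integrity check outside the `if`. Here `hash` is a stdlib stand-in for `keccak256` so the example is self-contained, and `validate_account` is a hypothetical helper, not the actual trin-execution code:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for keccak256; the real code would hash with alloy's keccak256
// and compare against KECCAK_EMPTY for the empty-bytecode case.
fn hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

// Validate unconditionally, before any `if`: an entry whose bytecode is
// empty but whose code_hash claims otherwise must be rejected, which an
// in-`if` check guarded by `!bytecode.is_empty()` would silently accept.
fn validate_account(bytecode: &[u8], code_hash: u64) -> Result<(), String> {
    let expected = hash(bytecode); // hash of empty input plays the role of KECCAK_EMPTY
    if code_hash != expected {
        return Err("code_hash does not match bytecode".to_string());
    }
    Ok(())
}

fn main() {
    assert!(validate_account(&[], hash(&[])).is_ok());
    assert!(validate_account(&[1, 2], hash(&[1, 2])).is_ok());
    // empty bytecode but a code_hash pointing at someone else's code: rejected
    assert!(validate_account(&[], hash(&[9])).is_err());
    println!("ok");
}
```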
```rust
let mut era_manager = EraManager::new(first_block_hash_to_add).await?;
for block_number in first_block_hash_to_add..self.trin_execution.next_block_number() {
    let block = era_manager.get_next_block().await?;
    ensure!(
```
nit: I'm not sure how I feel about this check. On one side, it's good to double check; on the other side, we rely on `EraManager` returning blocks in consecutive order in other places.
If it were me, I would trust that `EraManager` does the correct job and just do something like this:

```rust
while era_manager.next_block_number() < self.trin_execution.next_block_number() {
    let block = era_manager.get_next_block().await?;
    self.trin_execution.database.db.put(...);
}
```

Up to you.
What was wrong?

We have `.era2` types in `e2store`, but we don't support importing or exporting these files using `trin-execution`.

How was it fixed?

By implementing a minimal trie-walking implementation which provides an API where you can loop over the leaves of a trie. Then we implement