[Non-breaking changes] Dynamic state snapshot #379

Closed
wants to merge 94 commits into from

trinhdn2

This PR creates a secondary data structure for storing the Ethereum state, called a snapshot. This snapshot is special as it dynamically follows the chain and can also handle small-ish reorgs:

  • At the very bottom, the snapshot consists of a disk layer, which is essentially a semi-recent full flat dump of the account and storage contents. This is stored in LevelDB as a <hash> -> <account> mapping for the account trie and <account-hash><slot-hash> -> <slot-value> mapping for the storage tries. The layout permits fast iteration over the accounts and storage, which will be used for a new sync algorithm.

  • Above the disk layer there is a tree of in-memory diff layers that each represent one block's worth of state mutations. Every time a new block is processed, it is linked on top of the existing diff tree, and the bottom layers are flattened together to keep the maximum tree depth reasonable. At the very bottom, the first diff layer acts as an accumulator which only gets flattened into the disk layer when it outgrows its memory allowance. This is done mostly to avoid thrashing LevelDB.
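
To make the layering concrete, here is a minimal sketch of how a read could fall through a stack of diff layers down to the disk layer, and how two adjacent diffs could be flattened. It is not the code from this PR: the `layer` interface, the `diskLayer` and `diffLayer` types, the `flatten` helper, and the use of plain maps in place of LevelDB are all assumptions made for illustration.

```go
package main

import "fmt"

// layer is a hypothetical read interface shared by the disk and diff layers.
type layer interface {
	account(hash string) ([]byte, bool)
}

// diskLayer stands in for the flat LevelDB dump; a plain map is used here.
type diskLayer struct {
	accounts map[string][]byte
}

func (d *diskLayer) account(hash string) ([]byte, bool) {
	v, ok := d.accounts[hash]
	return v, ok
}

// diffLayer holds one block's worth of account mutations on top of a parent.
// A nil value marks an account deleted in this block (an assumption of this sketch).
type diffLayer struct {
	parent   layer
	accounts map[string][]byte
}

func (d *diffLayer) account(hash string) ([]byte, bool) {
	if v, ok := d.accounts[hash]; ok {
		return v, v != nil // found in this layer; nil means deleted
	}
	return d.parent.account(hash) // not touched here, ask the layer below
}

// flatten merges child into its parent diff and returns the combined layer.
// If the parent is already the disk layer, child is returned unchanged.
func flatten(child *diffLayer) *diffLayer {
	parent, ok := child.parent.(*diffLayer)
	if !ok {
		return child
	}
	merged := make(map[string][]byte, len(parent.accounts)+len(child.accounts))
	for k, v := range parent.accounts {
		merged[k] = v
	}
	for k, v := range child.accounts {
		merged[k] = v // newer mutations win
	}
	return &diffLayer{parent: parent.parent, accounts: merged}
}

func main() {
	disk := &diskLayer{accounts: map[string][]byte{"a": []byte("1")}}
	block1 := &diffLayer{parent: disk, accounts: map[string][]byte{"a": nil, "b": []byte("2")}}
	block2 := &diffLayer{parent: block1, accounts: map[string][]byte{"b": []byte("3")}}

	v, ok := block2.account("b")
	fmt.Printf("b at head: %s (found=%v)\n", v, ok)
	_, ok = block2.account("a")
	fmt.Printf("a at head: found=%v\n", ok)

	acc := flatten(block2)
	fmt.Printf("entries in flattened accumulator: %d\n", len(acc.accounts))
}
```

The sketch only flattens in memory; writing the accumulator into the disk layer once it exceeds its memory allowance is left out.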

The snapshot can be built fully online, during the live operation of a Geth node. This is harder than it seems because rebuilding the snapshot for mainnet takes 9 hours, by which time in-memory garbage collection will have long since deleted the state needed for a single capture.

  • The PR achieves this by gradually iterating the state tries and maintaining a marker to the account/storage slot position until which the snapshot was already generated. Every time a new block is executed, state mutations prior to the marker get applied directly (the ones afterwards get discarded) and the snapshot builder switches to iterating the new root hash.
  • To handle reorgs, the builder operates on HEAD-128 and is capable of suspending/resuming if a state is missing (a restart will only write out some tries, not all cached in memory).
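
The marker rule described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's generator: `partialSnapshot`, `generate`, and `applyBlock` are hypothetical names, keys are compared as raw bytes, the marker is treated as inclusive, and a `nil` value stands for a deletion.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// partialSnapshot is a hypothetical snapshot that is still being generated.
// Every key <= marker has already been copied from the state.
type partialSnapshot struct {
	marker []byte
	data   map[string][]byte
}

// generate copies up to n entries that lie after the marker from the given
// state (standing in for iterating the trie at some root) and advances the marker.
func (s *partialSnapshot) generate(state map[string][]byte, n int) {
	keys := make([]string, 0, len(state))
	for k := range state {
		if bytes.Compare([]byte(k), s.marker) > 0 {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	if len(keys) > n {
		keys = keys[:n]
	}
	for _, k := range keys {
		s.data[k] = state[k]
		s.marker = []byte(k)
	}
}

// applyBlock applies mutations at or before the marker directly and discards
// the rest: the generator will read those values when it reaches them while
// iterating the newer root.
func (s *partialSnapshot) applyBlock(muts map[string][]byte) (applied, skipped int) {
	for k, v := range muts {
		if bytes.Compare([]byte(k), s.marker) > 0 {
			skipped++
			continue
		}
		if v == nil {
			delete(s.data, k)
		} else {
			s.data[k] = v
		}
		applied++
	}
	return applied, skipped
}

func main() {
	oldState := map[string][]byte{"01": []byte("a"), "02": []byte("b"), "03": []byte("c")}
	snap := &partialSnapshot{data: map[string][]byte{}}
	snap.generate(oldState, 2) // covers "01" and "02"

	// A new block changes one generated key and one key not yet generated.
	applied, skipped := snap.applyBlock(map[string][]byte{"01": []byte("x"), "03": []byte("y")})
	fmt.Println("applied:", applied, "skipped:", skipped)

	// The generator switches to the new state and picks up "03" from there.
	newState := map[string][]byte{"01": []byte("x"), "02": []byte("b"), "03": []byte("y")}
	snap.generate(newState, 10)
	fmt.Printf("03 = %s\n", snap.data["03"])
}
```

Suspending and resuming when a state is missing, and the HEAD-128 offset, are not modelled here.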

The benefit of the snapshot is that it acts as an acceleration structure for state accesses:

  • Instead of doing O(log N) disk reads (+leveldb overhead) to access an account / storage slot, the snapshot can provide direct, O(1) access time. This should be a small improvement in block processing and a huge improvement in eth_call evaluations.
  • The snapshot supports account and storage iteration at O(1) complexity per entry + sequential disk access, which should enable remote nodes to retrieve state data significantly more cheaply than before (the sort order is the state trie leaf order, so responses can directly be assembled into tries too).
  • The presence of the snapshot can also enable more exotic use cases such as deleting and rebuilding the entire state trie (guerrilla pruning) as well as building alternative state tries (e.g. binary vs. hexary), which might be needed in the future.

The downside of the snapshot is that the raw account and storage data is essentially duplicated. In the case of mainnet, this means an extra 15GB of SSD space used.

Prerequisites:

Refs:

trinhdn97 and others added 30 commits June 16, 2023 14:30
@trinhdn2 trinhdn2 changed the title Ft/dynamic state snapshot Dynamic state snapshot Aug 16, 2023
@trinhdn2 trinhdn2 changed the title Dynamic state snapshot [Non-breaking changes] Dynamic state snapshot Aug 18, 2023
@trinhdn2 trinhdn2 marked this pull request as ready for review August 18, 2023 04:21
@tungng98 tungng98 deleted the branch BuildOnViction:upgrade-core-develop December 10, 2023 16:51
@tungng98 tungng98 closed this Dec 10, 2023
5 participants