feat(pruning): prune ChangeSets & History during pipeline #3728

Merged
merged 29 commits into from
Jul 31, 2023

joshieDo
Collaborator

@joshieDo joshieDo commented Jul 11, 2023

Resolves #3433
Resolves #3434

  • Adds account and storage changeset pruning during the execution stage, if configured. When it is configured, the AccountHistory and StorageHistory stages prune as well.
  • If pruning is enabled, the hashing and merkle stages might have to run from scratch on subsequent pipeline runs when there are not enough changesets. (more context)

Full pruning is not supported and produces an error when the configuration is read. The minimum distance is 64 blocks.
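
The configuration rule above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `PruneMode`, `ConfigError`, `MIN_DISTANCE` and `validate_history_prune_mode` are hypothetical names chosen for the example, and only the two constraints stated above (no full pruning, distance of at least 64 blocks) are modelled.

```rust
/// Hypothetical constant, taken from the "minimum 64 block distance" note.
const MIN_DISTANCE: u64 = 64;

/// Hypothetical prune mode for changesets and history.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PruneMode {
    /// Prune everything (rejected for changesets and history in this sketch).
    Full,
    /// Keep only the last `n` blocks.
    Distance(u64),
}

/// Hypothetical error type returned while reading the configuration.
#[derive(Debug, PartialEq, Eq)]
enum ConfigError {
    FullNotSupported,
    DistanceTooSmall { got: u64, min: u64 },
}

/// Rejects full pruning and distances below the minimum; passes anything else through.
fn validate_history_prune_mode(mode: PruneMode) -> Result<PruneMode, ConfigError> {
    match mode {
        PruneMode::Full => Err(ConfigError::FullNotSupported),
        PruneMode::Distance(d) if d < MIN_DISTANCE => {
            Err(ConfigError::DistanceTooSmall { got: d, min: MIN_DISTANCE })
        }
        other => Ok(other),
    }
}
```

Under these assumptions, `Distance(64)` is accepted while `Full` and `Distance(10)` are rejected at configuration time, before the pipeline starts.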

@joshieDo joshieDo changed the title from "pruning: prune ChangeSets & History during pipeline" to "feat(pruning): prune ChangeSets & History during pipeline" Jul 11, 2023
@joshieDo joshieDo added the C-enhancement (New feature or request) and A-db (Related to the database) labels Jul 11, 2023
@codecov

codecov bot commented Jul 11, 2023

Codecov Report

Merging #3728 (c7d83ec) into main (1ac2f15) will increase coverage by 0.00%.
The diff coverage is 82.71%.

Files Changed Coverage Δ
bin/reth/src/node/mod.rs 11.89% <0.00%> (-0.10%) ⬇️
bin/reth/src/stage/dump/merkle.rs 0.00% <0.00%> (ø)
bin/reth/src/stage/run.rs 1.28% <0.00%> (-0.13%) ⬇️
crates/primitives/src/lib.rs 100.00% <ø> (ø)
crates/prune/src/error.rs 0.00% <ø> (ø)
crates/stages/src/error.rs 83.33% <ø> (ø)
crates/primitives/src/prune/part.rs 75.00% <33.33%> (-25.00%) ⬇️
crates/stages/src/stages/hashing_account.rs 97.87% <87.50%> (+0.08%) ⬆️
crates/stages/src/stages/merkle.rs 83.06% <87.87%> (+0.74%) ⬆️
crates/stages/src/stages/hashing_storage.rs 96.94% <90.47%> (+0.11%) ⬆️
... and 7 more

... and 12 files with indirect coverage changes

Flag Coverage Δ
integration-tests 16.25% <0.00%> (-0.05%) ⬇️
unit-tests 64.22% <82.71%> (+0.01%) ⬆️

Components Coverage Δ
reth binary 25.76% <0.00%> (-0.15%) ⬇️
blockchain tree 83.04% <ø> (ø)
pipeline 90.03% <95.67%> (+0.19%) ⬆️
storage (db) 74.30% <100.00%> (+0.02%) ⬆️
trie 94.70% <ø> (ø)
txpool 45.40% <ø> (-0.61%) ⬇️
networking 77.64% <ø> (-0.06%) ⬇️
rpc 58.52% <ø> (-0.02%) ⬇️
consensus 63.51% <ø> (ø)
revm 33.10% <ø> (ø)
payload builder 6.58% <ø> (ø)
primitives 88.05% <81.81%> (+0.11%) ⬆️

@joshieDo joshieDo marked this pull request as ready for review July 16, 2023 16:47
@joshieDo
Collaborator Author

since there was some overlap, should #3733 be pushed first

@joshieDo joshieDo marked this pull request as draft July 25, 2023 18:21
@joshieDo joshieDo marked this pull request as ready for review July 25, 2023 22:36
Collaborator

@shekhirin shekhirin left a comment

LGTM overall, nits. Appreciate the tests!

crates/primitives/src/prune/target.rs (resolved)
crates/stages/src/stages/index_account_history.rs (outdated, resolved)
crates/stages/src/stages/mod.rs (resolved)
crates/stages/src/stages/index_account_history.rs (outdated, resolved)
@joshieDo joshieDo added this pull request to the merge queue Jul 26, 2023
@joshieDo joshieDo removed this pull request from the merge queue due to a manual request Jul 26, 2023
@joshieDo
Collaborator Author

joshieDo commented Jul 28, 2023

With this PR, lower-end machines might not be able to catch up if their pruning is very aggressive. The pipeline re-run that follows the first sync triggers the hashing and merkle stages to run from scratch, since there might not be enough changesets. If that run cannot finish before 64 blocks have been merged, the same behaviour repeats on the next run.

A follow-up PR will handle that by temporarily not pruning as many changesets, so we can leverage the incremental branches of these stages. It requires a few too many changes for this PR.
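
To make the loop described above concrete, here is a toy model in Rust. It is only a sketch under simplifying assumptions, not the pipeline's actual logic: `RunKind`, `next_run_kind` and `simulate_runs` are hypothetical names, the cost of a run is expressed as the number of new blocks that arrive while it executes, and a run is assumed to be incremental only if no more than `prune_distance` blocks arrived during the previous run.

```rust
/// Hypothetical classification of a hashing/merkle run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RunKind {
    FromScratch,
    Incremental,
}

/// Assumption: changesets are kept only for the last `prune_distance` blocks,
/// so a run that has to cover more new blocks than that cannot be incremental.
fn next_run_kind(blocks_arrived: u64, prune_distance: u64) -> RunKind {
    if blocks_arrived > prune_distance {
        RunKind::FromScratch
    } else {
        RunKind::Incremental
    }
}

/// Simulates `runs` consecutive pipeline runs after the initial sync.
/// `rebuild_cost_blocks` and `incremental_cost_blocks` are the number of new
/// blocks that arrive while a from-scratch or incremental run is executing.
fn simulate_runs(
    rebuild_cost_blocks: u64,
    incremental_cost_blocks: u64,
    prune_distance: u64,
    runs: usize,
) -> Vec<RunKind> {
    let mut kinds = Vec::with_capacity(runs);
    // The first re-run after the initial sync is assumed to start from scratch.
    let mut kind = RunKind::FromScratch;
    for _ in 0..runs {
        kinds.push(kind);
        let arrived = match kind {
            RunKind::FromScratch => rebuild_cost_blocks,
            RunKind::Incremental => incremental_cost_blocks,
        };
        kind = next_run_kind(arrived, prune_distance);
    }
    kinds
}
```

In this model, a machine on which 100 blocks arrive during a from-scratch run never leaves the `FromScratch` state with a distance of 64, whereas one on which only 40 blocks arrive switches to `Incremental` on the second run and stays there.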

Collaborator

@mattsse mattsse left a comment

nit re duplicate condition,

otherwise lgtm

crates/primitives/src/prune/part.rs (outdated, resolved)

  /// Part of the data that can be pruned.
  #[main_codec]
- #[derive(Debug, Clone, Copy, Eq, PartialEq, Ord, PartialOrd)]
+ #[derive(Debug, Display, Clone, Copy, Eq, PartialEq, Ord, PartialOrd)]
  pub enum PrunePart {
Collaborator

@shekhirin what do you think about renaming this to PruneStep?

Collaborator

Personally, "Step" sounds like a consecutive action like "Stage" in the context of pipeline, while in pruning you can enable different parts and it doesn't really matter in which order and composition they are executed. But I do agree that "Part" is weird too.

Collaborator

maybe PruneComponent?

Comment on lines +165 to +166
if to_block - from_block > self.clean_threshold || from_block == 1 || !has_enough_changesets
{
Collaborator

We have the same check twice.

Is it worth moving it to a separate function and documenting it properly?
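
One way the suggested extraction could look is sketched below. This is an illustration only, not the change that was merged: `should_rebuild_from_scratch` is a hypothetical name, the parameters mirror the identifiers in the quoted condition, and `saturating_sub` replaces the plain subtraction so the helper cannot underflow if `to_block` is below `from_block`.

```rust
/// Hypothetical helper for the condition quoted above.
///
/// Returns `true` when the stage should clear its tables and rebuild from
/// scratch instead of taking the incremental branch, which is the case if:
/// - the block range is larger than `clean_threshold`,
/// - the stage starts at block 1, or
/// - pruning left too few changesets to process the range incrementally.
fn should_rebuild_from_scratch(
    from_block: u64,
    to_block: u64,
    clean_threshold: u64,
    has_enough_changesets: bool,
) -> bool {
    // Assumption: saturating subtraction is used here purely as a safety net;
    // the original condition uses `to_block - from_block`.
    to_block.saturating_sub(from_block) > clean_threshold
        || from_block == 1
        || !has_enough_changesets
}
```

Both call sites could then share the helper, so the three cases are documented in one place.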

@joshieDo joshieDo added this pull request to the merge queue Jul 31, 2023
Merged via the queue into main with commit 134fe81 Jul 31, 2023
@joshieDo joshieDo deleted the pruning/hist branch July 31, 2023 14:46
Development

Successfully merging this pull request may close these issues.

Respect Storage History prune part in the pipeline
Respect Account History prune part in the pipeline
3 participants