
[Storage][Pruner] Split state k/v pruning to a separate pruner. #6829

Merged: 3 commits into main on Mar 6, 2023

Conversation

@grao1991 (Contributor) commented Mar 1, 2023

Description

  1. Rename state pruner to state merkle pruner.
  2. Split state k/v pruner from ledger pruner.
  3. Change StateValueWriter to use state_kv_db.
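
For readers skimming the diff, here is a rough, hypothetical sketch of the layout this split aims for. The names below (`DbPruner`, `LedgerPruner`, `StateKvPruner`, `StateMerklePruner`) are illustrative assumptions, not necessarily the exact types in this PR: three independent pruners, each owning its own database.

```rust
// Hypothetical sketch only; the real Aptos types differ in detail.
trait DbPruner {
    fn name(&self) -> &'static str;
    /// Prune data at versions at or below `target_version`.
    fn prune(&self, target_version: u64);
}

/// Prunes transactions, events, and other ledger data (ledger_db).
struct LedgerPruner;
/// Split out in this PR (conceptually): prunes stale state key/value pairs (state_kv_db).
struct StateKvPruner;
/// Renamed from "state pruner": prunes stale state merkle (Jellyfish Merkle) nodes.
struct StateMerklePruner;

impl DbPruner for LedgerPruner {
    fn name(&self) -> &'static str { "ledger_pruner" }
    fn prune(&self, target_version: u64) {
        println!("{}: pruning up to version {target_version}", self.name());
    }
}

impl DbPruner for StateKvPruner {
    fn name(&self) -> &'static str { "state_kv_pruner" }
    fn prune(&self, target_version: u64) {
        println!("{}: pruning up to version {target_version}", self.name());
    }
}

impl DbPruner for StateMerklePruner {
    fn name(&self) -> &'static str { "state_merkle_pruner" }
    fn prune(&self, target_version: u64) {
        println!("{}: pruning up to version {target_version}", self.name());
    }
}

fn main() {
    let pruners: Vec<Box<dyn DbPruner>> = vec![
        Box::new(LedgerPruner),
        Box::new(StateKvPruner),
        Box::new(StateMerklePruner),
    ];
    for p in &pruners {
        p.prune(100);
    }
}
```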

Test Plan

@grao1991 force-pushed the grao_kv_pruner branch 8 times, most recently from 2f63e52 to 2a008bc on March 1, 2023 02:14
@grao1991 marked this pull request as ready for review March 1, 2023 02:14
@grao1991 requested a review from areshand March 1, 2023 02:15
@grao1991 force-pushed the grao_kv_pruner branch 4 times, most recently from 898b3fd to 002e105 on March 1, 2023 03:20
self.state_store.clone(),
&mut batch,
)?;
LedgerPruner::prune_genesis(self.ledger_db.clone(), &mut batch)?;
Contributor

Curious why we need to prune the ledger and state k/v genesis here, while not seeing state merkle's genesis being pruned?

Contributor Author

We do. See ~7 lines below.
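
A tiny, purely illustrative sketch of what the author is saying here (the function names are made up; the real code passes a write batch plus per-database handles): all three genesis prunes are issued in the same routine, and the state merkle one simply falls outside the few lines of diff context quoted above.

```rust
// Self-contained toy, not the actual Aptos code.
fn prune_state_kv_genesis(ops: &mut Vec<&'static str>) {
    ops.push("delete state_kv genesis entries"); // the quoted lines above
}

fn prune_ledger_genesis(ops: &mut Vec<&'static str>) {
    ops.push("delete ledger genesis entries");
}

fn prune_state_merkle_genesis(ops: &mut Vec<&'static str>) {
    ops.push("delete state merkle genesis nodes"); // the call "~7 lines below"
}

fn main() {
    let mut ops = Vec::new();
    prune_state_kv_genesis(&mut ops);
    prune_ledger_genesis(&mut ops);
    prune_state_merkle_genesis(&mut ops);
    assert_eq!(ops.len(), 3);
}
```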


// Loop that does the real pruning job.
pub(crate) fn work(&self) {
    while !self.quit_worker.load(Ordering::Relaxed) {
Contributor

are there multiple threads running the same StateKvPrunerWorker?

Contributor Author

No, there is only one worker. We use an atomic variable here just because the stop_pruning signal comes from another thread (not the worker thread).
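
That pattern, as a self-contained sketch: a single worker thread polls an AtomicBool, and a different thread flips it to request shutdown. `PrunerWorker` below is only a stand-in for the real StateKvPrunerWorker.

```rust
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};
use std::thread;
use std::time::Duration;

// Stand-in for the real worker: one worker thread, one shared quit flag.
struct PrunerWorker {
    quit_worker: Arc<AtomicBool>,
}

impl PrunerWorker {
    // Loop that does the (fake) pruning job, mirroring the quoted work().
    fn work(&self) {
        while !self.quit_worker.load(Ordering::Relaxed) {
            // ... one round of pruning would go here ...
            thread::sleep(Duration::from_millis(10));
        }
    }
}

fn main() {
    let quit = Arc::new(AtomicBool::new(false));
    let worker = PrunerWorker { quit_worker: Arc::clone(&quit) };

    // Only one thread ever runs work(); the atomic exists because the stop
    // signal below is issued from a different thread (here, main).
    let handle = thread::spawn(move || worker.work());

    thread::sleep(Duration::from_millis(50));
    quit.store(true, Ordering::Relaxed); // the "stop_pruning" signal
    handle.join().unwrap();
}
```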

@grao1991 requested a review from areshand March 1, 2023 07:07
Comment on lines 176 to 177
/// Similar to the variable above but for state store pruner. It means the number of stale
/// nodes to prune a time.
Contributor

update comment

Contributor Author

Done

Comment on lines 156 to 159
assert_eq!(peers_with_ready_subscriptions, vec![(
peer_network_1,
synced_ledger_info
)]);
Contributor

hmm.. nit: previous commit needs cargo +nightly fmt

Contributor Author

Done

Comment on lines -443 to -436
if pruned {
    assert!(state_store.get_usage(Some(version)).is_err())
} else {
    assert!(state_store.get_usage(Some(version)).is_ok())
}
Contributor

is this still verified somewhere?

Contributor Author

Added several lines in the txn store pruner test to check this.
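
For context, a hedged sketch of the kind of assertion that moved into the txn store pruner test. `check_usage_pruned` and the toy closure are made-up stand-ins, not the real test helpers: after pruning, usage lookups below the minimum readable version should fail, and lookups at retained versions should still succeed.

```rust
// Hypothetical illustration of the relocated check, not the actual test code.
fn check_usage_pruned(
    get_usage: impl Fn(u64) -> Result<u64, String>,
    min_readable_version: u64,
    max_version: u64,
) {
    for version in 0..=max_version {
        if version < min_readable_version {
            assert!(get_usage(version).is_err(), "usage at {version} should be pruned");
        } else {
            assert!(get_usage(version).is_ok(), "usage at {version} should be readable");
        }
    }
}

fn main() {
    // Toy "store": versions below 5 have been pruned.
    let get_usage = |v: u64| -> Result<u64, String> {
        if v < 5 { Err("pruned".to_string()) } else { Ok(0) }
    };
    check_usage_pruned(get_usage, 5, 10);
}
```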

if index.stale_since_version > target_version {
    break;
}
// Prune the stale state value index itself first.
Contributor

"first"? they are the same batch, right?

Contributor Author

Done...

@lightmark originally wrote this LOL
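
To make the reviewer's point concrete: both deletes are staged into one write batch and committed atomically, so "first" carries no ordering meaning. A toy sketch follows; this `SchemaBatch` is a stand-in, not the real Aptos schema batch type.

```rust
// Toy stand-in for a write batch: deletes are only staged here and applied
// atomically in one commit, so their staging order carries no semantics.
#[derive(Default)]
struct SchemaBatch {
    deletes: Vec<(&'static str, String)>,
}

impl SchemaBatch {
    fn delete(&mut self, cf: &'static str, key: String) {
        self.deletes.push((cf, key));
    }
}

fn prune_stale_entry(batch: &mut SchemaBatch, stale_since_version: u64, state_key: &str) {
    // Stage the delete of the stale index entry...
    batch.delete("stale_state_value_index", format!("{stale_since_version}/{state_key}"));
    // ...and of the stale value itself. Both land in the same batch, which is
    // committed atomically, hence the comment under discussion was misleading.
    batch.delete("state_value", state_key.to_string());
}

fn main() {
    let mut batch = SchemaBatch::default();
    prune_stale_entry(&mut batch, 42, "0xcafe");
    assert_eq!(batch.deletes.len(), 2);
}
```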

@grao1991 force-pushed the grao_kv_pruner branch 3 times, most recently from 78fe983 to 3bd0238 on March 4, 2023 00:05
@grao1991 requested a review from msmouse March 4, 2023 00:06
let usage = StateStorageUsage::zero();
batch
    .put::<VersionDataSchema>(&i, &usage.into())
    .expect("Must succeed.");
Contributor

🤣 come on, it's a test, just unwrap()

Contributor Author

I felt it was weird when writing expect, but my brain was stuck at the time and unwrap just didn't come to mind. 🤣

@grao1991 enabled auto-merge (squash) March 6, 2023 21:53
github-actions bot commented Mar 6, 2023

✅ Forge suite framework_upgrade success on cb4ba0a57c998c60cbab65af31a64875d2588ca5 ==> ddec83654581bf351bc472a92a3be6d3a3db6338

Compatibility test results for cb4ba0a57c998c60cbab65af31a64875d2588ca5 ==> ddec83654581bf351bc472a92a3be6d3a3db6338 (PR)
Upgrade the nodes to version: ddec83654581bf351bc472a92a3be6d3a3db6338
framework_upgrade::framework-upgrade::full-framework-upgrade : 7170 TPS, 5382 ms latency, 8000 ms p99 latency, no expired txns
5. check swarm health
Compatibility test for cb4ba0a57c998c60cbab65af31a64875d2588ca5 ==> ddec83654581bf351bc472a92a3be6d3a3db6338 passed
Test Ok

github-actions bot commented Mar 6, 2023

✅ Forge suite land_blocking success on ddec83654581bf351bc472a92a3be6d3a3db6338

performance benchmark with full nodes : 5839 TPS, 6769 ms latency, 12200 ms p99 latency, (!) expired 1140 out of 2494560 txns
Test Ok

github-actions bot commented Mar 6, 2023

✅ Forge suite compat success on testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> ddec83654581bf351bc472a92a3be6d3a3db6338

Compatibility test results for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> ddec83654581bf351bc472a92a3be6d3a3db6338 (PR)
1. Check liveness of validators at old version: testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b
compatibility::simple-validator-upgrade::liveness-check : 8226 TPS, 4640 ms latency, 6700 ms p99 latency, no expired txns
2. Upgrading first Validator to new version: ddec83654581bf351bc472a92a3be6d3a3db6338
compatibility::simple-validator-upgrade::single-validator-upgrade : 4946 TPS, 8036 ms latency, 10700 ms p99 latency, no expired txns
3. Upgrading rest of first batch to new version: ddec83654581bf351bc472a92a3be6d3a3db6338
compatibility::simple-validator-upgrade::half-validator-upgrade : 4802 TPS, 8252 ms latency, 10900 ms p99 latency, no expired txns
4. Upgrading second batch to new version: ddec83654581bf351bc472a92a3be6d3a3db6338
compatibility::simple-validator-upgrade::rest-validator-upgrade : 7156 TPS, 5303 ms latency, 9800 ms p99 latency, no expired txns
5. check swarm health
Compatibility test for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> ddec83654581bf351bc472a92a3be6d3a3db6338 passed
Test Ok

@grao1991 merged commit e41788c into main Mar 6, 2023
@grao1991 deleted the grao_kv_pruner branch March 6, 2023 23:30