[dag] broadcast CertifiedNodeMsg with LedgerInfo #9968

Merged: 4 commits, Sep 11, 2023

@ibalajiarun ibalajiarun commented Sep 7, 2023

Description

This PR broadcasts the latest ledger info together with the CertifiedNode so that peers can use it for state sync. The ledger info is obtained from the storage adapter.
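As a rough illustration of the idea, the sketch below pairs a certified node with the sender's latest ledger info in one message. The types here are simplified stand-ins and not the definitions from this PR: the field layouts, the `is_behind` helper, and the comparison rule are assumptions made for the example.

```rust
// Simplified stand-ins (assumptions): the real types carry signatures,
// digests, and more metadata than shown here.
#[derive(Clone, Debug, PartialEq)]
pub struct CertifiedNode {
    pub epoch: u64,
    pub round: u64,
}

#[derive(Clone, Debug, PartialEq)]
pub struct LedgerInfo {
    pub epoch: u64,
    pub version: u64,
}

/// Hypothetical wire message: a certified node plus the sender's latest
/// committed ledger info.
#[derive(Clone, Debug, PartialEq)]
pub struct CertifiedNodeMessage {
    certified_node: CertifiedNode,
    ledger_info: LedgerInfo,
}

impl CertifiedNodeMessage {
    pub fn new(certified_node: CertifiedNode, ledger_info: LedgerInfo) -> Self {
        Self {
            certified_node,
            ledger_info,
        }
    }

    pub fn certified_node(&self) -> &CertifiedNode {
        &self.certified_node
    }

    pub fn ledger_info(&self) -> &LedgerInfo {
        &self.ledger_info
    }
}

/// Hypothetical helper: a receiver compares the attached ledger info with its
/// own to decide whether it has fallen behind and may need to state sync.
/// Tuples compare lexicographically, so epoch is checked before version.
pub fn is_behind(local: &LedgerInfo, msg: &CertifiedNodeMessage) -> bool {
    let remote = msg.ledger_info();
    (remote.epoch, remote.version) > (local.epoch, local.version)
}
```

With these stand-ins, a receiver at version 90 that gets a message carrying version 100 in the same epoch would see `is_behind` return `true`.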

Test Plan

Fixed unit tests.

@ibalajiarun ibalajiarun force-pushed the balaji/dag-state-sync branch 2 times, most recently from fab0ad8 to 6dd6796 Compare September 7, 2023 23:41
Base automatically changed from balaji/dag-state-sync to main September 8, 2023 14:45
@ibalajiarun ibalajiarun force-pushed the balaji/bcast-certified-node-msg branch from a5d3321 to 89c4dec Compare September 8, 2023 16:08
@ibalajiarun ibalajiarun marked this pull request as ready for review September 8, 2023 16:10
@@ -151,12 +155,19 @@ impl DagDriver {
        let signature_builder =
            SignatureBuilder::new(node.metadata().clone(), self.epoch_state.clone());
        let cert_ack_set = CertificateAckState::new(self.epoch_state.verifier.len());
        let latest_ledger_info = self
Contributor

Carrying a DB here is a bit ugly. I was thinking about caching the ledger info in the notifier adapter and relying on a callback to update it. WDYT?

Contributor Author

I think we need the StorageAdapter here anyway, right? We can just use that instead, and cache within the storage adapter if necessary.

Contributor

Sounds better. Just expose a function on DAGStorage?
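One way the outcome of this thread could look is sketched below: the storage trait exposes a getter, and the adapter caches the value and refreshes it from a commit callback. Everything here is hypothetical, including the `get_latest_ledger_info` method name, the `LedgerDb` trait, the `on_commit` callback, and the `InMemoryDb` double. It is not the code merged in this PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

// Simplified stand-in for the real ledger info type (assumption).
#[derive(Clone, Debug, PartialEq)]
pub struct LedgerInfo {
    pub epoch: u64,
    pub version: u64,
}

/// Hypothetical slice of the storage trait: the driver only sees this getter
/// and never holds a DB handle itself.
pub trait DAGStorage {
    fn get_latest_ledger_info(&self) -> LedgerInfo;
}

/// Hypothetical abstraction over the underlying database.
pub trait LedgerDb {
    fn read_latest_ledger_info(&self) -> LedgerInfo;
}

/// Adapter that owns the DB handle and caches the latest ledger info.
pub struct StorageAdapter<D: LedgerDb> {
    db: D,
    cached: RwLock<Option<LedgerInfo>>,
}

impl<D: LedgerDb> StorageAdapter<D> {
    pub fn new(db: D) -> Self {
        Self {
            db,
            cached: RwLock::new(None),
        }
    }

    pub fn db(&self) -> &D {
        &self.db
    }

    /// Callback to invoke when a new ledger info is committed, so the cache
    /// stays fresh without another DB read.
    pub fn on_commit(&self, committed: LedgerInfo) {
        *self.cached.write().unwrap() = Some(committed);
    }
}

impl<D: LedgerDb> DAGStorage for StorageAdapter<D> {
    fn get_latest_ledger_info(&self) -> LedgerInfo {
        // Clone out of the lock so the read guard is released before any write.
        let cached = self.cached.read().unwrap().clone();
        if let Some(info) = cached {
            return info;
        }
        // Cache miss: fall back to the DB once and remember the result.
        let info = self.db.read_latest_ledger_info();
        *self.cached.write().unwrap() = Some(info.clone());
        info
    }
}

/// In-memory DB double that counts reads, for illustration only.
pub struct InMemoryDb {
    info: LedgerInfo,
    reads: AtomicUsize,
}

impl InMemoryDb {
    pub fn new(info: LedgerInfo) -> Self {
        Self {
            info,
            reads: AtomicUsize::new(0),
        }
    }

    pub fn reads(&self) -> usize {
        self.reads.load(Ordering::SeqCst)
    }
}

impl LedgerDb for InMemoryDb {
    fn read_latest_ledger_info(&self) -> LedgerInfo {
        self.reads.fetch_add(1, Ordering::SeqCst);
        self.info.clone()
    }
}
```

The trade-off in this sketch is that the cache is only as fresh as the last `on_commit` call, so whoever commits must invoke the callback reliably.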

Contributor

@sasha8 sasha8 left a comment

LGTM

@@ -92,7 +92,7 @@ impl NetworkHandler {
                 .map(|r| r.into()),
             DAGMessage::CertifiedNodeMsg(node) => node
                 .verify(&self.epoch_state.verifier)
-                .and_then(|_| self.dag_driver.process(node))
+                .and_then(|_| self.dag_driver.process(node.certified_node()))
Contributor

@sasha8 sasha8 Sep 8, 2023

It is a bit confusing that the node has certified_node that has a node...
Maybe change the name of the variable?
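A minimal sketch of the suggested rename follows, using simplified stand-in types. The binding name `certified_node_msg`, the boolean `valid` flag standing in for signature verification, and the free `process` function are all assumptions for illustration.

```rust
// Simplified stand-in for the real certified node type (assumption).
#[derive(Debug, PartialEq)]
pub struct CertifiedNode {
    pub round: u64,
}

/// Hypothetical message wrapper. `valid` stands in for real signature checks.
pub struct CertifiedNodeMessage {
    certified_node: CertifiedNode,
    valid: bool,
}

impl CertifiedNodeMessage {
    pub fn new(certified_node: CertifiedNode, valid: bool) -> Self {
        Self {
            certified_node,
            valid,
        }
    }

    pub fn certified_node(&self) -> &CertifiedNode {
        &self.certified_node
    }

    pub fn verify(&self) -> Result<(), String> {
        if self.valid {
            Ok(())
        } else {
            Err("invalid certified node message".to_string())
        }
    }
}

pub enum DAGMessage {
    CertifiedNodeMsg(CertifiedNodeMessage),
}

/// Stand-in for the driver's processing step; returns the processed round.
pub fn process(node: &CertifiedNode) -> Result<u64, String> {
    Ok(node.round)
}

pub fn handle(msg: DAGMessage) -> Result<u64, String> {
    match msg {
        // Binding the payload as `certified_node_msg` rather than `node`
        // makes it clear that the message wraps a certified node.
        DAGMessage::CertifiedNodeMsg(certified_node_msg) => certified_node_msg
            .verify()
            .and_then(|_| process(certified_node_msg.certified_node())),
    }
}
```

Only the binding name changes relative to the diff above. The verify-then-process order is unchanged.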

@ibalajiarun ibalajiarun force-pushed the balaji/bcast-certified-node-msg branch from 1db0e3f to a2a16b5 Compare September 11, 2023 15:32
@ibalajiarun ibalajiarun enabled auto-merge (squash) September 11, 2023 19:01
@ibalajiarun ibalajiarun enabled auto-merge (squash) September 11, 2023 19:35
@github-actions
Contributor

✅ Forge suite compat success on aptos-node-v1.6.2 ==> 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a

Compatibility test results for aptos-node-v1.6.2 ==> 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a (PR)
1. Check liveness of validators at old version: aptos-node-v1.6.2
compatibility::simple-validator-upgrade::liveness-check : committed: 4557 txn/s, latency: 7225 ms, (p50: 7100 ms, p90: 10200 ms, p99: 12900 ms), latency samples: 173200
2. Upgrading first Validator to new version: 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1639 txn/s, latency: 16465 ms, (p50: 17500 ms, p90: 22200 ms, p99: 43100 ms), latency samples: 86900
3. Upgrading rest of first batch to new version: 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1381 txn/s, latency: 16665 ms, (p50: 18800 ms, p90: 22300 ms, p99: 38700 ms), latency samples: 92540
4. upgrading second batch to new version: 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3443 txn/s, latency: 9140 ms, (p50: 10200 ms, p90: 12600 ms, p99: 13300 ms), latency samples: 137740
5. check swarm health
Compatibility test for aptos-node-v1.6.2 ==> 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a passed
Test Ok

@github-actions
Contributor

✅ Forge suite realistic_env_max_load success on 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a

two traffics test: inner traffic : committed: 6359 txn/s, latency: 6123 ms, (p50: 6000 ms, p90: 7500 ms, p99: 11400 ms), latency samples: 2785260
two traffics test : committed: 100 txn/s, latency: 3224 ms, (p50: 3100 ms, p90: 4000 ms, p99: 6000 ms), latency samples: 1840
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.277, avg: 0.218", "QsPosToProposal: max: 0.155, avg: 0.147", "ConsensusProposalToOrdered: max: 0.610, avg: 0.573", "ConsensusOrderedToCommit: max: 0.509, avg: 0.492", "ConsensusProposalToCommit: max: 1.099, avg: 1.066"]
Max round gap was 1 [limit 4] at version 820659. Max no progress secs was 3.576528 [limit 10] at version 2824386.
Test Ok

@github-actions
Contributor

✅ Forge suite framework_upgrade success on aptos-node-v1.5.1 ==> 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a

Compatibility test results for aptos-node-v1.5.1 ==> 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a (PR)
Upgrade the nodes to version: 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 5132 txn/s, latency: 6290 ms, (p50: 6600 ms, p90: 9600 ms, p99: 11700 ms), latency samples: 189920
5. check swarm health
Compatibility test for aptos-node-v1.5.1 ==> 1b28c10ad0fdcebc681546ea1e2dbbbfa29c5c2a passed
Test Ok
