Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[quorum store] refactor batch_reader/store and batch requester #6785

Merged
merged 1 commit into from
Mar 8, 2023

Conversation

zekun000
Copy link
Contributor

@zekun000 zekun000 commented Feb 26, 2023

This commit merges batch_reader and batch_store into a single struct and uses it as synchronous component (via function calls)
instead of asynchronous one (via channels).

Also batch_requester is refactored to use network rpc directly.

@zekun000 zekun000 added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label Feb 26, 2023
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

network_msg_rx: aptos_channel::Receiver<PeerId, VerifiedEvent>,
batch_reader_tx: Sender<BatchReaderCommand>,
batch_reader: Arc<BatchReader<NetworkSender>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gelash will be so proud

.await
.expect("could not push Batch batch_reader");
if let Ok(value) = self.batch_reader.get_batch_from_local(&request.digest()) {
let batch =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gelash will be so so proud

Batch::new(self.my_peer_id, self.epoch, request.digest(), value);
self.network_sender
.send_batch(batch, vec![request.source()])
.await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the await be a performance problem since it delays handling messages on the critical path?

.expect("could not push Batch batch_reader");
self.batch_reader
.receive_batch(batch.digest(), batch.into_payload())
.await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this await

proof.shuffled_signers(&self.validator_verifier),
tx,
)
.await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about the await. Now the execution thread is calling it. Before the execution thread can do other stuff in parallel. Do you think it is significant? For example get data fro the other batches in the meantime.


self.self_tx
.send(BatchReaderCommand::GetBatchForSelf(proof, tx))
// TODO: handle timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical logic.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

This commit merges batch_reader and batch_store into a single struct and uses it as synchronous component (via function calls)
instead of asynchronous one (via channels).

Also batch_requester is refactored to use network rpc directly.
@zekun000 zekun000 enabled auto-merge (rebase) March 8, 2023 18:29
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2023

✅ Forge suite consensus_only_perf_benchmark success on consensus_only_perf_test_f77d1d2c5a571b20989d62aaed92d96e2a45e041 ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041

Test Ok

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2023

✅ Forge suite compat success on testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041

Compatibility test results for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041 (PR)
1. Check liveness of validators at old version: testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b
compatibility::simple-validator-upgrade::liveness-check : 8085 TPS, 4735 ms latency, 7200 ms p99 latency,no expired txns
2. Upgrading first Validator to new version: f77d1d2c5a571b20989d62aaed92d96e2a45e041
compatibility::simple-validator-upgrade::single-validator-upgrade : 4722 TPS, 8601 ms latency, 12100 ms p99 latency,no expired txns
3. Upgrading rest of first batch to new version: f77d1d2c5a571b20989d62aaed92d96e2a45e041
compatibility::simple-validator-upgrade::half-validator-upgrade : 4961 TPS, 7728 ms latency, 10500 ms p99 latency,no expired txns
4. upgrading second batch to new version: f77d1d2c5a571b20989d62aaed92d96e2a45e041
compatibility::simple-validator-upgrade::rest-validator-upgrade : 7018 TPS, 5367 ms latency, 9000 ms p99 latency,no expired txns
5. check swarm health
Compatibility test for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041 passed
Test Ok

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2023

✅ Forge suite compat success on testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041

Compatibility test results for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041 (PR)
1. Check liveness of validators at old version: testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b
compatibility::simple-validator-upgrade::liveness-check : 7805 TPS, 4886 ms latency, 7300 ms p99 latency,no expired txns
2. Upgrading first Validator to new version: f77d1d2c5a571b20989d62aaed92d96e2a45e041
compatibility::simple-validator-upgrade::single-validator-upgrade : 4864 TPS, 8496 ms latency, 11700 ms p99 latency,no expired txns
3. Upgrading rest of first batch to new version: f77d1d2c5a571b20989d62aaed92d96e2a45e041
compatibility::simple-validator-upgrade::half-validator-upgrade : 5176 TPS, 7394 ms latency, 10900 ms p99 latency,no expired txns
4. upgrading second batch to new version: f77d1d2c5a571b20989d62aaed92d96e2a45e041
compatibility::simple-validator-upgrade::rest-validator-upgrade : 6834 TPS, 5685 ms latency, 10200 ms p99 latency,no expired txns
5. check swarm health
Compatibility test for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041 passed
Test Ok

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2023

✅ Forge suite land_blocking success on f77d1d2c5a571b20989d62aaed92d96e2a45e041

performance benchmark with full nodes : 5904 TPS, 6660 ms latency, 12800 ms p99 latency,(!) expired 2180 out of 2523380 txns
Test Ok

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2023

✅ Forge suite consensus_only_perf_benchmark success on consensus_only_perf_test_f77d1d2c5a571b20989d62aaed92d96e2a45e041 ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041

Test Ok

@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2023

✅ Forge suite framework_upgrade success on cb4ba0a57c998c60cbab65af31a64875d2588ca5 ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041

Compatibility test results for cb4ba0a57c998c60cbab65af31a64875d2588ca5 ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041 (PR)
Upgrade the nodes to version: f77d1d2c5a571b20989d62aaed92d96e2a45e041
framework_upgrade::framework-upgrade::full-framework-upgrade : 6984 TPS, 5666 ms latency, 8300 ms p99 latency,no expired txns
5. check swarm health
Compatibility test for cb4ba0a57c998c60cbab65af31a64875d2588ca5 ==> f77d1d2c5a571b20989d62aaed92d96e2a45e041 passed
Test Ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:build-consensus-only-image CICD:run-consensus-only-perf-test Builds consensus-only aptos-node image and uses it to run forge CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants