
[Quorum Store] Implementation of quorum store components #6055

Merged
merged 395 commits on Feb 9, 2023

Conversation


@bchocho bchocho commented Jan 3, 2023

Description

Implementation of Quorum Store. See component diagram at https://drive.google.com/file/d/1Vu3G_z6zOueljBnnPLZp4VIZp4Oo_AwQ/view?usp=sharing

Quorum store is still disabled by default (via onchain config).

Test Plan

All tests pass without quorum store enabled.

With quorum store enabled (by hard-coding the onchain config; see the sketch after the lists below):

  • All unit tests pass, except the twins tests. It's tricky to make twins run in both modes; we can clean this up when quorum store becomes the default.
  • All smoke tests pass, except test_upgrade_flow. This needs some investigation.
  • Land-blocking forge tests pass.
  • Consensus-only forge tests pass.

Before enabling quorum store, we additionally need to:

  • Pass all non-flaky forge tests with quorum store = true
  • Add new tests:
    • Smoke test for onchain config flip
    • Forge test for onchain config flip
    • New failpoint tests
    • Smoke/forge test for a malicious node running a different quorum store mode
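For context on "hard-coding the onchain config" above: a minimal, hypothetical sketch of what flipping the flag in a test build could look like. The struct and field names below are illustrative placeholders, not the exact aptos-core types.

```rust
// Hypothetical sketch only: the real flag lives in the onchain consensus
// config; these names are placeholders for illustration.
#[derive(Clone, Debug)]
pub struct OnChainConsensusConfig {
    pub quorum_store_enabled: bool,
}

impl Default for OnChainConsensusConfig {
    fn default() -> Self {
        // Quorum store stays disabled by default in this PR.
        Self { quorum_store_enabled: false }
    }
}

fn main() {
    // Hard-code the flag to exercise the quorum store code paths in tests.
    let config = OnChainConsensusConfig { quorum_store_enabled: true };
    assert!(config.quorum_store_enabled);
}
```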


github-actions bot commented Feb 7, 2023

✅ Forge suite land_blocking success on fc7cfc26d2e2e7204fbbb65e2769a08558168242

performance benchmark with full nodes: 5888 TPS, 6709 ms latency, 10500 ms p99 latency, (!) expired 700 out of 2514960 txns
Test Ok


github-actions bot commented Feb 7, 2023

✅ Forge suite compat success on testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> fc7cfc26d2e2e7204fbbb65e2769a08558168242

Compatibility test results for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> fc7cfc26d2e2e7204fbbb65e2769a08558168242 (PR)
1. Check liveness of validators at old version: testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b
compatibility::simple-validator-upgrade::liveness-check : 7783 TPS, 4948 ms latency, 7200 ms p99 latency, no expired txns
2. Upgrading first Validator to new version: fc7cfc26d2e2e7204fbbb65e2769a08558168242
compatibility::simple-validator-upgrade::single-validator-upgrade : 5227 TPS, 7602 ms latency, 9300 ms p99 latency, no expired txns
3. Upgrading rest of first batch to new version: fc7cfc26d2e2e7204fbbb65e2769a08558168242
compatibility::simple-validator-upgrade::half-validator-upgrade : 4520 TPS, 8894 ms latency, 11100 ms p99 latency, no expired txns
4. Upgrading second batch to new version: fc7cfc26d2e2e7204fbbb65e2769a08558168242
compatibility::simple-validator-upgrade::rest-validator-upgrade : 6873 TPS, 5628 ms latency, 11300 ms p99 latency, no expired txns
5. Check swarm health
Compatibility test for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> fc7cfc26d2e2e7204fbbb65e2769a08558168242 passed
Test Ok


github-actions bot commented Feb 7, 2023

✅ Forge suite consensus_only_perf_benchmark success on consensus_only_perf_test_fc7cfc26d2e2e7204fbbb65e2769a08558168242 ==> fc7cfc26d2e2e7204fbbb65e2769a08558168242

Test Ok

@igor-aptos igor-aptos self-assigned this Feb 8, 2023
Contributor

@igor-aptos igor-aptos left a comment


Overall, looks good! I've made a bunch of comments/questions/suggestions inline; even if you want to address some of them, you can do so in a follow-up PR. I'm accepting since this one is huge and there is nothing important that is off.

@@ -48,10 +49,10 @@ pub struct ChainHealthBackoffValues {
 impl Default for ConsensusConfig {
     fn default() -> ConsensusConfig {
         ConsensusConfig {
-            max_sending_block_txns: 2500,
+            max_sending_block_txns: 4000,
Contributor

Yeah, we cannot really change these now.

Either:

  • keep them as is, and then increase them in the next release once QS is enabled in main
  • or have new fields (qs_max_sending_block_txns, and others) in the interim, before the cleanup (see the sketch below)
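A rough sketch of the second option, with interim qs_-prefixed fields. The values come from this diff; the field set and names are otherwise illustrative, not the actual ConsensusConfig.

```rust
// Sketch of interim qs_-prefixed config fields; not the actual ConsensusConfig.
pub struct ConsensusConfig {
    pub max_sending_block_txns: u64,    // unchanged while QS is off
    pub qs_max_sending_block_txns: u64, // picked up once quorum store is enabled
}

impl Default for ConsensusConfig {
    fn default() -> Self {
        Self {
            max_sending_block_txns: 2500,
            qs_max_sending_block_txns: 4000,
        }
    }
}
```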


#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[serde(default, deny_unknown_fields)]
pub struct QuorumStoreConfig {
Contributor

Can you add some more comments explaining what these constants are and what they refer to? For example, which channel is "channel_size" here?
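For example, the kind of documentation being asked for might look like this. Only channel_size comes from this thread; the second field, the Default derive, and the exact wording are illustrative, not the actual config.

```rust
use serde::{Deserialize, Serialize};

// Default added so #[serde(default)] compiles in this sketch.
#[derive(Clone, Debug, Default, Deserialize, PartialEq, Serialize)]
#[serde(default, deny_unknown_fields)]
pub struct QuorumStoreConfig {
    /// Capacity of the command channel between the quorum store and its
    /// worker tasks; this is the channel that `channel_size` sizes.
    pub channel_size: usize,
    /// Illustrative second constant, showing the level of detail requested.
    pub max_batch_bytes: u64,
}
```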

@@ -10,8 +10,7 @@ use aptos_crypto::HashValue;
 use futures::channel::oneshot;
 use std::{fmt, fmt::Formatter};

 /// Message sent from Consensus to QuorumStore.
-pub enum PayloadRequest {
+pub enum BlockProposalCommand {
Contributor

Should this be "GetBlockProposalCommand"?

The proposal generator issues this command and then does the proposing itself; this command only prepares the proposal?
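For reference, the request/response shape under discussion looks roughly like this. The doc comment, enum name, and oneshot import come from the diff; the variant name and payload types are illustrative.

```rust
use futures::channel::oneshot;

/// Message sent from Consensus to QuorumStore.
pub enum BlockProposalCommand {
    /// The proposal generator sends this to request a payload and receives it
    /// back over the oneshot sender; the actual proposing happens in the
    /// proposal generator, not here, hence the suggestion to use a "Get..."
    /// name: the command only prepares the payload.
    GetBlockRequest(u64 /* round */, oneshot::Sender<Vec<u8>> /* payload */),
}
```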

@@ -302,6 +304,12 @@ impl NetworkSender {

#[async_trait::async_trait]
impl QuorumStoreSender for NetworkSender {
    async fn send_batch_request(&self, request: BatchRequest, recipients: Vec<Author>) {
        fail_point!("consensus::send_batch_request", |_| ());
Contributor

Use the same convention:
consensus::send_batch_request => consensus::send::batch_request

and likewise for consensus::send_batch and consensus::send_signed_digest below.

Contributor Author

Good catch! #6650
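For anyone following along, a minimal sketch of the naming convention being adopted. The fail_point! macro is from the fail crate; the surrounding function stub is illustrative.

```rust
use fail::fail_point;

fn send_batch_request_stub() {
    // Group all send-side failpoints under a common "consensus::send::"
    // prefix, matching consensus::send::batch and consensus::send::signed_digest.
    fail_point!("consensus::send::batch_request", |_| ());
    // ... the real network send would follow here ...
}
```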


pub enum QuorumStoreBuilder {
    DirectMempool(DirectMempoolInnerBuilder),
    InQuorumStore(InnerBuilder),
Contributor

What does "In" here refer to?

{
    let batch_coordinator = BatchCoordinator::new(
        self.epoch,
        self.author,
Contributor

I may be missing something, but shouldn't the remote batch coordinator be creating batches authored by a remote node, not the local peer id? Or is this used for something else?

for (i, remote_batch_coordinator_cmd_rx) in
    self.remote_batch_coordinator_cmd_rx.into_iter().enumerate()
{
    let batch_coordinator = BatchCoordinator::new(
Contributor

What's the reason for having a separate batch coordinator for each "batch author"?

Is it because we want to process each stream independently (i.e., have 100 receiver loops), or because it makes the BatchCoordinator code cleaner to only deal with a single author? If the latter, should we have a single channel and a dispatcher that hands each message to the appropriate coordinator, instead of having 100 loops?

Contributor Author

We create batch coordinator workers == num_workers_for_remote_fragments (which can be smaller than the number of peers).

The reason is closer to the former. Most importantly, we want the remote fragments not to block the local fragments (which is accomplished even with == 1). Beyond that, we want the remote fragments not to block each other.
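A rough tokio-based sketch of that layout (channel payload types and sizes are placeholders): one independent receive loop per worker, so a slow remote peer only stalls its own worker and never the local loop.

```rust
use tokio::sync::mpsc;

// One receive loop per worker; fragments from remote peers are spread across
// num_workers_for_remote_fragments of these, separate from the local loop.
async fn run_batch_coordinator(mut rx: mpsc::Receiver<Vec<u8>>) {
    while let Some(fragment) = rx.recv().await {
        // Process the fragment; blocking here only delays this worker.
        drop(fragment);
    }
}

fn spawn_remote_workers(num_workers_for_remote_fragments: usize) -> Vec<mpsc::Sender<Vec<u8>>> {
    (0..num_workers_for_remote_fragments)
        .map(|_| {
            let (tx, rx) = mpsc::channel(100);
            tokio::spawn(run_batch_coordinator(rx));
            tx
        })
        .collect()
}
```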

digest_to_proof: HashMap<HashValue, IncrementalProofState>,
digest_to_time: HashMap<HashValue, u64>, // to record the batch creation time
timeouts: DigestTimeouts,
Contributor

This confused me: this is not recording what has timed out, but at what point in the future something will expire.

Maybe "expirations" is a better name?


fn expire(&mut self) {
    for digest in self.timeouts.expire() {
        if let Some(state) = self.digest_to_proof.remove(&digest) {
Contributor

Add a comment:

// check if proof hasn't completed already

since that is the reason for there not being a value.
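Putting the two comments above together, a small illustrative sketch (names and types are placeholders, not the actual aptos-core code) of recording future expirations and, on expiry, only acting when the proof hasn't already completed:

```rust
use std::collections::{BTreeMap, HashMap};

// Entries are keyed by the future time at which a digest expires (hence the
// suggestion to call this "expirations" rather than "timeouts").
#[derive(Default)]
struct Expirations {
    by_time: BTreeMap<u64, Vec<String>>,
}

impl Expirations {
    fn add(&mut self, expiry_time: u64, digest: String) {
        self.by_time.entry(expiry_time).or_default().push(digest);
    }

    // Pop everything whose expiry time is <= now.
    fn expire(&mut self, now: u64) -> Vec<String> {
        let later = self.by_time.split_off(&(now + 1));
        let expired = std::mem::replace(&mut self.by_time, later);
        expired.into_values().flatten().collect()
    }
}

fn drop_expired(exp: &mut Expirations, digest_to_proof: &mut HashMap<String, ()>, now: u64) {
    for digest in exp.expire(now) {
        // check if proof hasn't completed already (completed proofs were
        // already removed from digest_to_proof)
        if let Some(_state) = digest_to_proof.remove(&digest) {
            // An incomplete proof expired; count or log it here.
        }
    }
}
```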

self.remote_batch_coordinator_tx.len(),
idx
);
self.remote_batch_coordinator_tx[idx]
Contributor

Where does the remote_batch_coordinator receive EndBatch?

Also, if processing remote and local fragments is different, why don't we have two classes, LocalBatchCoordinator and RemoteBatchCoordinator, given they have no overlap?

Contributor Author

@zekun000 had the same comment. I'll create an item to work on this.
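For reference, the indexing in the snippet above amounts to something like the following (illustrative, not the exact code): hash the fragment's author to pick a worker index, so all fragments from one peer land on the same coordinator, i.e. idx = remote_worker_index(&author, self.remote_batch_coordinator_tx.len()).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Map an author to one of the remote batch coordinator workers.
fn remote_worker_index<A: Hash>(author: &A, num_workers: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    author.hash(&mut hasher);
    (hasher.finish() as usize) % num_workers
}
```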

Labels
CICD:build-consensus-only-image
CICD:run-consensus-only-perf-test (builds consensus-only aptos-node image and uses it to run forge)
CICD:run-e2e-tests (when this label is present, GitHub Actions will run all land-blocking e2e tests from the PR)