Skip to content

Commit

Permalink
[leader election] Improve leader election for first 20 rounds (aptos-…
Browse files Browse the repository at this point in the history
…labs#5683)

exclude_rounds is 20 by default, which means we don't consider last
20 rounds in leader reputation history - but we do so from the same epoch -
we consider everyone is caught up with the previous epoch.

genesis is epoch=0, round=0
first block is epoch=1 round=1, and that is a reconfig, that triggers epoch 2.
because votes in block metadata are for previous round, they are empty in that round.
So in that round - only "active" node is proposer.

For first 20 rounds of epoch 2, that is the only history we have - and so proposer of
epoch=1 round=1 will always be elected for the first 20 rounds.

That can cause issues in tests, as they might be executed over very few rounds.
So removing epoch=1,round=1 from being considered for history in epoch 2.

We can make this change without backward compatible gating, as this only affects
first window rounds (10 * num validators), and chain will continue successfully
after that (and all active chains are pass the epoch 2)
  • Loading branch information
igor-aptos authored Nov 23, 2022
1 parent f1ac449 commit 49d77e6
Show file tree
Hide file tree
Showing 3 changed files with 23 additions and 5 deletions.
8 changes: 6 additions & 2 deletions consensus/src/epoch_manager.rs
Original file line number Diff line number Diff line change
Expand Up @@ -254,9 +254,12 @@ impl EpochManager {
vec![1; proposers.len()]
};

// First block (after genesis) is epoch=1, so that is the first epoch we consider. (Genesis is epoch=0)
// Genesis is epoch=0
// First block (after genesis) is epoch=1, and is the only block in that epoch.
// It has no votes, so we skip it unless we are in epoch 1, as otherwise it will
// skew leader elections for exclude_round number of rounds.
let first_epoch_to_consider = std::cmp::max(
1,
if epoch_state.epoch == 1 { 1 } else { 2 },
epoch_state
.epoch
.saturating_sub(use_history_from_previous_epoch_max_count as u64),
Expand Down Expand Up @@ -300,6 +303,7 @@ impl EpochManager {
));
// LeaderReputation is not cheap, so we can cache the amount of rounds round_manager needs.
Box::new(CachedProposerElection::new(
epoch_state.epoch,
proposer_election,
onchain_config.max_failed_authors_to_store()
+ PROPSER_ELECTION_CACHING_WINDOW_ADDITION,
Expand Down
19 changes: 16 additions & 3 deletions consensus/src/liveness/cached_proposer_election.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
use std::collections::BTreeMap;

use aptos_infallible::Mutex;
use aptos_logger::prelude::info;
use consensus_types::common::{Author, Round};

use crate::counters::PROPOSER_ELECTION_DURATION;
Expand All @@ -15,6 +16,7 @@ use super::proposer_election::ProposerElection;
// Function get_valid_proposer can be expensive, and we want to make sure
// it is computed only once for a given round.
pub struct CachedProposerElection {
epoch: u64,
proposer_election: Box<dyn ProposerElection + Send + Sync>,
// We use BTreeMap since we want a fixed window of cached elements
// to look back (and caller knows how big of a window it needs).
Expand All @@ -25,8 +27,13 @@ pub struct CachedProposerElection {
}

impl CachedProposerElection {
pub fn new(proposer_election: Box<dyn ProposerElection + Send + Sync>, window: usize) -> Self {
pub fn new(
epoch: u64,
proposer_election: Box<dyn ProposerElection + Send + Sync>,
window: usize,
) -> Self {
Self {
epoch,
proposer_election,
recent_elections: Mutex::new(BTreeMap::new()),
window,
Expand All @@ -42,8 +49,14 @@ impl CachedProposerElection {

*recent_elections.entry(round).or_insert_with(|| {
let _timer = PROPOSER_ELECTION_DURATION.start_timer();
self.proposer_election
.get_valid_proposer_and_voting_power_participation_ratio(round)
let result = self
.proposer_election
.get_valid_proposer_and_voting_power_participation_ratio(round);
info!(
"ProposerElection for epoch {} and round {}: {:?}",
self.epoch, round, result
);
result
})
}
}
Expand Down
1 change: 1 addition & 0 deletions consensus/src/liveness/cached_proposer_election_test.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ fn test_get_valid_proposer_caching() {
let asked = Arc::new(Mutex::new(Cell::new(0)));
let authors: Vec<Author> = (0..4).map(|_| Author::random()).collect();
let cpe = CachedProposerElection::new(
1,
Box::new(MockProposerElection::new(authors.clone(), asked.clone())),
10,
);
Expand Down

0 comments on commit 49d77e6

Please sign in to comment.