Skip leader slots until a vote lands #15607

Merged — sakridge merged 1 commit into solana-labs:master from leader-start-until-vote on Mar 26, 2021

Conversation

@sakridge (Member) commented Mar 1, 2021

Problem

A node that starts up after clearing its ledger can double-sign a slot: when its leader slot comes around again, it is not aware that it already produced a block for that slot.

Summary of Changes

Check whether one of our votes has landed since the node started, and hold off on producing leader slots until one does.

Fixes #

@sakridge (Member, Author) commented Mar 1, 2021

@carllin does it look reasonable at all?

Review thread on core/src/replay_stage.rs (outdated, resolved)
@sakridge sakridge force-pushed the leader-start-until-vote branch from 8aa468f to f5cfa8f Compare March 1, 2021 23:47
@carllin (Contributor) commented Mar 1, 2021

hmmm so if somebody is blowing away their ledger and restarting, I'm not sure this would help, i.e.

0 (root)----1---2---3 (leader slot)

They then blow away the ledger, vote on slot 1, set has_voted to true on slot 2, and recreate another version of the leader slot on slot 3.

@sakridge (Member, Author) commented Mar 1, 2021

hmmm so if somebody is blowing away their ledger and restarting, I'm not sure this would help, i.e.

0 (root)----1---2---3 (leader slot)

They then blow away the ledger, vote on slot 1, set has_voted to true on slot 2, and recreate another version of the leader slot on slot 3.

Slot 2 will already be frozen if they created slot 3 on top of it, so they won't be able to land their vote for slot 1 in slot 2. The has_voted tower state is maybe the real problem: when the node comes up the second time, it actually needs to check the specific vote signatures to see whether they landed.

@carllin (Contributor) commented Mar 1, 2021

hmmm so if somebody is blowing away their ledger and restarting, I'm not sure this would help, i.e.
0 (root)----1---2---3 (leader slot)
They then blow away the ledger, vote on slot 1, set has_voted to true on slot 2, and recreate another version of the leader slot on slot 3.

Slot 2 will already be frozen if they created slot 3 on top of it, so they won't be able to land their vote for slot 1 in slot 2. The has_voted tower state is maybe the real problem: when the node comes up the second time, it actually needs to check the specific vote signatures to see whether they landed.

yeah it seems at the very least you'll have to check that the vote signature landed in a rooted fork to ensure you're near the tip.

Also of note, I think there's still an edge case if you were a bit ahead of the major fork on a different fork, i.e.

             1-2----20 (your leader slot) (minor fork)
         /
(root)
         \
            3---4-- 5 (major fork) --RESTART-- 6---7

If you restart on the major fork and vote on slot 6, and that vote lands in slot 7, you may still recreate your leader slot 20 on the major fork.

@mvines (Member) commented Mar 2, 2021

I was thinking we'd suppress this behavior when --wait-for-supermajority is given for the restart case

@sakridge sakridge force-pushed the leader-start-until-vote branch 3 times, most recently from ffa2626 to e6c7867 Compare March 2, 2021 04:29
@sakridge sakridge added the CI Pull Request is ready to enter CI label Mar 2, 2021
@sakridge sakridge force-pushed the leader-start-until-vote branch from e6c7867 to a78cb26 Compare March 2, 2021 04:38
@solana-grimes solana-grimes removed the CI Pull Request is ready to enter CI label Mar 2, 2021
@sakridge sakridge added the CI Pull Request is ready to enter CI label Mar 2, 2021
@sakridge sakridge force-pushed the leader-start-until-vote branch from a78cb26 to e654f4b Compare March 13, 2021 17:56
@solana-grimes solana-grimes removed the CI Pull Request is ready to enter CI label Mar 13, 2021
@sakridge (Member, Author) commented:

I need to handle the case where the node is the sole bootstrap leader, or one of many bootstrap leaders, or any case where the node might not be able to land a vote but needs to produce slots anyway. I was trying to think of an elegant way to detect that.

@sakridge sakridge force-pushed the leader-start-until-vote branch 2 times, most recently from eab57d7 to 73dced3 Compare March 13, 2021 23:00
@sakridge sakridge marked this pull request as ready for review March 14, 2021 00:31
@codecov bot commented Mar 14, 2021

Codecov Report

Merging #15607 (dac0512) into master (e817a6d) will increase coverage by 0.0%.
The diff coverage is 72.0%.

@@           Coverage Diff           @@
##           master   #15607   +/-   ##
=======================================
  Coverage    80.0%    80.0%           
=======================================
  Files         410      410           
  Lines      109070   109102   +32     
=======================================
+ Hits        87338    87389   +51     
+ Misses      21732    21713   -19     

@sakridge sakridge requested review from carllin and t-nelson March 16, 2021 23:34
t-nelson previously approved these changes Mar 19, 2021

@t-nelson (Contributor) left a comment:

lgtm with nits!

@@ -293,6 +296,8 @@ impl ReplayStage {
            let mut partition_exists = false;
            let mut skipped_slots_info = SkippedSlotsInfo::default();
            let mut replay_timing = ReplayTiming::default();
            let mut voted_signatures = Vec::new();
t-nelson (Contributor) suggested:

Suggested change:
-   let mut voted_signatures = Vec::new();
+   let mut voted_signatures = Vec::with_capacity(201);

Member:

Why 201?

@sakridge (Member, Author):

I chose a 200-signature limit for how large this can grow. I'm not sure we need to size it up front; most cases won't use the full 200, and this path isn't that performance-sensitive.

Review thread on validator/src/main.rs (resolved)
Review thread on core/src/replay_stage.rs (outdated, resolved)
@@ -1360,6 +1360,13 @@ pub fn main() {
                .help("After processing the ledger and the next slot is SLOT, wait until a \
                       supermajority of stake is visible on gossip before starting PoH"),
        )
        .arg(
            Arg::with_name("no_wait_for_vote_to_start_leader")
Contributor:

might be good to add a warning here: if more than 33% of the cluster goes down and restarts, everyone will need to set this flag, otherwise nobody will make progress even after the restart.

@sakridge (Member, Author) commented Mar 20, 2021:

If 33% goes down, we have to do a manual restart with the --wait-for-supermajority flag, right? In that case we detect that we had to wait for supermajority and skip the vote check, so the flag shouldn't be necessary there. The only case where we need it is when starting a bootstrap leader (or leaders) without --wait-for-supermajority.

Review thread on core/src/validator.rs (outdated, resolved)
@@ -627,16 +629,19 @@ impl Validator {
            check_poh_speed(&genesis_config, None);
        }

-       if wait_for_supermajority(
+       let (failed, did_wait) = wait_for_supermajority(
Contributor:

nit: did_wait -> is_wait_unnecessary, since in some cases like on hard forks, a wait didn't occur but is necessary (or even better is_wait_necessary, but then we have to flip the booleans around below)

        if let Some(wait_for_supermajority) = config.wait_for_supermajority {
            match wait_for_supermajority.cmp(&bank.slot()) {
-               std::cmp::Ordering::Less => return false,
+               std::cmp::Ordering::Less => return (false, false),
Contributor:

is the second bool equal to false ever an acceptable case? I.e. if everyone restarts together and is waiting for their vote to land in a root, nobody will vote, and so everyone will stall right?

@sakridge (Member, Author):

did_wait can be false, yes, because nodes may still pass --wait-for-supermajority even after the network has moved past the given wait slot, when they are simply trying to join the network. They wouldn't wait at all, since the slot they loaded from is already past the wait-for slot and the shred version matches. In that case they should wait for a vote to land, because they have joined a network that is already producing blocks.

@sakridge sakridge force-pushed the leader-start-until-vote branch from 73dced3 to d77e912 Compare March 20, 2021 20:45
@sakridge sakridge force-pushed the leader-start-until-vote branch from d77e912 to 3652411 Compare March 23, 2021 00:49
Review thread on core/src/replay_stage.rs (outdated, resolved)
@mvines (Member) left a comment:

I bet we'll need a no_wait_for_vote_to_start_leader: true in this config block as well for TestValidator:

let validator_config = ValidatorConfig {
    rpc_addrs: Some((
        SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), node.info.rpc.port()),
        SocketAddr::new(
            IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)),
            node.info.rpc_pubsub.port(),
        ),
    )),
    rpc_config,
    accounts_hash_interval_slots: 100,
    account_paths: vec![ledger_path.join("accounts")],
    poh_verify: false, // Skip PoH verification of ledger on startup for speed
    snapshot_config: Some(SnapshotConfig {
        snapshot_interval_slots: 100,
        snapshot_path: ledger_path.join("snapshot"),
        snapshot_package_output_path: ledger_path.to_path_buf(),
        archive_format: ArchiveFormat::Tar,
        snapshot_version: SnapshotVersion::default(),
    }),
    enforce_ulimit_nofile: false,
    warp_slot: config.warp_slot,
    bpf_jit: !config.no_bpf_jit,
    validator_exit: config.validator_exit.clone(),
    ..ValidatorConfig::default()
};

@sakridge sakridge force-pushed the leader-start-until-vote branch from 3652411 to e37884f Compare March 23, 2021 01:05
@sakridge (Member, Author):

I bet we'll need a no_wait_for_vote_to_start_leader: true in this config block as well for TestValidator:

let validator_config = ValidatorConfig {
    rpc_addrs: Some((
        SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), node.info.rpc.port()),
        SocketAddr::new(
            IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)),
            node.info.rpc_pubsub.port(),
        ),
    )),
    rpc_config,
    accounts_hash_interval_slots: 100,
    account_paths: vec![ledger_path.join("accounts")],
    poh_verify: false, // Skip PoH verification of ledger on startup for speed
    snapshot_config: Some(SnapshotConfig {
        snapshot_interval_slots: 100,
        snapshot_path: ledger_path.join("snapshot"),
        snapshot_package_output_path: ledger_path.to_path_buf(),
        archive_format: ArchiveFormat::Tar,
        snapshot_version: SnapshotVersion::default(),
    }),
    enforce_ulimit_nofile: false,
    warp_slot: config.warp_slot,
    bpf_jit: !config.no_bpf_jit,
    validator_exit: config.validator_exit.clone(),
    ..ValidatorConfig::default()
};

added, thanks

@sakridge sakridge force-pushed the leader-start-until-vote branch from e37884f to 220b5bd Compare March 23, 2021 19:05
@sakridge sakridge added the v1.6 label Mar 23, 2021
@sakridge sakridge closed this Mar 25, 2021
@sakridge sakridge force-pushed the leader-start-until-vote branch from 220b5bd to 2aea352 Compare March 25, 2021 17:21
@sakridge sakridge reopened this Mar 25, 2021
@sakridge sakridge force-pushed the leader-start-until-vote branch from 29ac6eb to dac0512 Compare March 25, 2021 19:48
@sakridge sakridge merged commit b99ae8f into solana-labs:master Mar 26, 2021
@sakridge sakridge deleted the leader-start-until-vote branch March 26, 2021 01:54
mergify bot pushed a commit that referenced this pull request Mar 26, 2021
(cherry picked from commit b99ae8f)

# Conflicts:
#	core/src/consensus.rs
#	core/src/replay_stage.rs
sakridge added a commit that referenced this pull request Mar 26, 2021
mergify bot added a commit that referenced this pull request Mar 26, 2021
(cherry picked from commit b99ae8f)

Co-authored-by: sakridge <[email protected]>
@behzadnouri (Contributor):

Should multinode-demo/validator.sh also have been updated?
Starting a GCE cluster shows that nodes get stuck at

INFO  solana_core::replay_stage] Haven't landed a vote, so skipping my leader slot

@sakridge (Member, Author):

Should multinode-demo/validator.sh also have been updated?
Starting a GCE cluster shows that nodes get stuck at

INFO  solana_core::replay_stage] Haven't landed a vote, so skipping my leader slot

let me look

@sakridge (Member, Author):

Should multinode-demo/validator.sh also have been updated?
Starting a GCE cluster shows that nodes get stuck at

INFO  solana_core::replay_stage] Haven't landed a vote, so skipping my leader slot

I don't think that is the cause. Those validators are not staked at the beginning, so it's fine for them to wait to land a rooted vote. I had that print in a bad spot, but #16156 should fix it.

behzadnouri added a commit to behzadnouri/solana that referenced this pull request Mar 26, 2021
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Mar 26, 2021
@behzadnouri (Contributor):

I see buildkite/solana/stable failing but it passes when I revert this commit.
Also on gce it seems to have impacted replay metrics.
Left is master, right is this commit reverted.

[charts: replay-slot-stats total_shreds and replay_stage-replay-transactions, master vs. reverted]

@sakridge (Member, Author) commented Mar 26, 2021

I see buildkite/solana/stable failing but it passes when I revert this commit.
Also on gce it seems to have impacted replay metrics.
Left is master, right is this commit reverted.

can you try with #16156?

@behzadnouri (Contributor):

#16156 is also failing buildkite/solana/stable:
https://buildkite.com/solana-labs/solana/builds/42989

sakridge added a commit to sakridge/solana that referenced this pull request Mar 26, 2021
@sakridge (Member, Author):

Stable passed here with it:
https://buildkite.com/solana-labs/solana/builds/42978
https://buildkite.com/solana-labs/solana/builds/42993
https://buildkite.com/solana-labs/solana/builds/42970
https://buildkite.com/solana-labs/solana/builds/42962

It looks like a flaky condition in the test to me; this PR shouldn't really affect that test, since it doesn't actually trigger any of the modified paths.
