This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

Leaders continue building on their own fork after partitioning from the network. #8232

Open
aeyakovenko opened this issue Feb 12, 2020 · 2 comments

@aeyakovenko
Member

aeyakovenko commented Feb 12, 2020

Problem

Under normal operation, a validator receives a block through Turbine. Because the packets are retransmitted by the entire network, a validator sees that a vast majority of the network has received at least some portion of the block, and it will try to repair the block. Because of how Turbine works, there is a high probability that if one validator receives the block, then so did at least 50% of the network.

This property doesn’t hold for a leader’s own blocks. When a leader decides to vote on its own block, it does so without any signal that the rest of the network has received the block and will vote as well.

Thus a leader may continue building on its own fork while completely partitioned from the rest of the network. If the leader has a high stake, this could leave it locked out on a fork that is very far, in block height, from the rest of the network.

Proposed Solution

From a high level, replay stage should treat the blocks it produced as leader like blocks from the rest of the network. Replay stage needs to take the block’s data availability into account before voting on it.

Update the consensus design to handle block availability, and describe how it relates to Tendermint’s propose stage.

  1. Banking stage needs its own PoH.
  2. Banking stage produces a block on a fork of PoH.
  3. Replay stage doesn’t vote on its own block or reset PoH until it sees 1/3+ of validators vote on it.
  4. Replay stage doesn’t vote or reset PoH on foreign blocks unless it receives the block with 1/3+ of validators retransmitting it.
  5. First retransmit needs to be signed.
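
Steps 3 and 4 above can be read as a single gating decision in replay stage. The following is a minimal sketch of that reading, not code from the validator: every type and function name here (`BlockOrigin`, `ReplayAction`, `Observed`, `decide`) is hypothetical, and it assumes that "1/3+" is measured by stake and means strictly more than one third.

```rust
/// Who produced the block under consideration (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BlockOrigin {
    Own,
    Foreign,
}

/// What replay stage does with the block (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ReplayAction {
    VoteAndResetPoh,
    Wait,
}

/// Stake observed so far for one block (hypothetical type).
struct Observed {
    voted_stake: u64,
    retransmit_stake: u64,
    total_stake: u64,
}

/// Assumption: "1/3+" means strictly more than one third of total stake.
fn exceeds_one_third(part: u64, total: u64) -> bool {
    // Widen to u128 so the multiplication cannot overflow.
    total > 0 && (part as u128) * 3 > total as u128
}

/// Own blocks wait for votes (step 3); foreign blocks wait for
/// retransmits (step 4). Nothing else unlocks a vote in this sketch.
fn decide(origin: BlockOrigin, obs: &Observed) -> ReplayAction {
    let ready = match origin {
        BlockOrigin::Own => exceeds_one_third(obs.voted_stake, obs.total_stake),
        BlockOrigin::Foreign => exceeds_one_third(obs.retransmit_stake, obs.total_stake),
    };
    if ready {
        ReplayAction::VoteAndResetPoh
    } else {
        ReplayAction::Wait
    }
}
```

In this sketch a fully partitioned leader never observes outside votes on its own block, so `decide` keeps returning `Wait` and the leader neither votes nor resets PoH onto its private fork.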

Tag: @carllin @sakridge @sagar-solana

This can be broken up into a couple PRs.

Alternative 1

Bank weight doesn’t do a good job of accounting for which fork has the highest chance of being confirmed, given the current set of lockouts and leader schedules. A 1/3- minority bank should never weigh more than a 2/3+ bank, no matter the lockout. But a leader should still withhold votes on its own forks until it sees some confirmation that the rest of the network has received its block.

Alternative 2

It might be good enough for leaders to delay voting on their own banks until they see 1/3+ of the stake vote on that bank, which may include the leader’s own stake.

  • Replay should withhold voting unless 1/3+ of the stake has retransmitted packets for the block.

Networks with nodes that have 1/3+ stake have to rely on those nodes’ availability.

Basically prior to voting, this check should pass:

  • is_available = has_superminority_votes || has_superminority_retransmits || has_superminority_epoch_slots

Here, superminority means 1/3+ of the stake-weighted validator set.
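
As an illustration of the check above, here is a small sketch that computes each `has_superminority_*` flag from a stake table. It is not the validator's implementation: `ValidatorId`, `SlotEvidence`, `is_superminority`, and the field names are hypothetical, and "1/3+" is assumed to mean strictly more than one third of total stake.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical validator identifier; a stand-in for a public key.
type ValidatorId = u32;

/// Evidence collected for one block, one set per signal (hypothetical type).
#[derive(Default)]
struct SlotEvidence {
    voters: HashSet<ValidatorId>,
    retransmitters: HashSet<ValidatorId>,
    epoch_slots_reporters: HashSet<ValidatorId>,
}

/// True when the validators in `observed` hold strictly more than 1/3
/// of the total stake (assumed meaning of "1/3+").
fn is_superminority(
    stakes: &HashMap<ValidatorId, u64>,
    observed: &HashSet<ValidatorId>,
) -> bool {
    let total: u128 = stakes.values().map(|&s| s as u128).sum();
    let seen: u128 = observed
        .iter()
        .filter_map(|id| stakes.get(id)) // ignore unknown validators
        .map(|&s| s as u128)
        .sum();
    total > 0 && seen * 3 > total
}

/// The availability check: any one of the three signals is enough.
fn is_available(stakes: &HashMap<ValidatorId, u64>, ev: &SlotEvidence) -> bool {
    is_superminority(stakes, &ev.voters)
        || is_superminority(stakes, &ev.retransmitters)
        || is_superminority(stakes, &ev.epoch_slots_reporters)
}
```

Each signal is evaluated on its own set, so stake is never summed across signals; a validator that both voted and retransmitted does not count twice toward any one threshold.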

@aeyakovenko
Member Author

Seems like we have consensus around alt 2

@mvines mvines modified the milestones: Rincon v0.24.0, v0.25.0 Feb 20, 2020
@mvines mvines modified the milestones: v1.1.0, v1.2.0 Mar 30, 2020
@mvines mvines modified the milestones: v1.2.0, v1.3.0 May 21, 2020
@mvines mvines modified the milestones: v1.3.0, v1.4.0 Aug 5, 2020
@mvines mvines modified the milestones: v1.4.0, v1.5.0 Oct 8, 2020
@ryoqun
Contributor

ryoqun commented Oct 25, 2020

Because we now have a persisted tower, we can also prevent leaders that start from an older snapshot from creating duplicate blocks at the banking stage.

ref: #9369 (comment)

@mvines mvines modified the milestones: v1.5.0, v1.6.0 Dec 17, 2020
@mvines mvines modified the milestones: v1.6.0, v1.7.0 Mar 11, 2021
@mvines mvines modified the milestones: v1.7.0, v1.8.0 May 10, 2021

3 participants