
Proposal to Handle Duplicate Blocks By Storing Multiple Versions of a Slot #7622

Closed (wanted to merge 9 commits)

Conversation

@carllin commented Dec 24, 2019

Problem

No design exists for slashing duplicate leaders or for storing/handling multiple versions of a slot.

Summary of Changes

Add such a design

Fixes #

@garious commented Dec 24, 2019

I'm so lost by all the slashing half-proposals. There's slashing.md already in the book, some open proposal PRs, and a draft PR with code implementing some form of slashing. Can you please work with @aeyakovenko to close obsolete PRs and, if it makes sense, replace the existing book content in this PR?

> include the blockhash in the repair response. This means we need to include
> some repair "cookie" in the request + response that maps some number of bits
> to a particular `blockhash`. This breaks down if there are more versions of
> this slot than can be tracked by the number of bits allocated to the cookie.
@carllin (Contributor Author)

Worried about this; also, we need to shrink the shred size by the size of this cookie. @pgarg66

Contributor

it'd be worthwhile to say why "repair responses need to be tied to a particular blockhash", anti-spam?

what if repair responses were xor'd with the request cookie or the blockhash?

@carllin (Contributor Author)

Partially anti-spam (although people can still send bogus shreds/bogus cookies, hence @aeyakovenko suggested the merkle tree to figure out whether it was the leader or the repair sender who sent the bogus shred), but also to figure out which of those shreds chain to which previous shreds when there are multiple versions of the same slot.
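To make the cookie idea concrete, here's a minimal sketch of such a table, with hypothetical names (`CookieTable`, `RepairCookie`); it also shows the failure mode discussed above, where the cookie space can't track all versions:

```rust
use std::collections::HashMap;

type Hash = [u8; 32];
type RepairCookie = u16; // the few bits carried in each repair request/response

#[derive(Default)]
struct CookieTable {
    next: RepairCookie,
    by_cookie: HashMap<RepairCookie, Hash>,
    by_hash: HashMap<Hash, RepairCookie>,
}

impl CookieTable {
    /// Assign (or reuse) a cookie for a blockhash. Returns None once the
    /// cookie space is exhausted -- the breakdown case when a slot has more
    /// versions than the cookie bits can track.
    fn assign(&mut self, blockhash: Hash) -> Option<RepairCookie> {
        if let Some(&c) = self.by_hash.get(&blockhash) {
            return Some(c);
        }
        if self.by_cookie.len() >= usize::from(RepairCookie::MAX) {
            return None;
        }
        let c = self.next;
        self.next = self.next.wrapping_add(1);
        self.by_cookie.insert(c, blockhash);
        self.by_hash.insert(blockhash, c);
        Some(c)
    }

    /// Map a cookie in a repair response back to the full blockhash.
    fn resolve(&self, cookie: RepairCookie) -> Option<&Hash> {
        self.by_cookie.get(&cookie)
    }
}
```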

Contributor

put the rationale in the document, please?

@carllin force-pushed the DuplicateBlock branch 2 times, most recently from c9e1ed9 to d6fa659 on December 24, 2019 02:57
> (another column family?). This is important in order to respond to repairs
> which will now specify a `blockhash` in addition to a `slot` and `index`.
> Thus if the node crashes before this mapping can be stored, then on restart
> the validator needs to recompute the blockhash in `blocktree_processor`.
@carllin (Contributor Author)

One downside of this approach is that if we have two versions of a slot, each with 100 shreds, where the first 99 are the same and only the last differs, we still have to store all 200 shreds.
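For illustration, a hypothetical per-version key layout (RocksDB-style ordered bytes) that makes the duplication explicit; the names and field widths are assumptions, not Solana's actual blocktree schema:

```rust
// Because the key includes the version's blockhash, two versions that share
// 99 of 100 shreds still store the shared shreds twice -- the duplication
// downside noted above.
fn shred_key(slot: u64, blockhash: &[u8; 32], index: u32) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + 32 + 4);
    key.extend_from_slice(&slot.to_be_bytes()); // big-endian keeps slots ordered
    key.extend_from_slice(blockhash);           // version discriminator
    key.extend_from_slice(&index.to_be_bytes()); // shred index within the version
    key
}
```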

> 2) These shreds for slot `B` are stored under the version `Hash::default()`
> (optimistically assume this child is the only version). If another conflicting
> version of `B` is detected before this version is completed, we drop all the
> shreds for slot `B`.
@carllin (Contributor Author)

Need to reason through correctness here. If we have multiple versions of a slot C chaining to multiple versions of a slot B, chaining to multiple versions of a slot A, will the procedure we have outlined thus far handle it?

Contributor

should do, because we can only chain forward one entry at a time

@carllin (Contributor Author)

The tricky part is whether the below is correct:

  1. Dropping any unfinished slots on detecting another version
  2. Waiting on repair to perform chaining on slots with multiple versions based on blockhash

I think it should work as well :D
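A minimal sketch of the two-step procedure above (optimistic `Hash::default()` insertion, drop-on-conflict, re-key once the real blockhash is known), using hypothetical types:

```rust
use std::collections::HashMap;

type Slot = u64;
type Hash = [u8; 32];
const DEFAULT_HASH: Hash = [0u8; 32]; // stand-in for Hash::default()

#[derive(Default)]
struct VersionedShreds {
    // shreds keyed by (slot, version blockhash)
    shreds: HashMap<(Slot, Hash), Vec<Vec<u8>>>,
}

impl VersionedShreds {
    /// Optimistically accumulate shreds under the default version.
    fn insert_optimistic(&mut self, slot: Slot, shred: Vec<u8>) {
        self.shreds.entry((slot, DEFAULT_HASH)).or_default().push(shred);
    }

    /// Step 1: a conflicting version was detected before the optimistic
    /// version completed, so drop the incomplete default-keyed shreds.
    fn on_conflict(&mut self, slot: Slot) {
        self.shreds.remove(&(slot, DEFAULT_HASH));
    }

    /// The slot completed first: re-key the default version to its computed
    /// blockhash so later repairs by blockhash (step 2) can find it.
    fn finalize(&mut self, slot: Slot, blockhash: Hash) {
        if let Some(v) = self.shreds.remove(&(slot, DEFAULT_HASH)) {
            self.shreds.insert((slot, blockhash), v);
        }
    }
}
```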

@carllin commented Dec 24, 2019

@garious, yup for sure! For clarity, this proposal replaces this one in particular: #6362. It does not cover lockout-triggered slashing.

Sorry for the confusion!


> 1) Wait for all the shreds for the first entry `E_B` of slot `B` to arrive
> (Implementation can make sure first shred `S_B` always contains only a tick
> to avoid waiting for multiple shreds).
@rob-solana (Contributor), Dec 29, 2019

this option would require that the leader delay ingest for 25% of the slot...

@rob-solana (Contributor), Dec 29, 2019

we can add a "first shred in slot" and have that shred carry the parent blockhash and the first tick hash value

Contributor

another option: give up 4-8 bytes of shred space to carry that many bits of parent blockhash. this would solve the repair cookie issues, too.

@carllin (Contributor Author)

@rob-solana, that first option works, marking and having the first shred carry the parent blockhash. Why does it need the first tick hash though?

Contributor

I was proposing alternatives. I think the 4-8 bytes of parent blockhash would be sufficient, ignore others.
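A sketch of that truncated-parent-hash alternative, assuming an illustrative header layout rather than Solana's actual shred format:

```rust
// Hypothetical shred header carrying 8 bytes of the parent's blockhash.
// The truncated hash lets a receiver bucket shreds by parent version without
// a separate repair cookie, at the cost of 8 bytes of shred payload.
#[repr(C)]
struct ShredHeaderSketch {
    slot: u64,
    index: u32,
    parent_hash_bits: [u8; 8], // first 8 bytes of the parent's blockhash
}

fn parent_bits(parent_blockhash: &[u8; 32]) -> [u8; 8] {
    let mut bits = [0u8; 8];
    bits.copy_from_slice(&parent_blockhash[..8]);
    bits
}
```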

Contributor

> @sagar-solana, @rob-solana, @aeyakovenko, @ryoqun The next area of concern is the status cache and how it's packaged into snapshots.

This sounds like a real mess.

> 1. The root in blocktree, and thus the root included in the snapshot, will now have to carry which version of that slot was rooted (specified by blockhash)
> 2. Need to update the status cache to handle multiple versions of the same slot, meaning status cache will have to support a key by (slot, blockhash)
> 3. Before we detect a conflicting version of a slot/when the leader is building the block in banking stage, the identifier for a slot in the status cache will be Hash::default(). (Recall the blockhash is known for any version of the slot after a conflict is detected because we repair those versions of the slot by blockhash)

Only during catchup or repair will the blockhash be known a priori, so in the normal case the status cache will have to re-index at the end of each block, too?

Status cache is organized by fork (i.e. is chained in memory via bank construction), right? If it were, would there be a need to key by blockhash?

@carllin (Contributor Author)

yeah it's a huge mess 😛

Status cache is a global store shared across all the banks. At a high level, it maps from signature -> Vec<(Slot, Status)>. When we look up a signature, we pass a vector of ancestors and check if any of those ancestors contain that signature.

Because now we can have many different versions of a single ancestor, we will have to index by blockhash.

Contributor

ok, now I remember...

instead of using slots as ancestors in status cache, what if banks assigned themselves a random number and we used that as ancestry in status cache?

signature -> HashMap<BankID, Status>?
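A minimal sketch of that BankID-keyed status cache, with hypothetical types (`BankId`, `Status`); the ancestor-set lookup mirrors the ancestor-vector lookup described above, but two versions of the same slot get distinct bank ids and so can never collide:

```rust
use std::collections::{HashMap, HashSet};

type Signature = [u8; 64];
type BankId = u64; // random per-bank id, per the suggestion above
type Status = Result<(), ()>; // stand-in for a transaction status

struct StatusCacheSketch {
    statuses: HashMap<Signature, HashMap<BankId, Status>>,
}

impl StatusCacheSketch {
    fn insert(&mut self, sig: Signature, bank: BankId, status: Status) {
        self.statuses.entry(sig).or_default().insert(bank, status);
    }

    /// Look up a signature against a bank's ancestry (the bank's own id plus
    /// its ancestors' ids), keyed by BankId instead of Slot.
    fn get(&self, sig: &Signature, ancestors: &HashSet<BankId>) -> Option<&Status> {
        self.statuses
            .get(sig)?
            .iter()
            .find(|(bank, _)| ancestors.contains(*bank))
            .map(|(_, status)| status)
    }
}
```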

> is set to `Hash::default()` until the blockhash is computed at the end of
> the slot. Then the leader will have to store a mapping from
> `Hash::default()` to the actual blockhash in a separate area of storage
> (another column family?). This is important in order to respond to repairs
@carllin (Contributor Author)

@rob-solana this part kind of sucks, any better ideas that don't involve re-indexing at the end of the slot once the slot has been finished?

Contributor

blocktree re-implementation where every slot is a directory? when we know the blockhash we do a rename()?

would a double-lookup from slot+blockhash to slot+randomnumber break ledger portability?
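A minimal sketch of the directory-per-slot rename() idea above, assuming a hypothetical layout of `<ledger>/<slot>/<version>`; the ledger-portability question it raises is left open:

```rust
use std::fs;
use std::path::Path;

// Shreds for an in-progress slot live under a "default" directory that is
// renamed to the real blockhash once it is known, avoiding a re-index of
// every shred at the end of the slot.
fn finalize_slot_dir(ledger: &Path, slot: u64, blockhash_hex: &str) -> std::io::Result<()> {
    let from = ledger.join(format!("{}/default", slot));
    let to = ledger.join(format!("{}/{}", slot, blockhash_hex));
    fs::rename(from, to) // atomic on the same filesystem
}
```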

@carllin commented Dec 30, 2019

@sagar-solana, @rob-solana, @aeyakovenko, @ryoqun. Some remaining design questions:

As suggested by @aeyakovenko, in an attempt to have the majority of the network drop slots that have multiple versions, after we detect a conflicting version of a slot, we drop the slot if it hasn't yet been voted on. I think this will be done as follows (a rough sketch in code follows the list):

  1. The blocktree insertion thread (maybe some separate conflict detection thread) marks the slot as "conflicting".
  2. The thread from 1) sends a signal to ReplayStage over a channel to say this slot is "conflicting".
  3. The window insertion thread refuses to accept any shreds for this slot unless that shred carries a specific
    cookie for a specific blockhash (i.e., we sent a repair request for this version of this slot).
  4. ReplayStage drops this slot from the progress list if it hasn't yet voted for this slot.
  5. Because of 3), ReplayStage will then not replay this slot unless repair requests are made for it.
    Repair requests are only made if another child slot chained back to this slot.
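A minimal sketch of steps 2 and 4, assuming hypothetical names (`BlocktreeSignal`, `replay_stage_tick`); the real ReplayStage plumbing differs:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::mpsc::{Receiver, Sender};

type Slot = u64;

// Hypothetical signal sent from the detection thread to ReplayStage.
enum BlocktreeSignal {
    Conflicting(Slot),
}

// Step 2: the insertion/detection thread notifies ReplayStage.
fn signal_conflict(sender: &Sender<BlocktreeSignal>, slot: Slot) {
    let _ = sender.send(BlocktreeSignal::Conflicting(slot));
}

// Step 4: ReplayStage drains the channel and drops conflicting slots it
// hasn't voted on from its progress map.
fn replay_stage_tick(
    receiver: &Receiver<BlocktreeSignal>,
    progress: &mut HashMap<Slot, ()>, // () stands in for per-slot progress state
    voted: &HashSet<Slot>,
) {
    while let Ok(BlocktreeSignal::Conflicting(slot)) = receiver.try_recv() {
        if !voted.contains(&slot) {
            progress.remove(&slot);
        }
    }
}
```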

@carllin commented Dec 30, 2019

@sagar-solana, @rob-solana, @aeyakovenko, @ryoqun The next area of concern is the status cache and how it's packaged into snapshots.

  1. The root in blocktree, and thus the root included in the snapshot will now have to carry which version of that slot was rooted (specified by blockhash)

  2. Need to update the status cache to handle multiple versions of the same slot, meaning status cache will have to support a key by (slot, blockhash)

  3. Before we detect a conflicting version of a slot/when the leader is building the block in banking stage, the identifier for a slot in the status cache will be Hash::default(). (Recall the blockhash is known for any version of the slot after a conflict is detected because we repair those versions of the slot by blockhash)

A concern here is that Hash::default() may then mean different things to different validators when the status cache is packaged into a snapshot, but I think this should be fine because the snapshot root is identified by a blockhash, so any future slots a validator plays after booting from the snapshot must chain from that root, and that root has a fixed set of ancestors (version of each slot).

  1. Do we need to clear the status cache of the Hash::default() version of this slot if we are dropping it? I think it may be ok to keep it around.

  2. On replay when multiple versions of a slot are being repaired then replayed, we need to expose the blockhash
    of each version of that slot to the replay logic so that the status cache can be updated properly. Those blockhashes
    are only known by blocktree so they'll have to be propagated.

@rob-solana (Contributor)

> @sagar-solana, @rob-solana, @aeyakovenko, @ryoqun. Some remaining design questions:
>
> As suggested by @aeyakovenko, in an attempt to have the majority of the network drop slots that have multiple versions, after we detect a conflicting version of a slot, we drop the slot if it hasn't yet been voted on.

Is this feasible in general? Can replay stage reliably determine whether the rest of the network has accepted some version of a slot?

@carllin commented Dec 31, 2019

@rob-solana definitely not guaranteed. It's a best-effort mechanism: the hope is that the majority detects the duplicates and drops all versions of that block so that another fork is selected. But there's definitely a race between detecting the duplicates and voting on the block. The bad case is if the heaviest fork includes some version of the duplicate slot, in which case we will have to rely on the repair mechanism in the proposal to allow validators on other forks to catch up.

@rob-solana (Contributor)

> @rob-solana definitely not guaranteed. It's a best-effort mechanism: the hope is that the majority detects the duplicates and drops all versions of that block so that another fork is selected. But there's definitely a race between detecting the duplicates and voting on the block. The bad case is if the heaviest fork includes some version of the duplicate slot, in which case we will have to rely on the repair mechanism in the proposal to allow validators on other forks to catch up.

Ok, so distinguishing between Repair and Turbine is critical. Do we have that capability today?

@carllin commented Dec 31, 2019

@rob-solana they come in on different ports, which can be pretty easily gamed. But after a validator drops the slot S, it only repairs if it sees a child slot chain back to this slot S. When a child slot is detected that chains back to this slot:

  1. Validator waits for the first shred in this slot which contains the parent's blockhash

  2. Makes an entry in blocktree for the parent blockhash

  3. Makes repairs for the parent blockhash with some "cookie"

  4. Window service will accept shreds containing the "cookie" mapping to this blockhash.

Everything here, from the cookie to the contents of the shred, can be pretty easily manipulated, so we might need to make repair more secure (i.e. make sure we only get responses from people we requested things from).
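A minimal sketch of that cookie-gated repair flow, assuming hypothetical names (`RepairState`, `rand_cookie`); the step numbers refer to the list above:

```rust
use std::collections::HashMap;

type Slot = u64;
type Hash = [u8; 32];
type RepairCookie = u64; // nonce tagged onto the request

struct RepairState {
    // cookie -> (slot, blockhash) we asked for; used to filter responses
    outstanding: HashMap<RepairCookie, (Slot, Hash)>,
}

impl RepairState {
    /// Steps 1-3: a child's first shred carried its parent's blockhash, so
    /// request that specific version and remember the cookie.
    fn request_version(&mut self, slot: Slot, blockhash: Hash) -> RepairCookie {
        let cookie = rand_cookie();
        self.outstanding.insert(cookie, (slot, blockhash));
        // ... send a repair request for (slot, blockhash, cookie) ...
        cookie
    }

    /// Step 4: the window service accepts a shred only if its cookie matches
    /// an outstanding request for that slot's blockhash.
    fn accept(&self, cookie: RepairCookie, slot: Slot, blockhash: &Hash) -> bool {
        self.outstanding.get(&cookie) == Some(&(slot, *blockhash))
    }
}

fn rand_cookie() -> RepairCookie {
    // placeholder randomness for the sketch; a real implementation would use
    // a CSPRNG, per the comment above about making repair harder to spoof
    std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .unwrap()
        .subsec_nanos() as u64
}
```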

@garious commented Jan 10, 2020

@carllin, a few surface-level issues with this PR:

@carllin changed the title from "Duplicate block" to "Proposal to Handle Duplicate Blocks By Storing Multiple Versions of a Slot" on Jan 10, 2020
@carllin commented Jan 10, 2020

@garious thanks, will update!

Both this and #7652 are suggestions for ongoing design processes for consensus v2, so we might have to tolerate both of them being open for a while longer until the research/design phase for this is finalized, sorry!

stale bot commented Jan 17, 2020

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

The stale bot added the "stale" label ([bot only] added to stale content; results in auto-close after a week) on Jan 17, 2020.
stale bot commented Jan 25, 2020

This stale pull request has been automatically closed. Thank you for your contributions.

The stale bot closed this pull request on Jan 25, 2020.