
Replay stage can consider a slot to be completed before the blocktree slot is full #6498

Closed · jstarry opened this issue Oct 22, 2019 · 11 comments · Fixed by #6833

jstarry (Member) commented Oct 22, 2019

Problem

Replay stage does not take into account whether blocktree has received the signal that a slot is full. This means that replay stage may consider a block to be valid before blocktree has received all of the entries for the slot.

After #6262, a block with trailing entries after the final tick is considered invalid. So two nodes that receive the slot's entries in different batches may reach different conclusions about whether the block is valid.

Proposed Solution

Do not consider a bank to be completed in the replay stage until the blocktree slot is full
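
A minimal sketch of this rule, assuming hypothetical Bank and SlotMeta shapes (the real solana types differ in detail); the point is just that replay's tick count and blocktree's fullness signal must both agree before the bank is considered completed:

```rust
// Hypothetical, simplified types -- not the actual solana API.
struct Bank {
    tick_height: u64,     // ticks replayed so far
    max_tick_height: u64, // final tick height for the slot
}

struct SlotMeta {
    last_index: Option<u64>, // index of the shred flagged "last", if received
    consumed: u64,           // consecutive shreds received starting at index 0
}

impl SlotMeta {
    // Blocktree's notion of "full": the last shred is known and every
    // shred up to and including it has arrived.
    fn is_full(&self) -> bool {
        matches!(self.last_index, Some(last) if self.consumed == last + 1)
    }
}

// Proposed replay-stage check: both signals must agree.
fn bank_completed(bank: &Bank, meta: &SlotMeta) -> bool {
    bank.tick_height == bank.max_tick_height && meta.is_full()
}

fn main() {
    let bank = Bank { tick_height: 64, max_tick_height: 64 };

    // All ticks replayed, but the "last" shred hasn't arrived yet:
    let meta = SlotMeta { last_index: None, consumed: 10 };
    assert!(!bank_completed(&bank, &meta));

    // Slot is full as well -- now the bank may be considered completed:
    let meta = SlotMeta { last_index: Some(9), consumed: 10 };
    assert!(bank_completed(&bank, &meta));
}
```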

jstarry (Member, Author) commented Oct 22, 2019

From #6262 (comment):

replay_stage (via last tick) and blocktree (via last shred) should both think the slot is complete in order for the block to be accepted

carllin (Contributor) commented Oct 22, 2019

@jstarry, what happens if a malicious leader sends two different versions of this slot to different sets of validators, one with trailing entries, one without?

A variant of this: #6362.

This is an interesting variant because it introduces an attack vector where there are two different versions of the same slot, but no two versions of any particular shred index (one transmission is just a subset of the other).

carllin (Contributor) commented Oct 22, 2019

Maybe the right solution here is to allow the validator to play the slot (not mark it as dead), but submit the trailing entries as proof for slashing. That way you don't get divergent behavior.

aeyakovenko (Member) commented:

@carllin some validators may vote on this block, since they may not receive the trailing entries. What is the signal that the slot is full, vs. the final tick? I think the better solution would be to make the final tick the signal, and ignore any shreds after it. We can add a slashing rule for generating spurious shreds.

carllin (Contributor) commented Oct 22, 2019

I think @jstarry's PR will allow replay stage to keep track of when the last tick is received.

The signal that the slot is full is when blocktree receives a shred i marked as last and we've received all shreds between 0 and i.

The final tick is only known after deshredding; in @jstarry's PR, replay_stage knows it once the number of replayed ticks equals bank.max_tick_height().

So basically:

  1. replay_stage will need to count the number of ticks
  2. Once tick_height == max_tick_height(), replay_stage tells blocktree that this slot is complete at shred index "i"
  3. If there are other shreds past shred index "i" in blocktree, then that is a slashing condition, so create a slashing proof. This slashing proof would be pretty large, because it would require sending all the shreds needed to reconstruct all the ticks in the slot.
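
A toy sketch of steps 2 and 3; names like set_completed_index and excess_shreds are illustrative, not real Blocktree methods:

```rust
// Toy stand-in for blocktree storage for a single slot.
struct Blocktree {
    received: Vec<u64>,           // shred indexes seen for this slot
    completed_index: Option<u64>, // set by replay_stage (step 2)
}

impl Blocktree {
    // Step 2: replay_stage reports the shred index at which
    // tick_height reached max_tick_height().
    fn set_completed_index(&mut self, i: u64) {
        self.completed_index = Some(i);
    }

    // Step 3: any shred past the completing index is evidence for a
    // slashing proof. (As noted above, a full proof would be large,
    // since it needs enough shreds to reconstruct the slot's ticks.)
    fn excess_shreds(&self) -> Vec<u64> {
        match self.completed_index {
            Some(i) => self.received.iter().copied().filter(|&x| x > i).collect(),
            None => Vec::new(),
        }
    }
}

fn main() {
    let mut blocktree = Blocktree {
        received: vec![0, 1, 2, 3, 4, 5],
        completed_index: None,
    };
    // Step 1 (counting ticks) happens in replay_stage; assume it
    // determined that the slot completes at shred index 3.
    blocktree.set_completed_index(3);
    assert_eq!(blocktree.excess_shreds(), vec![4, 5]); // slashing evidence
}
```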

aeyakovenko (Member) commented:

@carllin for #3, that would be a PIA. We can make the shred signature sign the slot + shred index + last tick hash in the shred that contains it. So the fraud proof would be showing that signature + a signature that signs a slot + shred index greater than the last-tick one. The data for the last tick must fit in the last shred.
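
A sketch of what verifying such a fraud proof could look like, assuming a hypothetical per-shred header that the leader signs over (slot, index, last-tick marker); the real shred wire format and signature scheme differ, and signature verification is stubbed out here:

```rust
// Hypothetical signed header -- illustration only, not the wire format.
struct SignedShredHeader {
    slot: u64,
    index: u64,
    contains_last_tick: bool,
    // Stand-in for verifying the leader's signature over the fields above.
    leader_sig_valid: bool,
}

// Two headers signed by the same leader for the same slot are a fraud
// proof if one is marked as carrying the last tick and the other has a
// strictly greater shred index.
fn is_fraud_proof(last_tick: &SignedShredHeader, trailing: &SignedShredHeader) -> bool {
    last_tick.leader_sig_valid
        && trailing.leader_sig_valid
        && last_tick.slot == trailing.slot
        && last_tick.contains_last_tick
        && trailing.index > last_tick.index
}

fn main() {
    let last = SignedShredHeader { slot: 7, index: 30, contains_last_tick: true, leader_sig_valid: true };
    let extra = SignedShredHeader { slot: 7, index: 31, contains_last_tick: false, leader_sig_valid: true };
    assert!(is_fraud_proof(&last, &extra));
}
```

This is what makes the proof small: instead of shipping every shred needed to reconstruct the slot's ticks, the accuser only needs two conflicting signed headers.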

carllin (Contributor) commented Oct 22, 2019

@aeyakovenko

These situations below also seem covered, just double checking:

  1. If the leader sends too few ticks, i.e. prematurely marks a tick as final, then their slot will never be completed/frozen.

  2. If the leader sends a group of entries whose total num_hashes > hashes_per_tick but doesn't send a tick, I guess the fork is then marked as dead. And unless they send another version of the same slot (which is slashable), this slot will die.

What's left:

What happens if a leader sends more ticks than they're supposed to, and signs the last of those? I think ideally we would also like to slash if the ticks_per_slot-indexed tick in a slot isn't marked as "final".

aeyakovenko (Member) commented:

@carllin want to write up the proposal? I think this PR can move forward without a slashing implementation.

carllin (Contributor) commented Oct 23, 2019

@aeyakovenko, sure, but I'm missing one last step for this proposal: what happens if a leader sends more ticks than they're supposed to, and only signs the last of those? This boils down to figuring out how to provide a fraud proof if the ticks_per_slot-indexed tick in a slot isn't marked as "final".

I think the main question about this PR is whether slots with excess shreds after the ticks_per_slot-indexed tick should be marked as dead. From the discussion here, it seems like the answer is no, which means @jstarry should revert that section of the change before checking the rest in.

aeyakovenko (Member) commented:

@carllin the block is not valid without the last tick. One way to do this is to add a fin tx that is signed by the leader and is added to the hash prior to the last tick.

carllin (Contributor) commented Oct 23, 2019

@aeyakovenko, ah yeah, so if the leader doesn't mark the shred containing the last tick with the "last" flag, we can say that slot is dead, even if they sent all the ticks. That way they can't send excess ticks and still have a valid block. That should cover it. Ok, I'll make the proposal! @jstarry, this proposal should cover which cases should/should not warrant a block being marked as dead in your PR.
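
A hedged sketch of the resulting dead-slot rule, with illustrative field names (not the actual replay_stage code): the shred in which the ticks_per_slot-indexed tick is deshredded must be the same shred that carries the "last" flag.

```rust
// Illustrative bookkeeping for one replayed slot -- hypothetical shape.
struct ReplayedSlot {
    ticks_seen: u64,
    ticks_per_slot: u64,
    final_tick_shred: Option<u64>, // shred index where tick #ticks_per_slot appeared
    last_flag_shred: Option<u64>,  // shred index carrying the "last" flag
}

fn should_mark_dead(slot: &ReplayedSlot) -> bool {
    match (slot.final_tick_shred, slot.last_flag_shred) {
        // Excess ticks, or the "last" flag on a different shred than the
        // one holding the final tick -> dead.
        (Some(t), Some(l)) => slot.ticks_seen != slot.ticks_per_slot || t != l,
        // Final tick replayed but never flagged "last" -> dead (in this
        // simplified sketch; a real implementation would wait for more shreds).
        (Some(_), None) => true,
        // Final tick not seen yet: incomplete, but not dead.
        _ => false,
    }
}

fn main() {
    let ok = ReplayedSlot { ticks_seen: 64, ticks_per_slot: 64, final_tick_shred: Some(12), last_flag_shred: Some(12) };
    assert!(!should_mark_dead(&ok));

    // Leader appended extra ticks and flagged a later shred as "last":
    let bad = ReplayedSlot { ticks_seen: 66, ticks_per_slot: 64, final_tick_shred: Some(12), last_flag_shred: Some(15) };
    assert!(should_mark_dead(&bad));
}
```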
