Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: can a Snapshot remove the Raft HardState permanently? #24854

Closed
nvanbenschoten opened this issue Apr 16, 2018 · 3 comments
Closed

storage: can a Snapshot remove the Raft HardState permanently? #24854

nvanbenschoten opened this issue Apr 16, 2018 · 3 comments
Assignees

Comments

@nvanbenschoten
Copy link
Member

replica.applySnapshot claims that the HardState parameter it takes:

may be empty, as Raft may apply some snapshots which don't require an update to the HardState

If this is truly the case then I'm having trouble convincing myself that we don't have a bug. applySnapshot first deletes all replicated and unreplicated keys. It then writes the HardState, but only if the HardState provided by Raft is not empty. How does this work if the HardState provided by Raft is empty? I can't see (and my own testing hasn't found) how a HardState would be written, so it looks like we'll end up with a range without a HardState. This seems really bad, and basic testing has shown that it can lead to this assertion firing (applied(%d) is out of range, not tocommit(%d) is out of range).

The solution to this, if I'm not missing anything, would be to save the HardState before we clear it and write this old HardState again after the clearRange if Raft doesn't provide a new one. However, I'm not even convinced it's possible to raft to provide an empty HardState. This is because this check look like it should prevent a snapshot that doesn't also increase the commit index. I added in an assertion to test this and our entire test suite passes without it firing.

I ran into this question while performing some nearby refactoring. I began digging into the git history before finding that there's been a long history around this so I'd like to get input from those who are more knowledgeable about it before trying to make any changes myself.

cc. @bdarnell (who I already spoke to) @tschottdorf @petermattis

@nvanbenschoten nvanbenschoten self-assigned this Apr 16, 2018
@tbg
Copy link
Member

tbg commented Apr 16, 2018

Looks like we dodged a bullet? I agree that it looks like we're handling the empty HardState incorrectly and that Raft wouldn't ask us to apply such a snapshot. I'm pretty sure that when I wrote this comment I was only afraid that Raft might emit such a HardState, but not sure that it could.

What do you suggest we do? Just reject snapshots with an empty HardState as they shouldn't exist?

@nvanbenschoten
Copy link
Member Author

What do you suggest we do? Just reject snapshots with an empty HardState as they shouldn't exist?

Yes, that's what I was thinking. Adding in logic that won't be exercised seems like a generally bad idea.

@tbg
Copy link
Member

tbg commented Apr 17, 2018

👍

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Apr 17, 2018
Fixes cockroachdb#24854.

Before this change, we were incorrectly handling cases where a non-empty
Raft Snapshot included an empty HardState. Luckily, this case doesn't
seem to be possible so we got lucky before and didn't introduce a bug.
Still, we were playing with fire. This change adds an assertion that the
HardState is never empty in these cases so that we can eliminate the
incorrect handling for this case and restrict our logic.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Apr 17, 2018
Fixes cockroachdb#24854.

Before this change, we were incorrectly handling cases where a non-empty
Raft Snapshot included an empty HardState. Luckily, this case doesn't
seem to be possible so we got lucky before and didn't introduce a bug.
Still, we were playing with fire. This change adds an assertion that the
HardState is never empty in these cases so that we can eliminate the
incorrect handling for this case and restrict our logic.

Release note: None
craig bot pushed a commit that referenced this issue Apr 17, 2018
24879: storage: correctly handle empty HardState during Raft Snapshot r=nvanbenschoten a=nvanbenschoten

Fixes #24854.

Before this change, we were incorrectly handling cases where a non-empty
Raft Snapshot included an empty HardState. Luckily, this case doesn't
seem to be possible so we got lucky before and didn't introduce a bug.
Still, we were playing with fire. This change adds an assertion that the
HardState is never empty in these cases so that we can eliminate the
incorrect handling for this case and restrict our logic.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in #24879 Apr 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants