-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: can a Snapshot remove the Raft HardState permanently? #24854
Comments
Looks like we dodged a bullet? I agree that it looks like we're handling the empty HardState incorrectly and that Raft wouldn't ask us to apply such a snapshot. I'm pretty sure that when I wrote this comment I was only afraid that Raft might emit such a HardState, but not sure that it could. What do you suggest we do? Just reject snapshots with an empty HardState as they shouldn't exist? |
Yes, that's what I was thinking. Adding in logic that won't be exercised seems like a generally bad idea. |
👍 |
Fixes cockroachdb#24854. Before this change, we were incorrectly handling cases where a non-empty Raft Snapshot included an empty HardState. Luckily, this case doesn't seem to be possible so we got lucky before and didn't introduce a bug. Still, we were playing with fire. This change adds an assertion that the HardState is never empty in these cases so that we can eliminate the incorrect handling for this case and restrict our logic. Release note: None
Fixes cockroachdb#24854. Before this change, we were incorrectly handling cases where a non-empty Raft Snapshot included an empty HardState. Luckily, this case doesn't seem to be possible so we got lucky before and didn't introduce a bug. Still, we were playing with fire. This change adds an assertion that the HardState is never empty in these cases so that we can eliminate the incorrect handling for this case and restrict our logic. Release note: None
24879: storage: correctly handle empty HardState during Raft Snapshot r=nvanbenschoten a=nvanbenschoten Fixes #24854. Before this change, we were incorrectly handling cases where a non-empty Raft Snapshot included an empty HardState. Luckily, this case doesn't seem to be possible so we got lucky before and didn't introduce a bug. Still, we were playing with fire. This change adds an assertion that the HardState is never empty in these cases so that we can eliminate the incorrect handling for this case and restrict our logic. Release note: None Co-authored-by: Nathan VanBenschoten <[email protected]>
replica.applySnapshot
claims that the HardState parameter it takes:If this is truly the case then I'm having trouble convincing myself that we don't have a bug.
applySnapshot
first deletes all replicated and unreplicated keys. It then writes the HardState, but only if the HardState provided by Raft is not empty. How does this work if the HardState provided by Raft is empty? I can't see (and my own testing hasn't found) how a HardState would be written, so it looks like we'll end up with a range without a HardState. This seems really bad, and basic testing has shown that it can lead to this assertion firing (applied(%d) is out of range
, nottocommit(%d) is out of range
).The solution to this, if I'm not missing anything, would be to save the HardState before we clear it and write this old HardState again after the
clearRange
if Raft doesn't provide a new one. However, I'm not even convinced it's possible to raft to provide an empty HardState. This is because this check look like it should prevent a snapshot that doesn't also increase thecommit
index. I added in an assertion to test this and our entire test suite passes without it firing.I ran into this question while performing some nearby refactoring. I began digging into the git history before finding that there's been a long history around this so I'd like to get input from those who are more knowledgeable about it before trying to make any changes myself.
cc. @bdarnell (who I already spoke to) @tschottdorf @petermattis
The text was updated successfully, but these errors were encountered: