storage: always write a HardState #7598
Reviewed 4 of 4 files at r1.

storage/replica_raftstorage.go, line 515 [r1]:
s/corresponding// - there's no verification of correspondence here, AFAICT.

storage/replica_raftstorage.go, line 537 [r1]:
s/any//

storage/replica_raftstorage.go, line 609 [r1]:
Does this need to have intimate knowledge of preemptive snapshots? How about "cannot apply hard state to (some identification) which is not yet a member of any range"? Ditto below.

Reviewed 1 of 1 files at r2.

storage/store.go, line 846 [r2]:
Does this want to log the truncated state that was synthesized?

storage/store.go, line 858 [r2]:
Does this want to log the hard state that was synthesized?

storage/store.go, line 860 [r2]:
I think you forgot to write the hard state here.

Review status: all files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.

storage/replica_raftstorage.go, line 620 [r2]:
Did you mean to be calling

#7468 applies preemptive snapshots with a hard state. That was required to get it to work, though I wouldn't be surprised if it contained a bug. |
@tschottdorf: @cuongdo is going to pick up this PR. |
@cuongdo note that the code here doesn't actually write the HardState (in applySnapshot). I don't think this is a high priority change at the moment (except perhaps for the migration), and we'll have to see where we end up wrt maintaining Raft state in general. |
OK, let me know if I can help, e.g. testing the migration stuff with the On Wed, Jul 6, 2016 at 8:23 AM Tobias Schottdorf [email protected]
|
PTAL, I updated/cleaned up the commits. Light on testing, but I wanted @bdarnell to take a look first. The migration could be a separate PR, but it uses the previous work, so I left it in. That, too, is untested (a prime candidate for testing would be the registration cluster's backup). |
Review status: 0 of 6 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.

storage/replica_raftstorage.go, line 517 [r9]:
Is it guaranteed that raft always gives us a new HardState when we apply a snapshot? I guess it would always need to advance Commit, but this seems like a delicate thing to rely on. I don't think we should be as prescriptive here; we should just say that if HardState is non-empty, it will be written atomically with the snapshot. Synthesizing a new HardState (when needed for preemptive snapshots) should happen at a higher level.

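The "written atomically with the snapshot" idea can be sketched as follows. This is a minimal illustration, not the code under review: `HardState` here is a local stand-in for Raft's hard state, and `Batch`, `NewBatch`, and `applySnapshot` are hypothetical names backed by an in-memory map rather than the real storage engine.

```go
package main

import "fmt"

// HardState is a local stand-in for Raft's persistent hard state (assumption).
type HardState struct {
	Term, Vote, Commit uint64
}

// IsEmpty reports whether hs is the zero value, i.e. there is nothing to persist.
func (hs HardState) IsEmpty() bool { return hs == HardState{} }

// Batch is a hypothetical write batch: writes are staged and only become
// visible in the store when Commit is called.
type Batch struct {
	store  map[string]string
	staged map[string]string
}

func NewBatch(store map[string]string) *Batch {
	return &Batch{store: store, staged: map[string]string{}}
}

func (b *Batch) Put(key, value string) { b.staged[key] = value }

// Commit publishes all staged writes together.
func (b *Batch) Commit() {
	for k, v := range b.staged {
		b.store[k] = v
	}
}

// applySnapshot stages the snapshot and, if non-empty, the HardState in the
// same batch, so a crash before Commit leaves neither of them behind.
func applySnapshot(store map[string]string, snapData string, hs HardState) {
	b := NewBatch(store)
	b.Put("snapshot", snapData)
	if !hs.IsEmpty() {
		b.Put("hardstate", fmt.Sprintf("%+v", hs))
	}
	b.Commit()
}

func main() {
	store := map[string]string{}
	applySnapshot(store, "snap@10", HardState{Term: 5, Commit: 10})
	fmt.Println(store["snapshot"], store["hardstate"])
}
```

Synthesizing a HardState for a preemptive snapshot is deliberately left out of `applySnapshot` here, in line with the suggestion that it belongs at a higher level.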
8f6d617 to 44ef8f2
PTAL and if it looks good I'll add some testing. Review status: 0 of 6 files reviewed at latest revision, 8 unresolved discussions, some commit checks pending. storage/replica_raftstorage.go, line 515 [r1] (raw file):
|
As I mention below,

Review status: 0 of 6 files reviewed at latest revision, 13 unresolved discussions, all commit checks successful.

storage/replica_raftstorage.go, line 629 [r12]:
This error should include the hardstate via

storage/replica_raftstorage.go, line 639 [r12]:
Isn't there a 3rd condition here:

storage/replica_state.go, line 325 [r12]:

storage/replica_state.go, line 344 [r12]:
This method deserves an exhaustive test of all the possible error conditions and ways in which an existing hard state can be updated.

storage/store.go, line 862 [r12]:

Review status: 0 of 7 files reviewed at latest revision, 13 unresolved discussions, some commit checks pending. storage/replica_raftstorage.go, line 629 [r12] (raw file):
|
Reviewed 1 of 6 files at r10, 5 of 6 files at r20, 1 of 2 files at r26, 2 of 2 files at r27.

storage/replica_raftstorage.go, line 526 [r27]:
s/it changes/the commit index changes/

storage/replica_state.go, line 324 [r27]:
Comment that this must be called after s.TruncatedState has been updated.

storage/replica_state.go, line 328 [r27]:
s/term/term, vote,/

storage/replica_state.go, line 334 [r27]:
Link to #7619. Your comment there said that this PR fixes that issue, but I don't see where we actually reject the snapshot. I think this comment is going into too much detail. The salient points are that if there is an existing HardState, we must respect it, and we must not apply a snapshot that would move us backwards.

storage/replica_state.go, line 337 [r27]:
What does "compatible with the snapshot" mean?

storage/replica_state.go, line 357 [r27]:
Is this intended to be the solution to #7619? It's not enough: we can't apply a snapshot that would cause us to discard acknowledged but uncommitted log entries.

storage/replica_state.go, line 366 [r27]:
s/voted/voted in this term/

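The two salient points above (respect an existing HardState, never move backwards) can be sketched like this. It is a hedged illustration only: `synthesizeHardState` and its signature are hypothetical, `HardState` is a local stand-in type, and the sketch does not address the separate concern about discarding acknowledged but uncommitted log entries.

```go
package main

import "fmt"

// HardState is a local stand-in for Raft's persistent hard state (assumption).
type HardState struct {
	Term, Vote, Commit uint64
}

// synthesizeHardState (hypothetical) builds a HardState for a snapshot ending
// at snapIndex/snapTerm while honoring whatever the old HardState promised.
func synthesizeHardState(old HardState, snapIndex, snapTerm uint64) (HardState, error) {
	newHS := HardState{Term: snapTerm, Commit: snapIndex}
	if old.Commit > newHS.Commit {
		// Applying this snapshot would move the commit index backwards.
		return HardState{}, fmt.Errorf(
			"can't decrease HardState.Commit from %d to %d", old.Commit, newHS.Commit)
	}
	if old.Term > newHS.Term {
		// Never regress the term we have already seen.
		newHS.Term = old.Term
	}
	if old.Term == newHS.Term {
		// A vote already cast in this term must be preserved.
		newHS.Vote = old.Vote
	}
	return newHS, nil
}

func main() {
	hs, err := synthesizeHardState(HardState{Term: 7, Vote: 3, Commit: 4}, 10, 5)
	fmt.Println(hs, err)
	_, err = synthesizeHardState(HardState{Term: 5, Commit: 12}, 10, 5)
	fmt.Println(err)
}
```

A vote from an older term is dropped on purpose: once the term advances, the earlier vote no longer constrains anything.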
Review status: 4 of 9 files reviewed at latest revision, 16 unresolved discussions, some commit checks pending. storage/replica_raftstorage.go, line 526 [r27] (raw file):
|
Review status: 4 of 9 files reviewed at latest revision, 10 unresolved discussions, all commit checks successful.

storage/replica_raftstorage.go, line 648 [r28]:
I was thinking that these checks would belong before the call to

storage/replica_raftstorage.go, line 651 [r28]:
I can't come up with a way this could happen with a preemptive snapshot (in the scenario described here, the second snapshot would not have replica ID 0). etcd/raft doesn't appear to have a check like this. I think they assume that snapshots only transfer the state machine and don't touch the logs. They do, however, drop any snapshots from older terms. You say below that we need to allow snapshots from old terms. Why?

The testing of the Review status: 4 of 9 files reviewed at latest revision, 10 unresolved discussions, all commit checks successful. Comments from Reviewable |
Reviewed 1 of 6 files at r10, 8 of 8 files at r19, 3 of 6 files at r20, 2 of 2 files at r27, 5 of 5 files at r28, 1 of 1 files at r29, 2 of 2 files at r30, 2 of 2 files at r31.

storage/migration_test.go, line 58 [r30]:
comparable directly

storage/replica_raftstorage.go, line 583 [r28]:
nit: this will read

storage/replica_raftstorage.go, line 642 [r28]:
throughout: taking a pointer shouldn't be necessary for %+v

storage/replica_raftstorage.go, line 656 [r28]:
what

storage/replica_state_test.go, line 73 [r28]:
no need for the nil check

storage/replica_state_test.go, line 84 [r28]:
can't these be compared directly?

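For readers unfamiliar with the two Go points raised above, here is a small sketch. The code the comments refer to is not shown in this thread, so this assumes a struct whose fields are all comparable; `HardState`, `describe`, and `describePtr` are illustrative names only.

```go
package main

import "fmt"

// HardState is a local stand-in with only comparable (uint64) fields.
type HardState struct {
	Term, Vote, Commit uint64
}

// describe formats a struct value with %+v; no pointer is needed.
func describe(hs HardState) string { return fmt.Sprintf("%+v", hs) }

// describePtr formats a pointer; fmt follows it and adds a leading '&'.
func describePtr(hs *HardState) string { return fmt.Sprintf("%+v", hs) }

func main() {
	a := HardState{Term: 5, Commit: 10}
	b := HardState{Term: 5, Commit: 10}

	// Structs with comparable fields can be compared with == directly,
	// without reflect.DeepEqual or field-by-field checks.
	fmt.Println(a == b)

	fmt.Println(describe(a))
	fmt.Println(describePtr(&a))
}
```

Direct comparison stops compiling as soon as the struct gains a slice, map, or function field, in which case `reflect.DeepEqual` or a hand-written comparison is needed.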
Which Review status: all files reviewed at latest revision, 10 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 651 [r28] (raw file):
|
Review status: all files reviewed at latest revision, 10 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 651 [r28] (raw file):
|
I was referring to Review status: all files reviewed at latest revision, 8 unresolved discussions, all commit checks successful. Comments from Reviewable |
Yeah, that's a good point. I filed #7737 about that. @bdarnell waiting for your comments (and/or LGTM). Review status: 0 of 9 files reviewed at latest revision, 8 unresolved discussions, some commit checks pending. storage/migration_test.go, line 58 [r30] (raw file):
|
Reviewed 6 of 7 files at r33, 1 of 2 files at r35, 2 of 2 files at r36. storage/replica_raftstorage.go, line 642 [r28] (raw file):
|
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 642 [r28] (raw file):
|
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 642 [r28] (raw file):
|
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 651 [r28] (raw file):
|
As discovered in cockroachdb#6991 (comment), it's possible to apply a Raft snapshot without writing a corresponding HardState, since we write the snapshot in its own batch first and only then write a HardState. If that happens, the server is going to panic on restart: it will have a nontrivial first index, but a committed index of zero (from the empty HardState).

This change prevents us from applying a snapshot when there is no HardState supplied along with it, except when applying a preemptive snapshot (in which case we synthesize a HardState).

Ensure that the new HardState and Raft log do not break promises made by an existing one during preemptive snapshot application. Fixes cockroachdb#7619.

storage: prevent loss of uncommitted log entries
See cockroachdb#6991. It's possible that the HardState is missing after a snapshot was applied (so there is a TruncatedState). In this case, synthesize a HardState (simply setting everything that was in the snapshot to committed).

Having lost the original HardState can theoretically mean that the replica was further ahead or had voted, and so there's no guarantee that this will be correct. But it will be correct in the majority of cases, and some state *has* to be recovered.

To illustrate this in the scenario in cockroachdb#6991: There, we (presumably) have applied an empty snapshot (no real data, but a Raft log which starts and ends at index ten as designated by its TruncatedState). We don't have a HardState, so Raft will crash because its Commit index zero isn't in line with the fact that our Raft log starts only at index ten.

The migration sees that there is a TruncatedState, but no HardState. It will synthesize a HardState with Commit:10 (and the corresponding Term from the TruncatedState, which is five).
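The migration rule described in this commit message can be sketched as below. This is an assumption-laden illustration rather than the actual migration code: `HardState` and `TruncatedState` are local stand-in types, `migrateHardState` is a hypothetical name, and a nil pointer models "not present on disk".

```go
package main

import "fmt"

// HardState and TruncatedState are local stand-ins (assumptions).
type HardState struct {
	Term, Vote, Commit uint64
}

type TruncatedState struct {
	Index, Term uint64
}

// migrateHardState (hypothetical) returns the HardState to keep and whether
// it had to be synthesized from the TruncatedState.
func migrateHardState(hs *HardState, ts *TruncatedState) (HardState, bool) {
	if hs != nil {
		// An existing HardState is authoritative; leave it alone.
		return *hs, false
	}
	if ts == nil {
		// No snapshot was applied, so there is nothing to recover.
		return HardState{}, false
	}
	// Treat everything covered by the snapshot as committed. The original
	// vote, if any, is lost and cannot be recovered.
	return HardState{Term: ts.Term, Commit: ts.Index}, true
}

func main() {
	// The scenario from the commit message: log truncated at index 10, term 5.
	hs, synthesized := migrateHardState(nil, &TruncatedState{Index: 10, Term: 5})
	fmt.Printf("%+v synthesized=%t\n", hs, synthesized)
}
```

As the message itself notes, the synthesized state is a best effort: if the lost HardState was further ahead or recorded a vote, this recovery cannot reproduce it.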
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 651 [r28] (raw file):
|
Review status: 0 of 11 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. Comments from Reviewable |