-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: sep-raft-log: add crash-testing datadriven test harness #93247
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Comments
tbg
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv-replication
labels
Dec 8, 2022
cc @cockroachdb/replication |
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Dec 19, 2022
Mechanical using GoLand. Next steps are starting to add a datadriven testing framework so that functionality can be exercised when it moves or is added here, i.e. cockroachdb#93247. Epic: none Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Dec 22, 2022
Mechanical using GoLand. Next steps are starting to add a datadriven testing framework so that functionality can be exercised when it moves or is added here, i.e. cockroachdb#93247. Epic: none Release note: None
craig bot
pushed a commit
that referenced
this issue
Dec 22, 2022
94131: kvserver: move LoadAndReconcileReplicas to kvstorage r=pavelkalinnikov a=tbg Mechanical using GoLand. Next steps are starting to add a datadriven testing framework so that functionality can be exercised when it moves or is added here, i.e. #93247. Epic: none Release note: None Co-authored-by: Tobias Grieger <[email protected]>
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Jan 23, 2023
See cockroachdb#93310. This is also the beginning of cockroachdb#93247. Epic: CRDB-220 Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Feb 2, 2023
See cockroachdb#93310. This is also the beginning of cockroachdb#93247. Epic: CRDB-220 Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Feb 3, 2023
See cockroachdb#93310. This is also the beginning of cockroachdb#93247. Epic: CRDB-220 Release note: None
craig bot
pushed a commit
that referenced
this issue
Feb 6, 2023
95513: kvstorage: complete RaftReplicaID migration r=pavelkalinnikov a=tbg As of v22.1[^1], we always write the RaftReplicaID when creating a Replica or updating a snapshot. However, since this is persisted state that could've originated in older versions and not updated yet, we couldn't rely on a persisted ReplicaID yet. This commit adds code to the `(*Store).Start` boot sequence that - persists a RaftReplicaID for all initialized replicas (using the ReplicaID from the descriptor) - deletes all uninitialized replicas lacking RaftReplicaID (since we don't know their ReplicaID at this point). The second item in theory violates Raft invariants, as uninitialized Replicas are allowed to vote (though they then cannot accept log entries). So in theory: - an uninitialized replica casts a decisive vote for a leader - it restarts - code in this commit removes the uninited replica (and its vote) - delayed MsgVote from another leader arrives - it casts another vote for the same term for a dueling leader - now there are two leaders in the same term. The above in addition presupposes that the two leaders cannot communicate with each other. Also, even if that is the case, since the two leaders cannot append to the uninitialized replica (it doesn't accept entries), we also need additional voters to return at the exact right time. Since an uninitialized replica without RaftReplicaID in is necessarily at least one release old, this is exceedingly unlikely and we will live with this theoretical risk. This PR also adds a first stab at a datadriven test harness for `kvstorage` which is likely to be of use for #93247. [^1]: #75761 Epic: [CRDB-220](https://cockroachlabs.atlassian.net/browse/CRDB-220) Release note: None Co-authored-by: Tobias Grieger <[email protected]>
We have a basic datadriven framework at https://github.com/cockroachdb/cockroach/blob/672e8b1e628ef0a4c920cf42355c44cbc7a625b6/pkg/kv/kvserver/kvstorage/datadriven_test.go that should, with extra work, be able to accommodate our testing needs. The testing it is used for is limited. I'm adding some bullets to the comment above about missing areas. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
We need to exhaustively test the invariants for the engine states in separate raft logs.
We'll add assertions but should also explore these using datadriven tests.
Some of the code in #88606 might be helpful.
TODO:
I'll add that our testing of the snapshot SSTs is currently weak. We rely on spinning up a real cluster and then intercept the SSTs, which is not very maintainable. We should complement this with
kvstorage
-level tests that create a snapshot and examine in detail their contents and the engine each one targets.Epic: CRDB-220
Jira issue: CRDB-22240
The text was updated successfully, but these errors were encountered: