-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: simplify replicasByKey handling #74384
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
no-issue-activity
X-stale
Comments
tbg
added
the
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
label
Jan 3, 2022
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Jan 3, 2022
In cockroachdb#73721 we saw the following assertion fire: > kv/kvserver/replica_raftstorage.go:932 [n4,s4,r46/3:{-}] 1 unable to > remove placeholder: corrupted replicasByKey map: <nil> and [...] This is because `MergeRange` removes from the Store's in-memory map the right-hand side Replica before extending the left-hand side, leaving a gap for a snapshot to sneak in. A similar problem exists when a snapshot widens the existing range (i.e. the snapshot reflects the results of a merge). This commit closes both gaps. I verified the fix by calling the newly-introduced (but promptly disabled, see comment within) `(*Store).maybeAssertNoHole` from both `(*Store).MergeRange` and `applySnapshot`. The bug reproduced via this assertion, before this fix, in `TestStoreRangeMergeTimestampCache` and `TestChangeReplicasLeaveAtomicRacesWithMerge`, covering both the snapshot and merge trigger cases. I'm not too happy to merge this without the same kind of active test coverage, but since this will require a backport, it's better not to rattle the cage too much. I filed cockroachdb#74384 to clean this up fully. Release note (bug fix): A bug was fixed that, in very rare cases, could result in a node terminating with fatal error "unable to remove placeholder: corrupted replicasByKey map". To avoid potential data corruption, users affected by this crash should not restart the node, but instead decommission it in absentia and have it rejoin the cluster under a new NodeID.
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Jan 4, 2022
In cockroachdb#73721 we saw the following assertion fire: > kv/kvserver/replica_raftstorage.go:932 [n4,s4,r46/3:{-}] 1 unable to > remove placeholder: corrupted replicasByKey map: <nil> and [...] This is because `MergeRange` removes from the Store's in-memory map the right-hand side Replica before extending the left-hand side, leaving a gap for a snapshot to sneak in. A similar problem exists when a snapshot widens the existing range (i.e. the snapshot reflects the results of a merge). This commit closes both gaps. I verified the fix by calling the newly-introduced (but promptly disabled, see comment within) `(*Store).maybeAssertNoHole` from both `(*Store).MergeRange` and `applySnapshot`. The bug reproduced via this assertion, before this fix, in `TestStoreRangeMergeTimestampCache` and `TestChangeReplicasLeaveAtomicRacesWithMerge`, covering both the snapshot and merge trigger cases. I'm not too happy to merge this without the same kind of active test coverage, but since this will require a backport, it's better not to rattle the cage too much. I filed cockroachdb#74384 to clean this up fully. Release note (bug fix): A bug was fixed that, in very rare cases, could result in a node terminating with fatal error "unable to remove placeholder: corrupted replicasByKey map". To avoid potential data corruption, users affected by this crash should not restart the node, but instead decommission it in absentia and have it rejoin the cluster under a new NodeID.
craig bot
pushed a commit
that referenced
this issue
Jan 4, 2022
73734: kvserver: avoid unprotected keyspace during MergeRange r=erikgrinaker a=tbg In #73721 we saw the following assertion fire: > kv/kvserver/replica_raftstorage.go:932 [n4,s4,r46/3:{-}] 1 unable to > remove placeholder: corrupted replicasByKey map: <nil> and [...] This is because `MergeRange` removes from the Store's in-memory map the right-hand side Replica before extending the left-hand side, leaving a gap for a snapshot to sneak in. A similar problem exists when a snapshot widens the existing range (i.e. the snapshot reflects the results of a merge). This commit closes both gaps. I verified the fix by calling the newly-introduced (but promptly disabled, see comment within) `(*Store).maybeAssertNoHole` from both `(*Store).MergeRange` and `applySnapshot`. The bug reproduced via this assertion, before this fix, in `TestStoreRangeMergeTimestampCache` and `TestChangeReplicasLeaveAtomicRacesWithMerge`, covering both the snapshot and merge trigger cases. I'm not too happy to merge this without the same kind of active test coverage, but since this will require a backport, it's better not to rattle the cage too much. I filed #74384 to clean this up fully. Release note (bug fix): A bug was fixed that, in very rare cases, could result in a node terminating with fatal error "unable to remove placeholder: corrupted replicasByKey map". To avoid potential data corruption, users affected by this crash should not restart the node, but instead decommission it in absentia and have it rejoin the cluster under a new NodeID. Co-authored-by: Tobias Grieger <[email protected]>
gustasva
pushed a commit
to gustasva/cockroach
that referenced
this issue
Jan 4, 2022
In cockroachdb#73721 we saw the following assertion fire: > kv/kvserver/replica_raftstorage.go:932 [n4,s4,r46/3:{-}] 1 unable to > remove placeholder: corrupted replicasByKey map: <nil> and [...] This is because `MergeRange` removes from the Store's in-memory map the right-hand side Replica before extending the left-hand side, leaving a gap for a snapshot to sneak in. A similar problem exists when a snapshot widens the existing range (i.e. the snapshot reflects the results of a merge). This commit closes both gaps. I verified the fix by calling the newly-introduced (but promptly disabled, see comment within) `(*Store).maybeAssertNoHole` from both `(*Store).MergeRange` and `applySnapshot`. The bug reproduced via this assertion, before this fix, in `TestStoreRangeMergeTimestampCache` and `TestChangeReplicasLeaveAtomicRacesWithMerge`, covering both the snapshot and merge trigger cases. I'm not too happy to merge this without the same kind of active test coverage, but since this will require a backport, it's better not to rattle the cage too much. I filed cockroachdb#74384 to clean this up fully. Release note (bug fix): A bug was fixed that, in very rare cases, could result in a node terminating with fatal error "unable to remove placeholder: corrupted replicasByKey map". To avoid potential data corruption, users affected by this crash should not restart the node, but instead decommission it in absentia and have it rejoin the cluster under a new NodeID.
We have marked this issue as stale because it has been inactive for |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
no-issue-activity
X-stale
Is your feature request related to a problem? Please describe.
In #73721 we found a bug in our replica handling that could lead to data corruption. The basic problem is that the Store maintains a mapping of keyspace to Replica/Placeholder but code interacts very directly with it and it's easy to mess it up.
In addition to this,
(*Replica).Desc()
provides the key bounds of the replica object within the btree, which is deadlock prone (and also means that(*Replica).setDescLocked
has an affect on the btree. This is all fairly cryptic and it would likely be better not to require a mutex when looking up a Replica from the tree, and to make any bounds adjustments explicit.Describe the solution you'd like
Mediate all mutations to the Store's mapping (including handling of uninitialized Replicas) through an API that enforces that Replica insertion or bounds changes go through a ReplicaPlaceholder. In particular,
r.Desc()
).Describe alternatives you've considered
Additional context
Jira issue: CRDB-12057
The text was updated successfully, but these errors were encountered: