-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: garbage collect uninitialized replica eventually #47982
kvserver: garbage collect uninitialized replica eventually #47982
Conversation
❌ The GitHub CI (Cockroach) build has failed on 695bafb0. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
I was wrong, this is actually a problem. We never process uninitialized replicas and if we did we'd panic. The only way they end up getting removed is through errors coming back from raft. I'll put up a PR for backport to 20.1. |
695bafb
to
6e98861
Compare
❌ The GitHub CI (Cockroach) build has failed on 6e98861c. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
❌ The GitHub CI (Cockroach) build has failed on 6e98861c. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
6e98861
to
2a6b7f4
Compare
@irfansharif this feels like the sort of random KV PR you'd have interest in, want to review it? cc @andreimatei and @nvanbenschoten with whom I discussed this in different settings. |
❌ The GitHub CI (Cockroach) build has failed on 2a6b7f43. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
We recently saw an abandoned "leaked" uninitialized replica on a 19.2 node. The unfortunate truth is in that version we aren't sure of its replica ID and thus can't remove it. In 20.1 and later we can remove uninitialized replicas (see cockroachdb#44615). Much of the time, uninitialized replicas are removed either because the range is re-added to the store at a higher replica ID or because the node receives a ReplicaTooOldError which leads to the removal. The replicaGCQueue is the obvious tool to remove "leaked" uninitialized replicas. This commit has to do some work to allow that queue the opportunity to interact with uninitialized replicas. There are at least two cases where such replicas can certainly come to exist The first is when a split is rapidly followed by a removal of both the LHS and RHS but the LHS never processes the split because it receives a ReplicaTooOldError. The second is when a snapshot to add a learner fails but the learner never discovers that it has been removed. Release note (bug fix): Garbage collect abandoned, uninitialized replicas.
2a6b7f4
to
ef57060
Compare
❌ The GitHub CI (Cockroach) build has failed on ef57060a. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd ask to amend the release note to be a bit more descriptive by including the observable effects for this kind of leaking (confusing range reports and a gradual+limited memory leak?).
Reviewed 11 of 13 files at r1, 2 of 2 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/kv/kvserver/client_replica_test.go, line 3156 at r2 (raw file):
// This was historically problematic because we had no mechanism to detect and // remove these replicas, making them a slow, rare memory leak. They don't // really cause problems but they can show up in range reports which is
[nit] s/can show up/showed up, s/is/was
pkg/kv/kvserver/client_replica_test.go, line 3205 at r2 (raw file):
// Prepare the LHS to drop drop MsgApps so it does not apply the split. fmt.Println("got those voters")
Leftover fmt.Println
?
pkg/kv/kvserver/client_replica_test.go, line 3285 at r2 (raw file):
require.True(t, roachpb.IsRangeNotFoundError(err)) }) t.Run("(2) abandoned learner", func(t *testing.T) {
[nit] Add newline between the subtests.
pkg/kv/kvserver/client_replica_test.go, line 3316 at r2 (raw file):
}) replicateErrChan := make(chan error) go func() {
[nit] Maybe add just a bit of commentary in the subtest as above, makes it easier to scan through.
pkg/kv/kvserver/replica_gc_queue.go, line 52 at r2 (raw file):
// Uninitialized replicas exist while a replica is in the process of receiving // a snapshot. Uninitialized replicas can be orphaned in cases where the // snapshot fails. In some cases those uninitialized replicas will receive
The two cases outlined in the comment for TestUninitializedReplicaIsEventuallyCleanedUp
are well suited here, I think.
pkg/kv/kvserver/replica_gc_queue.go, line 258 at r2 (raw file):
// Return false immediately if the previous check was less than the check // interval in the past. Note that we don't do this if the replica is in // candidate state(suspect), in which case we want to be more aggressive.
Spacing before paren.
pkg/kv/kvserver/replica_init.go, line 99 at r2 (raw file):
// lastReplicaAddedTime is initialized to the time when the uninitialized // replica is constructed to detect when uninitialized replicas are
Incomplete comment.
pkg/kv/kvserver/replica_init.go, line 315 at r2 (raw file):
// uninitialized replica for the purpose of replica GC. r.mu.lastReplicaAdded = r.mu.replicaID r.mu.lastReplicaAddedTime = time.Time{}
Why time.Time{}
, and not timeutil.Now()
? Same question for the block below.
pkg/kv/kvserver/replica_proposal_quota.go, line 204 at r1 (raw file):
if rep.ReplicaID == r.mu.lastReplicaAdded && progress.Match >= commitIndex { if !r.isInitializedRLocked() { panic("here")
Improve the panic message?
pkg/kv/kvserver/store_remove_replica.go, line 37 at r1 (raw file):
// this method is a no-op and returns no error. This method is called from // the ReplicaGC queue and may, in rare case, race with synchronous removal // due to a response to a raft message.
It's unclear looking at this comment alone what we do (or don't) given this race condition.
Curious if we intend to move forward on this, in the context of https://github.com/cockroachlabs/support/issues/1340#issuecomment-984163724 |
There should be much less immediate need for this now that we don't unquiesce uninitialized replicas. That change will be backported to v21.2 in a week or two. I want it to bake a bit longer on master after it uncovered a bug in a different part of the system that requires a pair of additional backports. We'll be tracking the full cleanup of uninitialized replicas in #73424 going forward. |
We recently saw an abandoned "leaked" uninitialized replica on a 19.2 node.
The unfortunate truth is in that version we aren't sure of its replica ID and
thus can't remove it. In 20.1 and later we can remove uninitialized replicas
(see #44615). Much of the time, uninitialized replicas are removed either
because the range is re-added to the store at a higher replica ID or because
the node receives a ReplicaTooOldError which leads to the removal.
The replicaGCQueue is the obvious tool to remove "leaked" uninitialized
replicas. This commit has to do some work to allow that queue the opportunity
to interact with uninitialized replicas.
There are at least two cases where such replicas can certainly come to exist
The first is when a split is rapidly followed by a removal of both the LHS and
RHS but the LHS never processes the split because it receives a
ReplicaTooOldError. The second is when a snapshot to add a learner fails but
the learner never discovers that it has been removed.
Release note (bug fix): Garbage collect abandoned, uninitialized replicas rather than leaking them until the process restarts.