storage: eagerly GC replicas on raft transport errors #8172
Review status: 0 of 1 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.
storage/replica.go, line 345 [r1] (raw file):
I've verified that this works, but it is fugly. Also, need to figure out how to test.
Comments from Reviewable
Test?
@tamird Yes, a test is needed. Any advice on which existing
15931d6 to 2c84fac
Ok, I made this less fugly. PTAL. Still need to figure out how to test.
LGTM to the "meat" which is currently here.
@@ -145,6 +149,10 @@ func updatesTimestampCache(r roachpb.Request) bool {
	return updatesTimestampCacheMethods[m]
}

const replicaTooOld = "replica too old, discarding message"

var errReplicaTooOld = grpc.Errorf(codes.Aborted, replicaTooOld)
Maybe just create those when you need them. You're not using them as a sentinel, so the fact that we use both the error and the error message is a bit confusing.
Reviewed 4 of 4 files at r2.
build/check-style.sh, line 111 [r2] (raw file):
Why do we need to do this for
storage/replica.go, line 152 [r2] (raw file):
Maybe say "sender replica" instead of just "replica", since I would otherwise expect error messages to refer to the recipient replica's state.
storage/replica.go, line 351 [r2] (raw file):
You could define
2c84fac to b9d4c95
Added a test which didn't need additional knobs. Stress tested it and everything seems stable.
Review status: 2 of 5 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.
build/check-style.sh, line 111 [r2] (raw file):
LGTM
Reviewed 3 of 3 files at r3.
storage/client_raft_test.go, line 1603 [r3] (raw file):
s/ndoe/node/
storage/client_raft_test.go, line 1619 [r3] (raw file):
It would be nice if we could be more explicit that the new error check is the reason that the replica was removed.
util/grpcutil/grpc_util.go, line 52 [r3] (raw file):
This isn't gRPC-specific, so this probably isn't the right place for it. Or you could switch to the
b9d4c95 to fdecc0b
Review status: 3 of 5 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending.
storage/client_raft_test.go, line 1603 [r3] (raw file):
Reviewed 2 of 4 files at r2, 1 of 3 files at r3, 2 of 2 files at r4.
build/check-style.sh, line 111 [r2] (raw file):
fdecc0b to ee50e31
Reviewed 2 of 2 files at r5.
storage/client_raft_test.go, line 1628 [r5] (raw file):
err is guaranteed to be nil here.
ee50e31 to 0de0aa1
Review status: 4 of 5 files reviewed at latest revision, 14 unresolved discussions, some commit checks pending.
storage/client_raft_test.go, line 1619 [r3] (raw file):
Reviewed 1 of 1 files at r6.
storage/replica.go, line 354 [r4] (raw file):
0de0aa1 to bda9196
Reviewed 1 of 1 files at r6.
storage/client_raft_test.go, line 1619 [r3] (raw file):
bda9196 to 8d2004e
Review status: 2 of 7 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.
storage/client_raft_test.go, line 1618 [r4] (raw file):
8d2004e to 98d673d
Reviewed 1 of 1 files at r7, 3 of 4 files at r8.
storage/replica.go, line 354 [r4] (raw file):
Reviewed 1 of 4 files at r8.
98d673d to cefda3b
When the Raft transport stream returns an error we can use that error as a signal that the replica may need to be GC'd. Suggested in cockroachdb#8130. Fixes cockroachdb#5789.
When the Raft transport stream returns an error we can use that error as
a signal that the replica may need to be GC'd.
Suggested in #8130.