kvserver: incorrect DistSender leaseholder processing for non-`VOTER`s #85060
Comments
cc @cockroachdb/replication |
There's an additional issue with the DistSender here: the transport keeps the initial range descriptor across all retries, regardless of whether we update the range descriptor in the meantime. This means that it is also vulnerable to changes in a replica's type during these retries. I'll keep this issue to cover both. |
@erikgrinaker I'm currently working on addressing exactly that with my patch for #75742 which is caused by what you describe. My current thinking is that we should recognize if the routing (range desc, lease) has been updated to a point where it's no longer compatible with the transport, in which case, we should bail early and just retry. |
Cool! I have a minor fix for the immediate problem here as well in #85140, which ignores the replica type when comparing replicas. What you're saying makes sense. I suppose the alternative would be to update the transport on-the-fly, but I don't know if we really gain much performance-wise over just bailing and retrying. What do you mean by compatible? I imagine it will bail if the given leaseholder isn't in the range descriptor. What about a partially matching range descriptor? Will it bail immediately, or keep trying the ones they have in common? Does it care about replica types? I suppose the risk if it was too sensitive would be that it could start flapping between two replicas with very different views of the range descriptor. Although the generation would prevent that, I guess. |
We recently started shipping back range descriptors inside of
I was imagining that we'd bail if the leaseholder wasn't part of the transport. Even though the routing can have stale lease information/range descriptor we never expect it to regress. So, if the routing has been updated such that the leaseholder is not part of the transport, I don't think we get much by exhausting the transport before retrying. We should be able to bail early and retry with a new transport constructed with a fresh(er) range descriptor.
Yup, things mostly work as expected with the lease sequences/range descriptor generation to make sure our cache never regresses. The only subtlety I found was around speculative leases -- whenever there's a speculative lease coming from a replica that has an older view of the range descriptor, we simply disregard it. I'll add you to the PR as well once it's out. |
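To make the bail-early idea above concrete, here is a minimal sketch. It is not the actual patch: `replicaID`, `routing`, and `shouldBailEarly` are hypothetical names invented for illustration, and the real DistSender routing and transport types carry much more state. The sketch assumes only that the transport holds a fixed replica list and that the routing may name a leaseholder.

```go
package main

import "fmt"

// replicaID is a hypothetical, pointer-free stand-in for a replica's identity.
// Because it has no pointer fields, == compares it by value.
type replicaID struct {
	NodeID  int32
	StoreID int32
}

// routing is a hypothetical snapshot of the cached lease information.
type routing struct {
	Leaseholder *replicaID // nil when no leaseholder is known
}

// shouldBailEarly reports whether the routing names a leaseholder that the
// transport (built from an older range descriptor) does not contain. In that
// case exhausting the transport is unlikely to help, so the caller could
// retry with a transport built from a fresher descriptor.
func shouldBailEarly(transport []replicaID, r routing) bool {
	if r.Leaseholder == nil {
		return false // nothing to compare against; keep trying the transport
	}
	for _, t := range transport {
		if t == *r.Leaseholder {
			return false // leaseholder is reachable through this transport
		}
	}
	return true
}

func main() {
	transport := []replicaID{{1, 1}, {2, 2}, {3, 3}}
	known := replicaID{2, 2}
	moved := replicaID{4, 4}

	fmt.Println(shouldBailEarly(transport, routing{Leaseholder: &known})) // false
	fmt.Println(shouldBailEarly(transport, routing{Leaseholder: &moved})) // true
	fmt.Println(shouldBailEarly(transport, routing{}))                    // false
}
```

The check compares identities only, so a change in replica type alone would not trigger a bail in this sketch.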
Sounds good, thanks! |
84420: sql/logictests: don't fail parent test when using retry r=knz a=stevendanna

testutils.SucceedsSoon calls Fatal() on the passed testing.T. Here, we were calling SucceedsSoon with the root T, which in the case of a subtest resulted in this error showing up in our logs:

testing.go:1169: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test

This moves to using SucceedsSoonError so that we can process errors using the same formatting that we use elsewhere.

Release note: None

85120: roachpb: make range/replica descriptor fields non-nullable r=pavelkalinnikov a=erikgrinaker

This patch makes all fields in range/replica descriptors non-nullable, fixing a long-standing TODO.

Additionally, this fixes a set of bugs where code would use regular comparison operators (e.g. `==`) to compare replica descriptors, which with nullable pointer fields would compare the memory locations rather than the values. One practical consequence of this was that the DistSender would fail to use a new leaseholder with a non-`VOTER` type (e.g. `VOTER_INCOMING`), instead continuing to try other replicas before backing off. However, there are further issues surrounding this bug and it will be addressed separately in a way that is more amenable to backporting.

The preparatory work for this was done in ea720e3.

Touches #85060. Touches #38308. Touches #38465.

Release note: None

85352: opt: revert data race fix r=mgartner a=mgartner

This commit reverts #37972. We no longer lazily build filter props and share them across multiple threads.

Release note: None

Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
`ReplicaDescriptor.Type` is a pointer rather than a scalar value:
cockroach/pkg/roachpb/metadata.proto
Line 151 in aee773a
We often use regular equality comparisons on replica descriptors, notably in the DistSender and gRPC transport:
cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go
Line 2209 in f77e393
cockroach/pkg/kv/kvclient/kvcoord/transport.go
Line 258 in 9ad06da
But recall that equality for pointers will check that the memory address is the same, rather than that the value is the same. Thus two identical replica descriptors with a type different from `VOTER` (the zero value) will be considered unequal if they are stored in different memory locations.

One immediate effect of this is that the DistSender will not prioritize new leaseholders with a `VOTER_INCOMING` type. In #74546, `VOTER_INCOMING` was allowed to receive the lease and thus included in the DistSender's replica list with the `RoutingPolicy_LEASEHOLDER` policy:
cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go
Lines 1932 to 1943 in f77e393
However, when it receives the NLHE it attempts to move the new leaseholder to the front of the transport list:
cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go
Lines 2201 to 2212 in f77e393
But this won't happen because the non-`nil` `Type` doesn't satisfy the equality check in `MoveToFront`:
cockroach/pkg/kv/kvclient/kvcoord/transport.go
Line 258 in 9ad06da
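The failure mode described above can be reproduced in isolation. The following is a simplified sketch, not the real code: `ReplicaDescriptor`, `ReplicaType`, `withType`, `moveToFront`, `structEqual`, and `idEqual` are stand-ins written for illustration, assuming only that the descriptor carries a pointer-typed `Type` field and that the move-to-front step compares descriptors with `==`. The `idEqual` variant illustrates the general idea of ignoring the replica type when comparing; it is not taken from any patch.

```go
package main

import "fmt"

// ReplicaType and ReplicaDescriptor are simplified stand-ins for illustration,
// not the real roachpb definitions.
type ReplicaType int32

const (
	VoterFull ReplicaType = iota
	VoterIncoming
)

type ReplicaDescriptor struct {
	NodeID    int32
	StoreID   int32
	ReplicaID int32
	Type      *ReplicaType // nil stands for the default type
}

// withType returns a copy of r whose Type points at a freshly allocated value,
// as happens when a descriptor is decoded or copied independently.
func withType(r ReplicaDescriptor, t ReplicaType) ReplicaDescriptor {
	r.Type = &t
	return r
}

// structEqual uses ==, which compares the Type pointers by address.
func structEqual(a, b ReplicaDescriptor) bool { return a == b }

// idEqual ignores Type and compares only the identifying fields.
func idEqual(a, b ReplicaDescriptor) bool {
	return a.NodeID == b.NodeID && a.StoreID == b.StoreID && a.ReplicaID == b.ReplicaID
}

// moveToFront moves the first replica matching target to index 0, shifting the
// earlier entries back by one, and reports whether a match was found.
func moveToFront(replicas []ReplicaDescriptor, target ReplicaDescriptor,
	equal func(a, b ReplicaDescriptor) bool) bool {
	for i, r := range replicas {
		if equal(r, target) {
			copy(replicas[1:i+1], replicas[:i])
			replicas[0] = r
			return true
		}
	}
	return false
}

func main() {
	replicas := []ReplicaDescriptor{
		{NodeID: 1, StoreID: 1, ReplicaID: 1},
		withType(ReplicaDescriptor{NodeID: 2, StoreID: 2, ReplicaID: 2}, VoterIncoming),
	}
	// Same field values as replicas[1], but Type lives at a different address.
	leaseholder := withType(ReplicaDescriptor{NodeID: 2, StoreID: 2, ReplicaID: 2}, VoterIncoming)

	fmt.Println(moveToFront(replicas, leaseholder, structEqual)) // false
	fmt.Println(moveToFront(replicas, leaseholder, idEqual))     // true
	fmt.Println(replicas[0].NodeID)                              // 2
}
```

With a `nil` `Type` on both sides, `structEqual` would succeed, which is why the problem only shows up for replicas with an explicitly set type.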
Jira issue: CRDB-18022