-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-22.1: kvcoord: optionally update range cache with descriptor returned in NLHE #98798
release-22.1: kvcoord: optionally update range cache with descriptor returned in NLHE #98798
Conversation
Previously, this test was constructing a NLHE with a structure we wouldn't really expect. We instead move to running two versions of this test -- one with a speculative lease and another where the lease returned has an older lease sequence than what is cached. Release note: None Release justification: Non-production code change
Previously, we were logging that we had evicted a range descriptor from the cache even though we may not have -- we only evict if the cache doesn't have a newer descriptor than the one we're trying to evict. We now only log if we indeed evicted the range descriptor; otherwise, we do not. Release note: None Release justification: Low risk, high benefit change
We recently started shipping the range descriptors back on NotLeaseHolderError. Prior to this patch, we only ever used their generation to elide certain lease updates if they originated from a replica that had an older (stale) view of this world. This patch goes a step further in making use of the returned range descriptor -- we now update the client's range cache if the returned range descriptor is newer than what existed on the client. We do this by picking the freshest range descriptor and lease, independently, from the range descriptor/lease what exist in the client's range cache and what was returned in the NLHE. Though unlikely, independently choosing the freshest lease/range descriptor can lead to cases where the lease is not compatible with the range descriptor. If we detect this to be the case, we empty out the lease and simply cache the freshest range descriptor. Previously, we always accepted speculative leases to be more recent than anything already in the cache. Now, we discard speculative leases if they're coming from a replica that has an older view of the world. Morally, these semantics are the right way to conceptualize cockroachdb#72772, given that hazard could only ever exist for speculative leases. cockroachdb#82802 also falls out as a special case of this conceptualization, and this patch adds a regression test for the scenario where an uninitialized replica returns a NLHE with a speculative lease pointing to a replica that isn't part of the range. We also get rid of optimizations where we would try to identify "stale" range descriptors when updating lease information. These no longer make sense given we don't just update the lease, we also update the range descriptor. These optimizations also didn't account for all possible cases, such as when the leaseholder was present on the range descriptor, but as a LEARNER. See cockroachdb#75742 for more details about how this hazard manifests. With this new approach we also address cockroachdb#75742. Instead of invalidating the routing when a descriptor is identified as stale and using that as a proxy to bail early when routing to replicas, we instead switch to a more direct approach -- if at any point we detect the leaseholder isn't included on the transport, we exit the routing logic early, and retry at a layer above. Given our new range cache update semantics using descriptors on NLHE errors, it's quite likely this retry circumvents a range descriptor lookup from meta2 -- instead, we'd expect this to simply amount to trying again with a newly constructed transport. The rationale is that even though the client may have an arbitrarily stale view of the range, we never expect it to regress. Thus, if the leaseholder was updated on the client and it doesn't exist on the transport, it must be the case that the transport was constructed using a stale range descriptor and stale leaseholder information. As such, there is likely not much value in trying to exhaust the transport (and potentially getting caught in backoff loops). We're much better served by bailing early and trying again with a fresh transport. Fixes cockroachdb#82802 Fixes cockroachdb#75742 Release note: None Release justification: Fixes for high priority bugs in existing functionality.
Previously, if a request was routed to an uninitialized replica, the returned NLHE would contain a speculative lease pointing to the creator of the uninitialized replica. The prior commit makes it such that lease/leaseholder information in NLHEs is only considered if accompanied by a non-stale range descriptor. As uninitialized replicas always return empty range descriptors, which are considered stale, the speculative lease they returned would never be acted upon. Given this, we omit speculative leases in NLHEs returned by uninitialized replicas. The idea being the client will simply move on to the next replica in its transport which will very likely have more useful leaseholder information for the client. Release note: None Release justification: Low risk, high benefit change
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
|
We were logging `lease holder unknown` when the deprecated field was not populated. Release note: None
v22.1 binaries assume that the leaseholder is unknown when logging NLHE errors if the (Deprecated)LeaseHolder field is unset -- regardless of if the Lease is set or not. We broke this logging in 0402f47 (for mixed version clusters) when we stopped shipping back leaseholder information (in favour of only shipping lease information) on NLHEs. This patch fixes this by populating the (Deprecated)LeaseHolder field when constructing NLHEs. Release note: None
Ping, this possibly showed up here: https://github.com/cockroachlabs/support/issues/2239. Are you planning to merge this backport? |
I think I'm going to close this backport, given 22.1 is nearing its end. This isn't a trivial backport, so landing it doesn't feel worth the risk. |
Backport 4/4 commits from #85503.
/cc @cockroachdb/release
We recently started shipping the range descriptors back on
NotLeaseHolderError. Prior to this patch, we only ever used their
generation to elide certain lease updates if they originated from a
replica that had an older (stale) view of this world.
This patch goes a step further in making use of the returned range
descriptor -- we now update the client's range cache if the returned
range descriptor is newer than what existed on the client. We do this
by picking the freshest range descriptor and lease, independently, from
the range descriptor/lease between what exists in the client's range
cache and what was returned in the NLHE. Though unlikely, independently
choosing the freshest lease/range descriptor can lead to cases where the
lease is not compatible with the range descriptor. If we detect this to
be the case, we empty out the lease and simply cache the freshest range
descriptor.
Previously, we always accepted speculative leases to be more recent than
anything already in the cache. Now, we discard speculative leases if
they're coming from a replica that has an older view of the world.
Morally, these semantics are the right way to conceptualize #72772,
given that hazard could only ever exist for speculative leases. #82802
also falls out as a special case of this conceptualization, so closing
that issue simply entails adding a regression test.
We also get rid of optimizations where we would try to identify "stale"
range descriptors when updating lease information. These no longer make
sense given we don't just update the lease, we also update the range
descriptor. These optimizations also didn't account for all possible
cases, such as when the leaseholder was present on the range
descriptor, but as a LEARNER. See #75742 for more details about how
this hazard manifests.
With this new approach we also address #82802. Instead of invalidating
the routing when a descriptor is identified as stale and using that as
a proxy to bail early when routing to replicas, we instead switch to a
more direct approach -- if at any point we detect the leaseholder isn't
included on the transport, we exit the routing logic early, and retry
at a layer above. Given our new range cache update semantics using
descriptors on NLHE errors, it's quite likely this retry circumvents a
range descriptor lookup -- instead, we'd expect this to simply amount to
trying again with a newly constructed transport. The rationale is that
even though the client may have an arbitrarily stale view of the range,
we never expect it to regress. Thus, if the leaseholder was updated on
the client and it doesn't exist on the transport, it must be the case
that the transport was constructed using a stale range descriptor and
stale leaseholder information. As such, there is likely not much value
in trying to exhaust the transport (and potentially getting caught in
backoff loops). We're much better served by bailing early and trying
again with a fresh transport.
Fixes #82802
Fixes #75742
Release note: None
Release justification: Fixes for high priority bugs in existing
functionality.