kvserver: use a separate lease queue for preference enforcement #116703
Labels
A-kv-distribution
Relating to rebalancing and leasing.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
P-3
Issues/test failures with no fix SLA
T-kv
KV Team
Is your feature request related to a problem? Please describe.
Lease preferences allow a user to control the placement of leases in preferred localities for ranges. Lease preference repair is currently carried out by the replicate queue as an implicit action under `AllocatorConsiderRebalance`, when no rebalance options are found.
The replicate queue processes at most one range at a time and blocks on the completion of its replication actions, such as up- and down-replication, replacing replicas on dead or decommissioning nodes, and rebalancing.
When there are higher-priority replication actions (listed above) as well as lease transfers required to satisfy a preference, the lease preference transfers will occur strictly after the other replication actions. This is unfortunate, as a less preferred or violating lease can lead to unacceptable latency if the application relies on preference enforcement.
Describe the solution you'd like
Introduce a new store replica queue, which is responsible for transferring leases to satisfy the applied lease preferences. The new queue should follow roughly the same structure as other simpler store replica queues [1]. The steps required to complete this issue are:
`replica_lease_queue`, where the `shouldQueue()` method returns true if there is a transfer target with lease transfer options similar to those used in the rebalance code path [2]. The `process()` method should similarly transfer the lease if a replica is returned from the transfer target method.
`ShouldPlanChange` [4].
Describe alternatives you've considered
An alternative that would allow lease preferences to be prioritized above rebalancing, but still behind up-replication etc., would be to introduce a priority for lease preference repair actions, e.g., `AllocatorLeasePreferenceTransfer`. This would additionally require separating replica rebalancing from replica constraint satisfaction (#90110).
Additional context
The queue would also perform lease rebalancing as a side effect of calling the transfer lease target method. The implementer should consider lightweight synchronization between the replicate queue and the replica lease queue, so that concurrent lease transfers and replication changes are minimized.
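As a rough illustration of the queue shape proposed in this issue, the sketch below shows a `shouldQueue()`/`process()` pair that both consult a single transfer-target function. Everything here is a simplified assumption for illustration only: `StoreID`, `Replica`, `leaseQueue`, `transferTargetFn`, and `preferStore` are hypothetical names and do not correspond to the actual kvserver or allocator APIs.

```go
package main

import "fmt"

// StoreID is a hypothetical, simplified store identifier.
type StoreID int

// Replica is a hypothetical stand-in for a leaseholder replica of a range.
type Replica struct {
	RangeID     int
	Leaseholder StoreID
	Voters      []StoreID
}

// transferTargetFn is a hypothetical stand-in for the allocator's
// transfer-target logic. It returns the store that should receive the
// lease, or ok=false when the current leaseholder should keep it.
type transferTargetFn func(r *Replica) (target StoreID, ok bool)

// leaseQueue is a hypothetical lease queue that only transfers leases.
type leaseQueue struct {
	transferTarget transferTargetFn
}

// shouldQueue reports whether a transfer target exists for the replica.
func (q *leaseQueue) shouldQueue(r *Replica) bool {
	_, ok := q.transferTarget(r)
	return ok
}

// process transfers the lease if the transfer-target function returns a
// target, and reports whether a transfer happened.
func (q *leaseQueue) process(r *Replica) bool {
	target, ok := q.transferTarget(r)
	if !ok {
		return false
	}
	r.Leaseholder = target // A real queue would issue a lease transfer here.
	return true
}

// preferStore builds a toy transfer-target function that prefers a single
// store, standing in for lease preference evaluation.
func preferStore(preferred StoreID) transferTargetFn {
	return func(r *Replica) (StoreID, bool) {
		if r.Leaseholder == preferred {
			return 0, false // Preference already satisfied.
		}
		for _, s := range r.Voters {
			if s == preferred {
				return s, true
			}
		}
		return 0, false // No voter on the preferred store.
	}
}

func main() {
	q := &leaseQueue{transferTarget: preferStore(2)}
	r := &Replica{RangeID: 1, Leaseholder: 1, Voters: []StoreID{1, 2, 3}}
	if q.shouldQueue(r) {
		q.process(r)
	}
	fmt.Println("leaseholder:", r.Leaseholder, "needs queueing:", q.shouldQueue(r))
}
```

Because both methods go through the same target function, any lease rebalancing decision that function makes would be carried out by this queue as well, which is the behavior noted above.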
See also #116081
Related: #106102
Jira issue: CRDB-34730
Footnotes
[1] https://github.com/cockroachdb/cockroach/blob/6fae6a751337e6e3c1f6432f7c31b15d0ba820ac/pkg/kv/kvserver/consistency_queue.go#L162-L162
[2] https://github.com/cockroachdb/cockroach/blob/6fae6a751337e6e3c1f6432f7c31b15d0ba820ac/pkg/kv/kvserver/allocator/plan/replicate.go#L895-L899
[3] https://github.com/cockroachdb/cockroach/blob/6fae6a751337e6e3c1f6432f7c31b15d0ba820ac/pkg/kv/kvserver/allocator/plan/replicate.go#L883-L913
[4] https://github.com/cockroachdb/cockroach/blob/6fae6a751337e6e3c1f6432f7c31b15d0ba820ac/pkg/kv/kvserver/allocator/plan/replicate.go#L206-L235
[5] https://github.com/cockroachdb/cockroach/blob/6fae6a751337e6e3c1f6432f7c31b15d0ba820ac/pkg/kv/kvserver/asim/queue/queue.go#L31-L31