allocator: avoid balancing around the mean #83493
Labels
A-kv-distribution
Relating to rebalancing and leasing.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
no-issue-activity
T-kv
KV Team
X-stale
Is your feature request related to a problem? Please describe.
CRDB's allocation strategy is broadly "for an equivalent set of stores, keep '# of batch requests' within X% of the mean". It's worth exploring whether that's a sensible goal for the system to have, especially if we consider allocation across different resource dimensions (#83490). What does it mean to keep CPU use nearly identical, especially for heterogeneous hardware or regions? I understand that this approach is motivated by wanting as much headroom as possible in order to absorb a burst of activity before throttling (and/or until allocation kicks in again, if integrated with resource throttling). But we ought to evaluate how effective it is at its stated goal, measured perhaps by how large a burst of activity it makes room to absorb pre-throttling in real clusters, with the cost being the number of snapshot bytes transferred to keep things in balance. We could compare this to an idealized “lazy allocator” that only moves leases/replicas around after experiencing throttling of some form.
Realistically we're always going to have some form of "keep things balanced" to create reasonable amounts of headroom, but in terms of priorities, it's a secondary concern to maximizing good resource use by avoiding throttling (#83490), and could perhaps benefit from a separate implementation entirely given its different goals.
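To make the contrast concrete, here is a minimal Go sketch of the two triggers discussed above: "rebalance when a store is more than X% above the mean" versus a lazy policy that only reacts once a store is throttled. This is not the allocator's actual code. The `store` type, its `capacity` field (the load at which throttling would start), and all function names and numbers are hypothetical, chosen only to illustrate the heterogeneous-hardware question.

```go
package main

import "fmt"

// store is a hypothetical stand-in for a CRDB store.
// load is some load signal (e.g. batch requests per second);
// capacity is an assumed load level at which throttling would begin.
type store struct {
	id       int
	load     float64
	capacity float64
}

// meanLoad returns the average load across stores (0 for an empty slice).
func meanLoad(stores []store) float64 {
	if len(stores) == 0 {
		return 0
	}
	var sum float64
	for _, s := range stores {
		sum += s.load
	}
	return sum / float64(len(stores))
}

// overfullByMean returns the IDs of stores whose load exceeds
// mean*(1+threshold). This models the "within X% of the mean" policy.
func overfullByMean(stores []store, threshold float64) []int {
	limit := meanLoad(stores) * (1 + threshold)
	var ids []int
	for _, s := range stores {
		if s.load > limit {
			ids = append(ids, s.id)
		}
	}
	return ids
}

// throttledStores returns the IDs of stores at or above their capacity.
// A lazy allocator would only move leases/replicas off these stores.
func throttledStores(stores []store) []int {
	var ids []int
	for _, s := range stores {
		if s.load >= s.capacity {
			ids = append(ids, s.id)
		}
	}
	return ids
}

// minHeadroom returns the smallest capacity-minus-load across stores,
// i.e. the largest uniform per-store burst that no store would throttle on.
func minHeadroom(stores []store) float64 {
	if len(stores) == 0 {
		return 0
	}
	min := stores[0].capacity - stores[0].load
	for _, s := range stores[1:] {
		if h := s.capacity - s.load; h < min {
			min = h
		}
	}
	return min
}

func main() {
	// Hypothetical heterogeneous cluster: store 1 is a large machine,
	// store 2 a small one.
	stores := []store{
		{id: 1, load: 900, capacity: 2000},
		{id: 2, load: 400, capacity: 500},
		{id: 3, load: 500, capacity: 1000},
	}
	fmt.Println("mean load:", meanLoad(stores))
	fmt.Println("overfull vs. mean (10%):", overfullByMean(stores, 0.10))
	fmt.Println("throttled (lazy trigger):", throttledStores(stores))
	fmt.Println("min headroom:", minHeadroom(stores))
}
```

With these made-up numbers, the mean-based trigger flags store 1 even though it has the most headroom, while store 2, the one closest to throttling, sits below the mean and is left alone. The lazy trigger moves nothing at all until some store actually reaches its capacity.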
Additional context
This is a (partially) speculative issue, one that we should engage with if/when we reconsider the signals we use to allocate.
Jira issue: CRDB-17100