allocator: avoid balancing around the mean #83493

Closed
irfansharif opened this issue Jun 28, 2022 · 1 comment

Labels
A-kv-distribution Relating to rebalancing and leasing. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) no-issue-activity T-kv KV Team X-stale

Comments

irfansharif commented Jun 28, 2022

Is your feature request related to a problem? Please describe.

CRDB's allocation strategy is broadly "for an equivalent set of stores, keep '# of batch requests' within X% of the mean". It's worth exploring whether that's a sensible goal for the system to have, especially if we consider allocation across different resource dimensions (#83490). What does it mean to keep CPU use nearly identical, especially for heterogeneous hardware or regions? I understand that this approach is motivated by wanting as much headroom as possible in order to absorb a burst of activity before throttling (and/or until allocation kicks in again if integrated with resource-throttling), but we ought to evaluate how effective it is at its stated goal, measured perhaps by how large a burst of activity it makes room to absorb pre-throttling in real clusters, with the cost being the number of snapshot bytes transferred to keep things in balance. We could compare this to an idealized “lazy allocator” that only moves leases/replicas around after experiencing throttling of some form.
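
To make the contrast concrete, here is a rough sketch of the two policies. This is not the actual allocator code: the `store` type, its `qps` and `capacity` fields, and the functions `overfullByMean` and `throttledStores` are all hypothetical names made up for illustration, and "throttling" is approximated as load exceeding a per-store capacity.

```go
package main

import "fmt"

// store is a hypothetical, simplified view of a store. Neither field
// corresponds to a real CockroachDB struct; both are assumptions.
type store struct {
	id       int
	qps      float64 // observed batch requests per second
	capacity float64 // assumed load at which the store would start throttling
}

// meanQPS returns the average load across the given stores.
func meanQPS(stores []store) float64 {
	if len(stores) == 0 {
		return 0
	}
	var sum float64
	for _, s := range stores {
		sum += s.qps
	}
	return sum / float64(len(stores))
}

// overfullByMean models "keep load within X% of the mean": it returns the
// IDs of stores whose load exceeds mean*(1+threshold), regardless of how
// much capacity each store actually has.
func overfullByMean(stores []store, threshold float64) []int {
	limit := meanQPS(stores) * (1 + threshold)
	var ids []int
	for _, s := range stores {
		if s.qps > limit {
			ids = append(ids, s.id)
		}
	}
	return ids
}

// throttledStores models the idealized "lazy allocator": it only returns
// stores whose load exceeds their own capacity.
func throttledStores(stores []store) []int {
	var ids []int
	for _, s := range stores {
		if s.qps > s.capacity {
			ids = append(ids, s.id)
		}
	}
	return ids
}

func main() {
	// Heterogeneous hardware: store 1 is a much larger machine.
	stores := []store{
		{id: 1, qps: 900, capacity: 4000},
		{id: 2, qps: 300, capacity: 1000},
		{id: 3, qps: 300, capacity: 1000},
	}
	fmt.Println("mean-based rebalance candidates:", overfullByMean(stores, 0.10))
	fmt.Println("lazy rebalance candidates:", throttledStores(stores))
}
```

With these made-up numbers, the mean-based check wants to move load off store 1 even though it is the least utilized relative to its capacity, while the lazy check moves nothing. The snapshot bytes spent in the first case are the cost being asked about above.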

Realistically we're always going to have some form of "keep things balanced" to create reasonable amounts of headroom, but in terms of priorities, it's a secondary concern to maximizing good resource use by avoiding throttling (#83490), and could perhaps benefit from a separate implementation entirely given its different goals.

Additional context

This is a (partially) speculative issue, one that we should engage with if/when we reconsider the signals we use to allocate.

Jira issue: CRDB-17100

@irfansharif irfansharif added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-distribution Relating to rebalancing and leasing. labels Jun 28, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Jun 28, 2022

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@github-actions github-actions bot closed this as not planned Feb 26, 2024
@github-project-automation github-project-automation bot moved this to Closed in KV Aug 28, 2024