admission: prioritization thresholds for workloads #84496

Closed
lunevalex opened this issue Jul 15, 2022 · 2 comments
Labels
A-admission-control C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team

Comments

@lunevalex
Collaborator

lunevalex commented Jul 15, 2022

Is your feature request related to a problem? Please describe.
Today Admission Control has a single threshold for throttling requests, regardless of priority. This means that if you kick off a low priority job that pushes the store over the current threshold (i.e. the number of files/sub-levels in L0), all requests may get throttled. The queue will prioritize high priority work first, but depending on the distribution of work it is still possible for high priority work to get throttled.

Describe the solution you'd like
We should consider introducing different thresholds for throttling high and low priority work. For example, low priority work could start being throttled once the store hits 80% of the threshold, which should help prevent the store from reaching 100%, the point at which foreground application traffic (i.e. high priority work) begins being throttled. The number 80% is an arbitrary suggestion and is not backed by experimentation. We should first ascertain the efficacy of this proposal through an experiment and use that to determine the appropriate thresholds that would result in predictable latencies for foreground traffic.
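
To make the proposal concrete, here is a minimal sketch of per-priority throttling thresholds. Everything in it is hypothetical and for illustration only: the `Priority` type, the `throttleStartPercent` table, and `shouldThrottle` are not existing admission control APIs, the 80% value is the unvalidated placeholder from above, and the overload threshold of 20 sub-levels is borrowed from the figure mentioned later in this thread.

```go
package main

import "fmt"

// Priority is a simplified, hypothetical two-level priority. The orderings
// supported by the real admission queue are richer than this.
type Priority int

const (
	LowPri Priority = iota
	HighPri
)

// throttleStartPercent is the percentage of the overload threshold at which
// each priority starts being throttled. 80 is the placeholder from the
// proposal and has not been validated by experiment.
var throttleStartPercent = map[Priority]int{
	LowPri:  80,
	HighPri: 100,
}

// shouldThrottle reports whether work at priority p should be throttled when
// the store has subLevels L0 sub-levels and the overload threshold is
// overloadThreshold sub-levels. Integer arithmetic avoids float rounding.
func shouldThrottle(p Priority, subLevels, overloadThreshold int) bool {
	return subLevels*100 >= throttleStartPercent[p]*overloadThreshold
}

func main() {
	const overloadThreshold = 20 // assumed value, in L0 sub-levels
	for _, subLevels := range []int{10, 16, 20} {
		fmt.Printf("sub-levels=%d throttle low=%t high=%t\n",
			subLevels,
			shouldThrottle(LowPri, subLevels, overloadThreshold),
			shouldThrottle(HighPri, subLevels, overloadThreshold))
	}
}
```

With these assumed numbers, low priority work starts being throttled at 16 sub-levels while high priority work is untouched until 20, which is the headroom the proposal is after.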

Additional context
Related to #82114

Jira issue: CRDB-17692

@lunevalex lunevalex added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-admission-control labels Jul 15, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Jul 15, 2022
@sumeerbhola
Collaborator

If the resource exerting backpressure (resulting in queuing in admission.WorkQueue) is "healthy", one should not need different thresholds for different kinds of work. The queuing should handle everything: the low priority work will simply queue behind the higher priority work and only get the spare resources (I'm using priority as a simplifying concept here; the actual orderings supported are more sophisticated). This separation of concerns is the point of the current design; otherwise one has to start plumbing resource knowledge into the queueing.
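
The queuing behavior described here can be sketched with a plain priority heap. This is only an illustration of "low priority queues behind high priority and gets the spare resources"; it is not the admission.WorkQueue implementation, and the `work` struct, `workHeap`, `admit`, and `demo` names are all hypothetical.

```go
package main

import (
	"container/heap"
	"fmt"
)

// work is a hypothetical queued request.
type work struct {
	name     string
	priority int // larger means more important (simplifying assumption)
	seq      int // arrival order, used for FIFO within a priority
}

// workHeap orders work by priority (highest first), then by arrival order.
type workHeap []work

func (h workHeap) Len() int { return len(h) }
func (h workHeap) Less(i, j int) bool {
	if h[i].priority != h[j].priority {
		return h[i].priority > h[j].priority
	}
	return h[i].seq < h[j].seq
}
func (h workHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *workHeap) Push(x interface{}) { *h = append(*h, x.(work)) }
func (h *workHeap) Pop() interface{} {
	old := *h
	n := len(old)
	w := old[n-1]
	*h = old[:n-1]
	return w
}

// admit grants one token per request, highest priority first, and returns
// the names of the admitted requests in admission order.
func admit(h *workHeap, tokens int) []string {
	var admitted []string
	for tokens > 0 && h.Len() > 0 {
		admitted = append(admitted, heap.Pop(h).(work).name)
		tokens--
	}
	return admitted
}

// demo enqueues two low priority and two high priority requests,
// interleaved, and admits three of them.
func demo() []string {
	h := &workHeap{}
	arrivals := []work{
		{name: "backup-1", priority: 0},
		{name: "sql-1", priority: 10},
		{name: "backup-2", priority: 0},
		{name: "sql-2", priority: 10},
	}
	for i, w := range arrivals {
		w.seq = i
		heap.Push(h, w)
	}
	return admit(h, 3)
}

func main() {
	fmt.Println(demo()) // prints [sql-1 sql-2 backup-1]
}
```

Both high priority requests are admitted before any low priority one, and the low priority work only receives the one token left over, without the queue knowing anything about the underlying resource.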

There are 2 caveats to this in the admission control LSM shape case:

  • the token bucket is refilled every 250ms, so when the low priority work consumes the spare resources (during the instant that there was no high priority work waiting), the bucket is empty and won't refill until the next refill tick -- this means subsequently arriving high priority work will need to wait and can see latency up to 250ms. This is discussed in admission: graceful degradation #82114 (comment)
  • the notion of "healthy" is 20 sublevels. At 20 sublevels there will be some slowdown (how much depends on how much of the wall time of the read is spent in the merging iterator heap), so high priority work will also suffer a latency increase and CPU consumption will increase.

The latter would be the reason to introduce a different threshold.

@sumeerbhola
Collaborator

We have elastic CPU, which throttles early for low priority work. For the store, we also now start throttling at 2 sub-levels and stabilize at 4 sub-levels. Marking this as done.
