-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
admission: fix wait queue histograms #110060
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @irfansharif)
pkg/util/admission/work_queue.go
line 639 at r1 (raw file):
if q.granter.tryGet(info.RequestedCount) { q.admitMu.Unlock() q.metrics.incAdmitted(info.Priority)
what about this case?
We previously did not record anything into {IO,CPU} wait queue histograms when work either bypassed admission control (because of the nature of the work, or when certain admission queues were disabled through cluster settings) or used the fast-path (i.e. didn't add themselves to the tenant heaps). This meant that our histogram percentiles were not accurate. NB: This problem didn't exist at the flow control level where work may not be subject to flow control depending on the mode selected ('apply_to_elastic', 'apply_to_all'). We'd still record a measured wait duration (~0ms), so we had accurate waiting-for-flow-tokens histograms. Part of cockroachdb#82743. Release note: None
6b82df9
to
4da2627
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)
pkg/util/admission/work_queue.go
line 639 at r1 (raw file):
Previously, sumeerbhola wrote…
what about this case?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @bananabrick)
bors r+ |
Build succeeded: |
We previously did not record anything into {IO,CPU} wait queue histograms when work either bypassed admission control (because of the nature of the work, or when certain admission queues were disabled through cluster settings). This meant that our histogram percentiles were not accurate.
This problem didn't exist at the flow control level where work may not be subject to flow control depending on the mode selected ('apply_to_elastic', 'apply_to_all'). We'd still record a measured wait duration (~0ms), so we had accurate waiting-for-flow-tokens histograms.
Part of #82743.
Release note: None