-
Notifications
You must be signed in to change notification settings - Fork 39.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More seat metrics for APF #105873
More seat metrics for APF #105873
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MikeSpreitzer - can you split the commits into separate PRs?
I like the second commit, but i would like to discuss the first a bit more.
name string // varies in tests of fighting controllers | ||
clock clock.PassiveClock | ||
queueSetFactory fq.QueueSetFactory | ||
reqsObsPairGenerator metrics.TimedObserverPairGenerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given we're touching this code - I would like to ask questions that I never asked before.
Why do we need this "TimedObservers" (and related) concepts?
Why the P&F code can't just use metrics as everything else is using by simply hard-coding metrics and exposing them as everything else does?
I think this code would get much easier to follow (not just this PR, but the whole P&F code), if we would simplify it. And I've never really understood why it has to be so complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat related question:
Maybe I'm missing something - but why do we even try to implement watermark metrics ourselves in the code?
Conceptually - if we want to get highest/lowest value over time - it's not something that should be done via metrics themselves. It should be the task for metrics engine/processor etc.
i.e. we report the metric that shows current value, and we can easily compute the higest/lowest based on that (all metrics agents expose queries for that).
If I'm not missing something above - I would really like us to get rid of those metrics and not do the job of metrics engine as part of P&F...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watermarking has been there since before I got involved. I found that the max-in-flight filter was already watermarking the number in flight. That makes sense, because this is a gauge that can vary much more quickly than we can expect scrapes to happen.
Oddly, the watermarking that the max-in-flight filter has been doing is itself something that surely will not be scraped frequently enough. The watermarking in that filter is only over the last second, and we certainly can expect the apiserver metrics scraping period to be longer than one second!
We could try to pick a better watermarking period, but that would be a difficult exercise in satisfying everybody when we do not even know who everybody is. What the watermark histogram does is take observations of watermarks, so that no scraping period misses watermark observations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully agree, but I don't want to block this PR on it either (sorry for sitting so long on it so far).
It's not introducing anything new - and we should discuss how to improve it separately.
I opened #106302 to discuss that further.
/assign @tkashem |
a4c9075
to
154bf6a
Compare
The force-push to 154bf6a is a rebase onto master. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: MikeSpreitzer, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds some more metrics to API Priority and Fairness regarding seat usage.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
This PR is two commits, one adding the seat_count metrics and one adding the watch_count metrics, so that they can be viewed separately.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/sig api-machinery
/sig instrumentation
/cc @wojtek-t
/cc @deads2k
/cc @tkashem
/cc @lavalamp