KEP-4786: direct cgroup status collection on Node #4792
linxiulei commented Aug 19, 2024
- One-line PR description: Direct cgroup stats collection on Node
- Issue link: Direct cgroup stats collection #4786
- Other comments:
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: linxiulei The full list of commands accepted by this bot can be found here.
/retitle KEP-4786: direct cgroup status collection on Node
After enabling feature `PodAndContainerStatsFromCRI`, only
[summary API][summary-api] invokes cAdvisor for stats of:

* Root filesystem
Image filesystem also.
Pls correct me if I'm wrong. I think imagefs is taken care of by CRI, no?
It is not.
Looking at that code, we are using cadvisor for Availability and Capacity.
Yes, but that cadvisor call will be replaced so that it no longer uses cadvisor.
This KEP aims to eliminate the need to run cAdvisor with enablement of KEP-2371
for better performance and simplicity.
So, IIUC, KEP-2371 is about implementing the missing bits from the CRI/CRI-API perspective and this KEP would actually remove the use of cadvisor wherever it's necessary?
Yes, but it does not entirely remove the use of cadvisor. More specifically, this KEP will remove the background task that runs cadvisor routines, but kubelet will still call cadvisor code on demand.
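To make the distinction between a background task and on-demand calls concrete, here is a minimal Go sketch. It is not kubelet or cAdvisor code: `collectFunc`, `runBackground`, and `onDemandStats` are hypothetical names invented for illustration, and the collector simply returns a fixed number.

```go
package main

import (
	"fmt"
	"time"
)

// collectFunc stands in for any stats routine.
// Hypothetical type; this is not a cAdvisor API.
type collectFunc func() (uint64, error)

// runBackground models the pattern being removed: collection runs on a
// timer whether or not anyone reads the result.
func runBackground(collect collectFunc, interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			_, _ = collect() // work happens even with no consumer
		case <-stop:
			return
		}
	}
}

// onDemandStats models the pattern being kept: collection runs only when
// a caller asks for a value.
type onDemandStats struct {
	collect collectFunc
	calls   int // number of times collect was invoked
}

// Get invokes the collector once per request and returns its result.
func (s *onDemandStats) Get() (uint64, error) {
	s.calls++
	return s.collect()
}

func main() {
	s := &onDemandStats{collect: func() (uint64, error) { return 42, nil }}
	fmt.Println("collections before any request:", s.calls)
	v, err := s.Get()
	fmt.Println("value:", v, "err:", err, "collections:", s.calls)
}
```

In the on-demand form the cost of collection is paid only by callers that actually need the stats, which is the performance argument made in this thread. A real implementation would likely also need to decide whether to cache results between requests; that choice is left out of this sketch.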
### Goals

* Improve performance in Kubelet without running cAdvisor.
* Do not introduce breaking changes to the Summary API or eviction function.
Will this KEP cover both cgroup v1 and v2?
it should only cover v2, as v1 is feature frozen
It does support both cgroup versions but it's not intentional. The implementation would take advantage of libraries that support collecting stats for both versions so it's agnostic to cgroup versions.
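As a rough illustration of what being agnostic to cgroup versions can look like, the Go sketch below reads a cgroup's current memory usage under either hierarchy. It reads the kernel's files directly instead of using the libraries mentioned above, so it is not the KEP's implementation. The helper names and the `kubepods` cgroup path are assumptions made for the example. The file names are the kernel's: `cgroup.controllers` exists at the root of a cgroup v2 mount, v2 reports usage in `memory.current`, and v1 reports it in `memory.usage_in_bytes` under the `memory` controller.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// isCgroupV2 reports whether root looks like a cgroup v2 (unified) mount.
// The cgroup.controllers file is present at the root of a v2 hierarchy.
func isCgroupV2(root string) bool {
	_, err := os.Stat(filepath.Join(root, "cgroup.controllers"))
	return err == nil
}

// memoryUsageFile returns the file that holds current memory usage for
// cgroupPath. Hypothetical helper, not part of any library.
func memoryUsageFile(root, cgroupPath string, v2 bool) string {
	if v2 {
		return filepath.Join(root, cgroupPath, "memory.current")
	}
	// In v1 each controller has its own hierarchy under the mount root.
	return filepath.Join(root, "memory", cgroupPath, "memory.usage_in_bytes")
}

// parseCounter converts the single-number content of a cgroup file.
func parseCounter(raw string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(raw), 10, 64)
}

// readMemoryUsage hides the version difference from the caller.
func readMemoryUsage(root, cgroupPath string) (uint64, error) {
	data, err := os.ReadFile(memoryUsageFile(root, cgroupPath, isCgroupV2(root)))
	if err != nil {
		return 0, err
	}
	return parseCounter(string(data))
}

func main() {
	// "kubepods" is an assumed cgroup name; the actual path depends on the
	// cgroup driver and node configuration.
	usage, err := readMemoryUsage("/sys/fs/cgroup", "kubepods")
	if err != nil {
		fmt.Println("could not read memory usage:", err)
		return
	}
	fmt.Println("memory usage in bytes:", usage)
}
```

Callers of `readMemoryUsage` never branch on the cgroup version themselves, which is the property described in the comment above.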
I think one thing we have to be wary of with this approach is that metrics are undocumented/unintended GA APIs in kubernetes. While it seems natural to exclude cgroup stats collection for all cgroups other than the ones kubelet is aware of, there are likely users who rely on cadvisor collecting those stats and will be upset by us dropping them. I wonder if the same performance gains can be made by adding a tunable in the kubelet configuration that allows a user to say which cgroups we collect metrics about. Then we could get data from users on whether they want the non-kube cgroups to have stats collected for them. WDYT @linxiulei
The cgroups we will drop in this KEP are individual Pods' cgroups. I am not sure how configurable they are since they are all under the */kubepod/ path. Also, adding a tunable for this KEP would significantly increase the complexity, so I'd refrain from doing so.
Alternatively, we can make this KEP an opt-in feature, so users who still want non-kube cgroup stats can leave it disabled until they find an alternative way to collect non-kube cgroup stats, which I genuinely think should not be part of kubelet.
pod cgroups like the pod slice, or the container scopes as well? theoretically, CRI stats KEP should cover the container scope piece (I don't think it does today but it should). if you're talking about the pod cgroup, then I still maintain there may be users relying on these metrics (along with any others we may be dropping, unfortunately)
Sorry for the confusion. Let me clarify: currently the following cgroup stats are collected by kubelet and cAdvisor
This KEP won't drop any of them. However, Pod cgroups are collected by cAdvisor and CRI stats KEP (if enabled) at the same time. Therefore, this KEP removes the cAdvisor's collection by collecting what CRI stats KEP is not yet collecting. So after this KEP, kubelet collects
(this is what I incorrectly described as dropping Pod cgroups) And CRI stats collects