Stage one of cgroup RFC #1627
Conversation
Hi @fearful-symmetry ! No worries at all, there is truly no rush when it comes to the RFC process. It can go as quickly or as slowly as those involved need. A couple notes:
I think that's an ECS philosophy question, and I'm not sure of the answer. My thinking was to include the fields that are most likely to be used for monitoring and alerting, and not to cram in everything that I technically could. The need for some kind of standard here springs from the fact that we report cgroup metrics in a relatively transparent fashion, mirroring the hierarchies and controller names used by that particular cgroup. These names and hierarchies vary a great deal between V1 and V2, so if (theoretically) a V3 comes along, we'll probably have a similar need to report "valuable" metrics in a consistent fashion. "Hybrid" cgroup systems are probably going to be around for a while, since a cgroup V2-only system requires both a V2-only OS and V2-only applications. Right now all the big Linux LTS distros are either hybrid or V1-only, and there will be a decent bit of pressure to keep up "hybrid" OS releases, as a V2-only OS will require changes to applications, config management, etc. Right now, the only distros shipping V2-only are "bleeding edge" releases like Fedora. So, by this measure, it is technically a stop-gap, but my sysadmin experience tells me that V1 is going to be around for a long, long time.
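Not part of the proposal itself, but as a concrete illustration of the V1 / hybrid / V2 split discussed above, here is a minimal sketch of how a collector might guess which layout a Linux host is running, assuming the conventional /sys/fs/cgroup mount point:

```python
import os

CGROUP_ROOT = "/sys/fs/cgroup"

def cgroup_layout():
    """Best-effort guess at the cgroup layout of the current host."""
    # Pure V2: the unified hierarchy is mounted directly at /sys/fs/cgroup,
    # which exposes cgroup.controllers at its root.
    if os.path.exists(os.path.join(CGROUP_ROOT, "cgroup.controllers")):
        return "v2"
    # Hybrid: V1 controller hierarchies plus a V2 "unified" mount next to them.
    if os.path.exists(os.path.join(CGROUP_ROOT, "unified", "cgroup.controllers")):
        return "hybrid"
    # Otherwise assume the legacy V1-only layout of per-controller directories.
    return "v1"

print(cgroup_layout())
```

The controller names and directory layout a collector sees differ between these cases, which is what drives the need for a consistent set of reported fields.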
If you could identify sponsors/subject matter experts to also review and participate
I think I've become the default SME on cgroups, but I've pinged Jaime anyway, since I think he's tinkered with this before. @kgeller
@fearful-symmetry maybe you should add us to the "People" section of the RFC.
We currently have a process.pgid, defined as "Identifier of the group of processes the process belongs to." Now, I am far from an expert on cgroups, but that sounds like something similar?
So, Linux process groups (used largely for signalling) are conceptually different from cgroups (used for controlling resource access and usage), so that might be confusing.
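To make that distinction concrete (a sketch only, not proposed tooling): the value behind process.pgid comes from the POSIX process-group APIs, while cgroup membership lives somewhere else entirely, in /proc/<pid>/cgroup:

```python
import os

# POSIX process group: this is what process.pgid records. It is used mainly
# for job control and signalling (e.g. kill(-pgid, sig) signals the group).
print("process group id:", os.getpgrp())

# cgroup membership: read from /proc/<pid>/cgroup. On cgroup V2 this is a
# single line of the form "0::/some/path"; on V1/hybrid there is one line
# per hierarchy, "hierarchy-id:controller-list:/path".
with open("/proc/self/cgroup") as f:
    for line in f:
        print("cgroup entry:", line.strip())
```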
I am on the fence here. On one side, I think these metrics are usually going to be reported along with process data, so they could fit under the process group for practical reasons. But on the other side, they are not 1:1 related to processes: many processes could belong to the same cgroup (see the sketch below), and there could actually be a collector that collects only cgroup metrics, without associating them with any process. They could also be collected for containers, without any association with processes. Probably putting them outside of process is the safest path.
Or, would it make sense to think about a new fieldset for process groups, and put these metrics there? Though this could be confusing, and it can be difficult to find common fields. As Alex mentioned, there are already different kinds of groups of processes on Linux alone; if we start thinking about other OSs, things can get out of hand quickly.
As a side topic, perhaps we can also think about having some reusable fieldset for common resources; we already added cpu and memory metrics to host, and now these ones to cgroups. Pinging @kaiyan-sheng here, since she has been thinking about this for inventory purposes.
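To illustrate the earlier point that cgroups are not 1:1 with processes, here is a rough sketch (assuming a cgroup V2 unified hierarchy mounted at /sys/fs/cgroup; the cgroup path used is purely illustrative) that lists every PID attached to a single cgroup:

```python
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")

def pids_in_cgroup(cgroup_path):
    """Return all PIDs currently attached to one cgroup (V2 layout assumed)."""
    # Every cgroup directory exposes a cgroup.procs file listing its member PIDs.
    procs = CGROUP_ROOT / cgroup_path.lstrip("/") / "cgroup.procs"
    return [int(pid) for pid in procs.read_text().split()]

# "system.slice" is only an example path; any cgroup directory works the same way.
print(pids_in_cgroup("system.slice"))
```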
+1, I wouldn't make decisions based on an eventual removal of v1 🙂
That would be awesome! Otherwise I can add that in when getting everything ready for merge.
I think this is perfect. Adding in those most-likely-to-be-used fields is exactly what we want. I just had hesitation that we were only going to be adding the fields convenient from the v1 & v2 overlap, but that is not the case, so yay! @fearful-symmetry @jsoriano thank you both for the insight into where these fields should sit. I totally see where you're coming from and can see …
@jsoriano @fearful-symmetry @kgeller We currently have these basic monitoring metrics (like CPU, disk...) added under …
Perhaps? We might want a few "basic" ones like …
Made a couple of the suggested changes. Apologies for the delay, I got sucked into bug squashing again. As far as merging/combining this with other fields and metrics, my concern is the weird way that cgroups connect to other things. A cgroup can be linked to a single process, a group of processes, a container, etc. We risk "over-linking" the data if we start connecting things to certain cgroups without being 100% sure that the connection makes sense.
@fearful-symmetry Sounds good to me to keep cgroups fields separate for now. Thanks!
@kgeller anything else we need here?
@fearful-symmetry I think we are all set from an ECS perspective, and are good to merge as long as your sponsor(s) / SME(s) are on board!
All good from my side, thanks!
This is the Stage 1 component of the RFC process. Sorry for the delay between stages, I got a bit swept up with other work.
Hopefully the proposed/example cgroup.yml file is correct; I wasn't completely sure how things should be formatted.