Stage one of cgroup RFC #1627

fearful-symmetry · 2021-10-12T23:33:39Z

This is the Stage 1 component of the RFC process. Sorry for the delay between the stages, I got a bit swept up with other work for a bit.

Hopefully the proposed/example cgroup.yml file is correct, I wasn't completely sure how things should be formatted.

kgeller · 2021-10-13T17:51:48Z

Hi @fearful-symmetry ! No worries at all, there is truly no rush when it comes to the RFC process. It can go as quickly or as slowly as those involved need.

A couple notes:

If you could identify sponsors/subject matter experts to also review and participate, that would be great. We in ECS simply try to shepherd, and let the experts in these areas form the specifics.
I know the initial plan was to have a top level cgroup fieldset, but what do you think about putting this inside of process? We currently have a process.pgid, defined as Identifier of the group of processes the process belongs to. Now, I am far from an expert on cgroups, but that sounds like something similar?
In your description, you mention a couple times how we currently are only including cgroup metric fields that are common between v1 and v2. What happens if//when v1 gets deprecated -- do we think we will add more fields? Similarly, what happens if/when a v3 comes along? Ie. is this current list of fields the ones we want to be tracking because it's what we want or is it because those are the common ones at the current time?
The goal of ECS in general is for all different sources can effectively 'speak the same language' and I want to make sure that we go into this with the intention, and not so much of a how do we bridge this v1 v2 gap if that makes sense. I want to make sure that we have long lasting longevity and not just a stop-gap solution.

fearful-symmetry · 2021-10-13T22:36:08Z

If you could identify sponsors/subject matter experts to also review and participate

I think I've become the default SME on cgroups, but I've pinged Jaime anyway, since I think he's tinkered with this before. @kgeller

We currently have a process.pgid, defined as Identifier of the group of processes the process belongs to. Now, I am far from an expert on cgroups, but that sounds like something similar?

So, linux process groups (used largely for signalling) are conceptually different from cgroups (used for controlling resource access and usage), so that might be confusing.

What happens if//when v1 gets deprecated -- do we think we will add more fields?

I think that's an ECS philosophy question, and I'm not sure of the answer. My thinking was to include fields that are the most likely to be used for monitoring and alerting, and not to cram in everything that I technically could.

The need for some kind of standard here springs from the fact that we report cgroup metrics in a relatively transparent fashion, mirroring the hierarchies and controller names used by that particular cgroup. These names and hierarchies vary a great deal between V1 and V2, so if (theoretically) a V3 comes along, we'll probably have a similar need to report "valuable" metrics in a consistent fashion.

"hybrid" cgroup systems are probably going to be around for a while, since a cgroup V2-only system requires both a V2-only OS and a V2-only application. Right now all the big Linux LTS distros are either hybrid, or V1 only, and there will be a decent bit of pressure to keep up "hybrid" OS releases, as a V2-only OS will require changes to applications, config management, etc. Right now, the only distros shipping with V2-only are "bleeding edge" releases like Fedora. So, by this, measure, it is technically a stop-gap, but my sysadmin experience tells me that V1 is going to be around for a long, long time.

jsoriano

If you could identify sponsors/subject matter experts to also review and participate

I think I've become the default SME on cgroups, but I've pinged Jaime anyway, since I think he's tinkered with this before. @kgeller

@fearful-symmetry maybe you should add us to the "People" section of the RFC.

We currently have a process.pgid, defined as Identifier of the group of processes the process belongs to. Now, I am far from an expert on cgroups, but that sounds like something similar?

So, linux process groups (used largely for signalling) are conceptually different from cgroups (used for controlling resource access and usage), so that might be confusing.

I am on the fence here. On one side I think these metrics are usually going to be reported along with process data, so they could fit under the process group for practical reasons. But on the other side, they are not 1:1 related to process, many processes could belong to the same cgroup, and actually there could be a collector collecting only cgroup metrics, without associating them to any process. They could be also collected for containers, without any association with processes. Probably putting them outside of process is the safest path.

Or, would it make sense to think on a new fieldset for process groups, and put these metrics there? Though this could be confusing and can be difficult to find common fields. As Alex mentioned, only in Linux there are already different kinds of groups of processes, if we start thinking about different OSs the thing can go out of hand quickly.

As a side topic, perhaps we can also think on having some reusable fieldset for common resources, we already added cpu and memory metrics to host, and now these ones to cgroups. Pinging @kaiyan-sheng here that she has been thinking about this for inventary purpouses.

rfcs/text/0028-cgroups.md

rfcs/text/0028/cgroups.yml

jsoriano · 2021-10-14T11:05:43Z

but my sysadmin experience tells me that V1 is going to be around for a long, long time.

+1, I wouldn't make decisions thinking on the removal of v1 🙂

kgeller · 2021-10-15T17:01:29Z

@fearful-symmetry maybe you should add us to the "People" section of the RFC.

That would be awesome! Otherwise I can add that in when getting everything ready for merge.

I think that's an ECS philosophy question, and I'm not sure of the answer. My thinking was to include fields that are the most likely to be used for monitoring and alerting, and not to cram in everything that I technically could.

I think this is perfect. Adding in those most-likely-to-be-used fields is exactly what we want. I just had hesitation that we were only going to be adding the fields convenient from v1 & v2 overlap, but that is not the case, so yay!

@fearful-symmetry @jsoriano thank you both for the insight into where these fields should sit. I totally see where you're coming from and can see cgroups getting their own top level field set. If @kaiyan-sheng has other inventory ideas from the work with other metrics, that would be cool to get insight on as well.

kaiyan-sheng · 2021-10-15T19:24:12Z

@jsoriano @fearful-symmetry @kgeller We currently have these basic monitoring metrics (like CPU, disk...) added under host, and we are also working on adding them under container. Will it be good to have the common fields defined as separate field sets and reuse them if it's applicable?

fearful-symmetry · 2021-10-18T19:25:32Z

Will it be good to have the common fields defined as separate field sets and reuse them if it's applicable?

Perhaps? We might want a few "basic" ones like memory, cpu, etc, but I don't think it should be a requirement.

fearful-symmetry · 2021-10-19T21:44:48Z

Made a couple of the suggested changes. Apologies for the delay, I got sucked into bug squashing again.

As far as merging/combining this with other fields and metrics, my concern is the weird way that cgroups connect to other things. A cgroup can be linked to a single process, a group of processes, a container, etc. We risk "over linking" the data if we start connecting things to certain cgroups without being 100% sure that that connection makes sense.

kaiyan-sheng · 2021-10-26T14:13:18Z

@fearful-symmetry Sounds good to me to keep cgroups fields separate for now. Thanks!

fearful-symmetry · 2021-11-04T17:15:48Z

@kgeller anything else we need here?

kgeller · 2021-11-05T13:01:39Z

@fearful-symmetry I think we are all set from an ECS perspective, and are good to merge as long as your sponsor(s) / SME(s) are on board!

jsoriano

All good from my side, thanks!

start stage 1

29f0a5b

fearful-symmetry added the RFC label Oct 12, 2021

fearful-symmetry requested a review from a team October 12, 2021 23:33

fearful-symmetry requested a review from jsoriano October 13, 2021 22:36

jsoriano reviewed Oct 14, 2021

View reviewed changes

rfcs/text/0028-cgroups.md Show resolved Hide resolved

rfcs/text/0028/cgroups.yml Show resolved Hide resolved

change header, add version

cb66032

fearful-symmetry requested review from kaiyan-sheng, jsoriano and kgeller October 19, 2021 21:23

add people

76d961a

Merge branch 'main' into cgroup-s1

c4560d4

jsoriano approved these changes Nov 5, 2021

View reviewed changes

fearful-symmetry merged commit f69352c into elastic:main Nov 5, 2021

kgeller mentioned this pull request Nov 5, 2021

RFC 0028 - updating stage 1 advancement date #1650

Merged

ebeahan mentioned this pull request Nov 10, 2021

RFC 0028 Stage 1 - Add cgroup.* fields to experimental schema #1660

Merged

kgeller mentioned this pull request Aug 3, 2022

[RFC] Stage 0 - New Fields For Login #1957

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stage one of cgroup RFC #1627

Stage one of cgroup RFC #1627

fearful-symmetry commented Oct 12, 2021

kgeller commented Oct 13, 2021

fearful-symmetry commented Oct 13, 2021 •

edited

Loading

jsoriano left a comment

jsoriano commented Oct 14, 2021 •

edited

Loading

kgeller commented Oct 15, 2021

kaiyan-sheng commented Oct 15, 2021

fearful-symmetry commented Oct 18, 2021

fearful-symmetry commented Oct 19, 2021

kaiyan-sheng commented Oct 26, 2021

fearful-symmetry commented Nov 4, 2021

kgeller commented Nov 5, 2021

jsoriano left a comment

Stage one of cgroup RFC #1627

Stage one of cgroup RFC #1627

Conversation

fearful-symmetry commented Oct 12, 2021

kgeller commented Oct 13, 2021

fearful-symmetry commented Oct 13, 2021 • edited Loading

jsoriano left a comment

Choose a reason for hiding this comment

jsoriano commented Oct 14, 2021 • edited Loading

kgeller commented Oct 15, 2021

kaiyan-sheng commented Oct 15, 2021

fearful-symmetry commented Oct 18, 2021

fearful-symmetry commented Oct 19, 2021

kaiyan-sheng commented Oct 26, 2021

fearful-symmetry commented Nov 4, 2021

kgeller commented Nov 5, 2021

jsoriano left a comment

Choose a reason for hiding this comment

fearful-symmetry commented Oct 13, 2021 •

edited

Loading

jsoriano commented Oct 14, 2021 •

edited

Loading