Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libct/cg/fs2.GetStats() improvements #2873

Merged
merged 7 commits into from
Apr 19, 2021

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Mar 26, 2021

Various GetStats() improvements, mostly to address cgroup v2 compatibility issues.
NOTE that this functionality is not used by runc itself.

1. libct/cg/fs2/memory: fix swap reporting

cgroup v1 reports combined mem+swap in stats.MemoryStats.SwapUsage.
In cgroup v2, swap is separate.

For the sake of compatibility, make v2 report mem+swap as well.
This also includes Limit.

2. libct/cg/fs[2]/getMemoryData[V2]: optimize

Existing code ignores ENOENT in case we're reading data from
controls that might not be enabled. While this is correct, the code
can be improved:

  1. Check name != "" instead of moduleName != "memory", as these checks
    are equivalent but the new one is faster.

  2. It does not make sense to ignore subsequent errors -- if the control
    is not available, we won't hit this code path.

  3. Add a comment explaining why we ignore the error.

3. libct/cg/fs2/memory.Stat: add usage for root cgroup

There is no memory.{current,max} for the root node in cgroup v2, and
thus stats for "/sys/fs/cgroup" return an error.

The same thing works with cgroup v1 (as there are
memory.{usage,limit}_in_bytes files in the root node).

Emulate stats for /sys/fs/cgroup by getting numbers from
/proc/self/meminfo.

[Initially, I wanted to avoid parsing yet another /proc file and
instead mock some numbers using data from memory.stat but was
unable to come up with formulae that make sense.]

4. libct/cg/fs2.Stat: always call statCpu

Giuseppe found out that cpu.stat for a cgroup is available even if
the cpu controller is not enabled for it. So, let's call statCpu
regradress, and ignore ENOENT.

5. libct/cg/fs2/getPidsWithoutController: optimize

It is inefficient to create an associative map for the whole purpose of
counting the number of elements in it, especially if the elements are
all unique. It uses more CPU than necessary and creates some work for
the garbage collector.

The file we read contains PIDs and newlines, and the easiest/fastest way
to get the number of PIDs is just to count the newlines.

6. libct/cg/fs2.statPids: fall back directly

When getting pids stats, instead of checking whether the pids controller
is available, let's use a fall back function in case pids.current does
not exist. This simplifies the logic in fs2.GetStats.

7. libct/cg/fs2.Stat: don't look for available controllers

Some controllers might still have stats available even if they are
disabled (this is definitely so for cpu.stat -- see earlier commit).

Some stat methods might implement sensible fallbacks (see previous
commit for statPids.

In the view of all that, it makes sense to not check if a particular
controller is available, but rather ignore ENOENT from it.

@kolyshkin
Copy link
Contributor Author

@giuseppe PTAL

giuseppe
giuseppe previously approved these changes Mar 26, 2021
Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks,

LGTM

@mrunalp
Copy link
Contributor

mrunalp commented Mar 26, 2021

I thought that kernel was adding support for the root level memory stats for cgroups v2 cc: @giuseppe

@kolyshkin
Copy link
Contributor Author

I thought that kernel was adding support for the root level memory stats for cgroups v2 cc: @giuseppe

If you're talking about memory.stat, it is present for the root cgroup, but I was not able to come up with a decent way to deduce usage from there.

If you mean memory.current, I have modified the last patch to check for it first, even for the root cgroup.

@kolyshkin
Copy link
Contributor Author

CI failure is a known flake (#2805). CI restarted.

cgroup v1 reports combined mem+swap in stats.MemoryStats.SwapUsage.
In cgroup v2, swap is separate.

For the sake of compatibility, make v2 report mem+swap as well.
This also includes Limit.

Signed-off-by: Kir Kolyshkin <[email protected]>
Existing code ignores ENOENT error in case we're reading data from
controls that might not be enabled. While this is correct, the code
can be improved:

1. Check name != "" instead of moduleName != "memory", as these checks
   are equivalent but the new one is faster.

2. It does not make sense to ignore subsequent errors -- if the control
   is not available, we won't hit this codepath.

3. Add a comment explaining why we ignore the error.

Signed-off-by: Kir Kolyshkin <[email protected]>
There is no memory.{current,max} for the root node in cgroup v2, and
thus stats for "/sys/fs/cgroup" return an error.

The same thing works with cgroup v1 (as there are
memory.{usage,limit}_in_bytes files in the root node).

Emulate stats for /sys/fs/cgroup by getting numbers from
/proc/self/meminfo.

NOTE that both memory.current (in cgroup v2) and memory.usage_in_bytes
(in cgroup v1) include page cache etc into the number, so we do the same
when calculating memory usage (as opposed to number reported by "free",
which excludes page cache and buffers).

[v2: check for cgroup files first, as future kernels might add it]
[v3: don't subtract cache from mem_used, simplifying the logic]

[Initially, I wanted to avoid parsing yet another /proc file and
instead mock some numbers using data from memory.stat but was
unable to come up with formulae that make sense.]

Signed-off-by: Kir Kolyshkin <[email protected]>
Giuseppe found out that cpu.stat for a cgroup is available even if
the cpu controller is not enabled for it. So, let's call statCpu
regradress, and ignore ENOENT.

Signed-off-by: Kir Kolyshkin <[email protected]>
It is inefficient to create an associative map for the whole purpose of
counting the number of elements in it, especially if the elements are
all unique. It uses more CPU than necessary and creates some work for
the garbage collector.

The file we read contains PIDs and newlines, and the easiest/fastest way
to get the number of PIDs is just to count the newlines.

Signed-off-by: Kir Kolyshkin <[email protected]>
When getting pids stats, instead of checking whether the pids controller
is available, let's use a fall back function in case pids.current does
not exist. This simplifies the logic in fs2.GetStats.

Signed-off-by: Kir Kolyshkin <[email protected]>
@kolyshkin kolyshkin changed the title libct/cg/fs2/memory: fix Stat libct/cg/fs2.GetStats() improvements Mar 31, 2021
Some controllers might still have stats available even if they are
disabled (this is definitely so for cpu.stat -- see earlier commit).

Some stat methods might implement sensible fallbacks (see previous
commit for statPids.

In the view of all that, it makes sense to not check if a particular
controller is available, but rather ignore ENOENT from it.

Signed-off-by: Kir Kolyshkin <[email protected]>
@@ -99,9 +103,15 @@ func statMemory(dirPath string, stats *cgroups.Stats) error {
if err != nil {
return err
}
// As cgroup v1 reports SwapUsage values as mem+swap combined,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we expose a separate field that returns just the swap for consumers? (Not a blocker for this PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alas, the field that holds swap+mem usage is inconveniently called "Swap".

So, we could add something like "SwapOnly" (for both v1 and v2) or "SwapV2" (for v2 only), or something entirely different -- I am very open to suggestions.

@kolyshkin
Copy link
Contributor Author

This helps cadvisor with cgroupv2 (kubernetes/kubernetes#100631) so I'd like this to go in rc94. This won't cause any regressions in runc per se as it does not use GetStats.

@kolyshkin kolyshkin added this to the 1.0.0-rc94 milestone Apr 15, 2021
@kolyshkin
Copy link
Contributor Author

@AkihiroSuda PTAL

@AkihiroSuda AkihiroSuda merged commit ba257d2 into opencontainers:master Apr 19, 2021
@Zheaoli
Copy link
Contributor

Zheaoli commented Nov 21, 2023

@kolyshkin Hello, I got some problem here

  1. Seems statsFromMeminfo will never triggered by runc it self right?
  2. How should I test this statsFromMeminfo function, seems I need to set up the /sys/fs/cgroup right?

@Zheaoli
Copy link
Contributor

Zheaoli commented Nov 27, 2023

@kolyshkin Hello, I got some problem here

  1. Seems statsFromMeminfo will never triggered by runc it self right?
  2. How should I test this statsFromMeminfo function, seems I need to set up the /sys/fs/cgroup right?

@kolyshkin ping QAQ

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants