-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libct/cg/fs2.GetStats() improvements #2873
libct/cg/fs2.GetStats() improvements #2873
Conversation
@giuseppe PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks,
LGTM
I thought that kernel was adding support for the root level memory stats for cgroups v2 cc: @giuseppe |
6d55a3e
to
c10038b
Compare
If you're talking about If you mean |
CI failure is a known flake (#2805). CI restarted. |
c10038b
to
ff4adc1
Compare
cgroup v1 reports combined mem+swap in stats.MemoryStats.SwapUsage. In cgroup v2, swap is separate. For the sake of compatibility, make v2 report mem+swap as well. This also includes Limit. Signed-off-by: Kir Kolyshkin <[email protected]>
Existing code ignores ENOENT error in case we're reading data from controls that might not be enabled. While this is correct, the code can be improved: 1. Check name != "" instead of moduleName != "memory", as these checks are equivalent but the new one is faster. 2. It does not make sense to ignore subsequent errors -- if the control is not available, we won't hit this codepath. 3. Add a comment explaining why we ignore the error. Signed-off-by: Kir Kolyshkin <[email protected]>
There is no memory.{current,max} for the root node in cgroup v2, and thus stats for "/sys/fs/cgroup" return an error. The same thing works with cgroup v1 (as there are memory.{usage,limit}_in_bytes files in the root node). Emulate stats for /sys/fs/cgroup by getting numbers from /proc/self/meminfo. NOTE that both memory.current (in cgroup v2) and memory.usage_in_bytes (in cgroup v1) include page cache etc into the number, so we do the same when calculating memory usage (as opposed to number reported by "free", which excludes page cache and buffers). [v2: check for cgroup files first, as future kernels might add it] [v3: don't subtract cache from mem_used, simplifying the logic] [Initially, I wanted to avoid parsing yet another /proc file and instead mock some numbers using data from memory.stat but was unable to come up with formulae that make sense.] Signed-off-by: Kir Kolyshkin <[email protected]>
Giuseppe found out that cpu.stat for a cgroup is available even if the cpu controller is not enabled for it. So, let's call statCpu regradress, and ignore ENOENT. Signed-off-by: Kir Kolyshkin <[email protected]>
It is inefficient to create an associative map for the whole purpose of counting the number of elements in it, especially if the elements are all unique. It uses more CPU than necessary and creates some work for the garbage collector. The file we read contains PIDs and newlines, and the easiest/fastest way to get the number of PIDs is just to count the newlines. Signed-off-by: Kir Kolyshkin <[email protected]>
When getting pids stats, instead of checking whether the pids controller is available, let's use a fall back function in case pids.current does not exist. This simplifies the logic in fs2.GetStats. Signed-off-by: Kir Kolyshkin <[email protected]>
ff4adc1
to
85416b8
Compare
Some controllers might still have stats available even if they are disabled (this is definitely so for cpu.stat -- see earlier commit). Some stat methods might implement sensible fallbacks (see previous commit for statPids. In the view of all that, it makes sense to not check if a particular controller is available, but rather ignore ENOENT from it. Signed-off-by: Kir Kolyshkin <[email protected]>
@@ -99,9 +103,15 @@ func statMemory(dirPath string, stats *cgroups.Stats) error { | |||
if err != nil { | |||
return err | |||
} | |||
// As cgroup v1 reports SwapUsage values as mem+swap combined, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we expose a separate field that returns just the swap for consumers? (Not a blocker for this PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alas, the field that holds swap+mem usage is inconveniently called "Swap".
So, we could add something like "SwapOnly" (for both v1 and v2) or "SwapV2" (for v2 only), or something entirely different -- I am very open to suggestions.
This helps cadvisor with cgroupv2 (kubernetes/kubernetes#100631) so I'd like this to go in rc94. This won't cause any regressions in runc per se as it does not use GetStats. |
@AkihiroSuda PTAL |
@kolyshkin Hello, I got some problem here
|
@kolyshkin ping QAQ |
Various
GetStats()
improvements, mostly to address cgroup v2 compatibility issues.NOTE that this functionality is not used by runc itself.
1. libct/cg/fs2/memory: fix swap reporting
cgroup v1 reports combined mem+swap in stats.MemoryStats.SwapUsage.
In cgroup v2, swap is separate.
For the sake of compatibility, make v2 report mem+swap as well.
This also includes Limit.
2. libct/cg/fs[2]/getMemoryData[V2]: optimize
Existing code ignores
ENOENT
in case we're reading data fromcontrols that might not be enabled. While this is correct, the code
can be improved:
Check
name != ""
instead ofmoduleName != "memory"
, as these checksare equivalent but the new one is faster.
It does not make sense to ignore subsequent errors -- if the control
is not available, we won't hit this code path.
Add a comment explaining why we ignore the error.
3. libct/cg/fs2/memory.Stat: add usage for root cgroup
There is no memory.{current,max} for the root node in cgroup v2, and
thus stats for "/sys/fs/cgroup" return an error.
The same thing works with cgroup v1 (as there are
memory.{usage,limit}_in_bytes files in the root node).
Emulate stats for /sys/fs/cgroup by getting numbers from
/proc/self/meminfo.
[Initially, I wanted to avoid parsing yet another /proc file and
instead mock some numbers using data from memory.stat but was
unable to come up with formulae that make sense.]
4. libct/cg/fs2.Stat: always call statCpu
Giuseppe found out that cpu.stat for a cgroup is available even if
the cpu controller is not enabled for it. So, let's call statCpu
regradress, and ignore ENOENT.
5. libct/cg/fs2/getPidsWithoutController: optimize
It is inefficient to create an associative map for the whole purpose of
counting the number of elements in it, especially if the elements are
all unique. It uses more CPU than necessary and creates some work for
the garbage collector.
The file we read contains PIDs and newlines, and the easiest/fastest way
to get the number of PIDs is just to count the newlines.
6. libct/cg/fs2.statPids: fall back directly
When getting pids stats, instead of checking whether the pids controller
is available, let's use a fall back function in case pids.current does
not exist. This simplifies the logic in fs2.GetStats.
7. libct/cg/fs2.Stat: don't look for available controllers
Some controllers might still have stats available even if they are
disabled (this is definitely so for cpu.stat -- see earlier commit).
Some stat methods might implement sensible fallbacks (see previous
commit for statPids.
In the view of all that, it makes sense to not check if a particular
controller is available, but rather ignore ENOENT from it.