-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ignore kernel memory settings #2840
Conversation
I think this makes sense to do; even if runc would continue supporting it, it would depend on the host/kernel if it would even work, and with it being deprecated in the kernel, it would be kinda "living on borrowed time" for any user using this feature |
@opencontainers/runc-maintainers we need to decide whether this makes sense, and if it is rc94 or post-1.0 material. I don't think it can break any users that SET kernel memory. The only breakage I can think of is when someone is expecting kmem usage > 0 and with not enabling the acct it will be 0 which might disappoint some monitoring tools. From the POV of disappointment, seems it's better to do before 1.0 (so in the worst case scenario we can revert it). From the feature freeze POV, this looks like a feature and thus should not be included. Any opinions welcome 🙏 |
The (less radical) alternative to this would be to enable kmem accounting (if limits are set) but don't apply any kmem limits. This incurs some runtime overhead in the kernel, but as there are no limits there are no bad consequences described above. |
I'm still +1 to this; better to rip off the bandaid sooner than later, as it will need to happen at some point
|
close/reopen to kick CI @opencontainers/runc-maintainers PTAL (I think this is suitable for rc94). |
This is somewhat radical approach to deal with kernel memory. Per-cgroup kernel memory limiting was always problematic. A few examples: - older kernels had bugs and were even oopsing sometimes (best example is RHEL7 kernel); - kernel is unable to reclaim the kernel memory so once the limit is hit a cgroup is toasted; - some kernel memory allocations don't allow failing. In addition to that, - users don't have a clue about how to set kernel memory limits (as the concept is much more complicated than e.g. [user] memory); - different kernels might have different kernel memory usage, which is sort of unexpected; - cgroup v2 do not have a [dedicated] kmem limit knob, and thus runc silently ignores kernel memory limits for v2; - kernel v5.4 made cgroup v1 kmem.limit obsoleted (see torvalds/linux@0158115f702b). In view of all this, and as the runtime-spec lists memory.kernel and memory.kernelTCP as OPTIONAL, let's ignore kernel memory limits (for cgroup v1, same as we're already doing for v2). This should result in less bugs and better user experience. The only bad side effect from it might be that stat can show kernel memory usage as 0 (since the accounting is not enabled). [v2: add a warning in specconv that limits are ignored] Signed-off-by: Kir Kolyshkin <[email protected]>
This is somewhat radical approach to deal with kernel memory.
Per-cgroup kernel memory limiting was always problematic. A few
examples:
is RHEL7 kernel);
hit a cgroup is toasted;
In addition to that,
(as the concept is much more complicated than e.g. [user] memory);
which is sort of unexpected;
runc silently ignores kernel memory limits for v2;
torvalds/linux@0158115f702b).
In view of all this, and as the runtime-spec lists memory.kernel
and memory.kernelTCP as OPTIONAL (and NOT RECOMMENDED
since opencontainers/runtime-spec#1093), let's silently ignore kernel memory
limits (for cgroup v1, same as we're already doing for v2).
This should result in less bugs and better user experience.
The only bad side effect from it might be that stat can show kernel
memory usage as 0 (since the accounting is not enabled).
See also:
Closes: #2594