containerd should be using systemd cgroup hierarchy #471
just appending the snippet below at the end of the config should work too.
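(The snippet itself was elided in this capture; judging by the runc options fragment quoted later in the thread, it is presumably the containerd v2 form:)

```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
```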
the tricky part is that once this is done the following KubeletConfiguration must be passed to kubeadm:
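(The configuration was elided here; the documented way to select the systemd driver for the kubelet via kubeadm is a KubeletConfiguration along these lines:)

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
```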
@fabriziopandini @vincepri @detiber for comments on CAPI.
@codenrhoden - We need to fix this.
So this sounds like we have to coordinate the fix with kubeadm at the same time? Or if we have the cgroup setup through systemd for future images, and kubeadm does not pass the needed KubeletConfiguration, will everything still work as before?
if we add the setting there are a couple of options:
after some thought, i'm leaning towards option 2 but would appreciate more opinions here.
If we configure it with Cluster API: is there a last option where we set the deprecated cgroup driver flag in /etc/sysconfig/kubelet, such that CAPI doesn't need to do it?
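(A minimal sketch of that last option, assuming the kubeadm packages' systemd drop-in that sources /etc/sysconfig/kubelet; note that the --cgroup-driver kubelet flag is deprecated in favour of the KubeletConfiguration field:)

```
# /etc/sysconfig/kubelet
# Deprecated flag form of the same setting; read by the kubelet systemd unit.
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd
```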
it's quite unfortunate, but all users would have to apply this setting unless kubeadm does it by default, given that it runs the kubelet with systemd. Which creates this period of breaking users that were using...
cgroup drivers should be completely ignored on Windows, as there is no such thing there, but we found a bug where the kubelet errors on Windows if the driver reported by the container runtime does not match the kubelet's.
opening the wider discussion:
@neolit123 for containerd, I found the default systemd_cgroup section is not...
the docs may be out of date. In any case, after some discussion, we've settled on the following:
The field in the current docs works for me. Without it the node failed.
Is this a recent containerd change?
@neolit123 I'm sorry to have confused you. Forget it.
I'm trying to better understand what the expected UX in CAPI will be after this change lands in the image builder. AFAIK it is currently not possible for users to change the KubeletConfiguration (see kubernetes-sigs/cluster-api#1584), so I assume we should still rely on ExtraArgs, which is not ideal. A few options:
Also, after the change is implemented in the image builder, is there any expectation of how the change would impact the images published by the providers, e.g. ...
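(For context on the ExtraArgs route mentioned above, a sketch of how the driver can be passed through kubeadm's kubeletExtraArgs in a Cluster API bootstrap config; the field names are standard kubeadm ones, but this is only an illustration, not necessarily one of the elided options:)

```yaml
joinConfiguration:
  nodeRegistration:
    kubeletExtraArgs:
      cgroup-driver: systemd
```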
IMO, the images should be versioned with k8s.
@fabriziopandini, for now we should use /etc/sysconfig, given that's functionally equivalent to extraArgs, then pivot to CAPI configuring it in v1alpha4, when we support kubelet component config via whatever mechanism we have. Will be tracking this in the node agent bits.
This issue was discussed during CAPI office hours on the 20th of January, and the outcomes of the discussion are captured in kubernetes/kubeadm#2376. TL;DR: in order to coordinate the change among all the involved parties and to provide a clean upgrade path for the users, we should ensure that image builder configures containerd to use the systemd cgroup driver as the default for images with Kubernetes version >= v1.21. Please let me know if there are problems with the above requirements; I will be happy to help, but I have very little knowledge of the image builder...
Thanks for the follow-up @fabriziopandini, and for stating clearly what we need to do. There are no problems with the requirement; it's definitely something we can do.
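(For context, a minimal sketch of what gating that change on the Kubernetes version could look like in Ansible; blockinfile and the version test are real Ansible features, but the variable name kubernetes_semver and the config path are assumptions about the image-builder roles:)

```yaml
- name: Enable the systemd cgroup driver for the runc runtime
  blockinfile:
    path: /etc/containerd/config.toml
    block: |
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
        SystemdCgroup = true
  # Only apply the change for Kubernetes v1.21.0 and newer, per the requirement above.
  when: kubernetes_semver | regex_replace('^v', '') is version('1.21.0', '>=')
```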
@fabriziopandini @randomvariable I could use a little help in understanding why my TOML changes don't appear to work... I have the Ansible worked out to only apply the changes for K8s >= v1.21.0. My resulting config looks like this:

version = 2
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "k8s.gcr.io/pause:3.2"
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
      SystemdCgroup = true

But when I check to see if containerd has picked up the change, I don't see anything added to the "runc.options" section...
I know it's reading the config file...
@neolit123 ^^
as commented above, placing it at the bottom works for me, but i got it to work by inlining...
EDIT: TOML is ok, but i would have preferred if they supported more formats like YAML...
Thanks @neolit123. I don't know why, but I'm still struggling with this. Can I confirm that I should see the cgroup entry listed when I do a ...? What I get is:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
[plugins."io.containerd.grpc.v1.cri".cni]
  bin_dir = "/opt/cni/bin"
  conf_dir = "/etc/cni/net.d"
unless i'm mistaken, i think if you have applied the change and restarted containerd using something like...
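(The exact commands were elided above; a plausible restart-and-verify sequence on a systemd host with crictl installed would be:)

```sh
# Restart containerd so it re-reads /etc/containerd/config.toml
sudo systemctl restart containerd

# Inspect the CRI plugin's effective configuration; if the change was picked up,
# SystemdCgroup should show up under the runc runtime options.
sudo crictl info | grep -i SystemdCgroup
```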
Thanks, I'll check that out. My understanding (and observation) is that...
could be, i haven't tried it.
i've sent the PR for moving to systemd in kubeadm 1.21:
@voor |
None of the code samples provided in this thread have actually resulted in containerd using systemd cgroups, but the kubelet change definitely works, thus resulting in kubeadm not working because the two are not aligned.
@codenrhoden is this effort still on track for v1.21?
I tried this again, using containerd 1.4.4, and I still get the same behavior. No matter what I put in the config file, it doesn't register. I think I'd like to pair up with @neolit123 when he has time, to show him what I'm doing and figure it out. I know it's parsing the config file and applying other changes from it. It's very strange.
@codenrhoden did you try...?
@neolit123 I have, yes. No difference there.
containerd/containerd#4900 (comment) looks like both... i'm not that familiar with containerd myself, but this looks like something that they must fix. all we care about is adding the same runc options snippet at the end of the config.
Wow... Okay, that explains a lot. I'm just running around in circles. :) I can verify this with the methods in that comment. Thanks!
What steps did you take and what happened:
[A clear and concise description of how to REPRODUCE the bug.]
As per https://kubernetes.io/docs/setup/production-environment/container-runtimes/, there should only be a single cgroup hierarchy running on the OS. We do not configure containerd to run using the systemd hierarchy, which means Kubernetes will have an incorrect view of resource utilisation.
What did you expect to happen:
ContainerD should be configured to use the systemd hierarchy.
Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
ContainerD config.toml should have something along the lines of the following:
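(The example was elided in this capture; a minimal sketch of the relevant part of /etc/containerd/config.toml, using the v2 schema that the comments above converge on:)

```toml
version = 2

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
```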
Environment:
Project (Image Builder for Cluster API, kube-deploy/imagebuilder, konfigadm):
Additional info for Image Builder for Cluster API related issues:
OS (e.g. from /etc/os-release, or cmd /c ver):
Kubernetes version (use kubectl version):
/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]