disable systemd-binfmt.service #3511

Merged
merged 1 commit into from
Feb 12, 2024
6 changes: 4 additions & 2 deletions images/base/Dockerfile
@@ -86,10 +86,12 @@ RUN echo "Installing Packages ..." \
     && echo "ReadKMsg=no" >> /etc/systemd/journald.conf \
     && ln -s "$(which systemd)" /sbin/init
 
-RUN echo "Enabling services ... " \
+# NOTE: systemd-binfmt.service will register things into binfmt_misc which is kernel-global
+RUN echo "Enabling / Disabling services ... " \
     && systemctl enable kubelet.service \
     && systemctl enable containerd.service \
-    && systemctl enable undo-mount-hacks.service
+    && systemctl enable undo-mount-hacks.service \
+    && systemctl mask systemd-binfmt.service

Contributor

It seems something changed with this in bookworm; a quick Google search shows several hits, e.g. microsoft/WSL#8843.

AFAIK this reads its configuration from a folder. Could this hammer approach have any side effects, or should we try to be more precise and understand exactly what is breaking?
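
For context on the "folder" mentioned above: `systemd-binfmt.service` registers the entries described by `*.conf` files in the `binfmt.d` directories. The following is a minimal sketch for listing which such files an image ships, assuming the three standard directories from binfmt.d(5). The helper name `list_binfmt_confs` and its optional root-prefix argument are illustrative assumptions, not part of kind or systemd.

```shell
#!/bin/sh
# Hypothetical helper: print every binfmt.d configuration file under ROOT.
# ROOT defaults to the empty prefix, i.e. the real filesystem root.
list_binfmt_confs() {
    root="${1:-}"
    for dir in /etc/binfmt.d /run/binfmt.d /usr/lib/binfmt.d; do
        for conf in "$root$dir"/*.conf; do
            # An unmatched glob stays literal, so only print paths that exist.
            if [ -e "$conf" ]; then
                printf '%s\n' "$conf"
            fi
        done
    done
    return 0
}
```

Running `list_binfmt_confs` with no argument inside a node container would show which packages dropped registrations there; an empty result would suggest the unit has nothing to register anyway.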

Contributor

Oh, I see, it screws up the list entirely: #3510

Contributor

I imagine, then, that the best approach is to say the Docker container must use the binfmt_misc registrations that already exist on the host.

/lgtm

Member Author

So, binfmt_misc is often set intentionally by containers (see e.g. our multi-arch build setup), but kind nodes shouldn't be doing it, and Python scripts should keep using shebangs and not depend on this anyhow ...
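
To illustrate the shebang point: the kernel resolves a leading `#!` line on its own, independently of any binfmt_misc registration, so a script that carries one does not need `systemd-binfmt.service`. A small sketch of a check for this follows; the function name `has_shebang` is a hypothetical helper introduced here for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: succeed when FILE begins with a "#!" line.
has_shebang() {
    first_line=''
    # read returns non-zero for an empty file or a missing final newline;
    # the variable is still usable in both cases, so ignore the status.
    IFS= read -r first_line < "$1" || true
    case "$first_line" in
        '#!'*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `has_shebang ./tool.py || echo "needs an explicit interpreter"` flags scripts that would only run directly if some binfmt_misc entry happened to match them.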

Contributor

Per https://systemd.io/CONTAINER_INTERFACE/ (execution environment number 2), mounting /proc/sys read-only is the recommended way to tell systemd not to change things. However, as that page also mentions, making /proc/sys/net writable may be desirable (and would likely be needed for many k8s workloads).

It likely needs more thought than will fit in a review comment, so this targeted workaround makes sense for now.
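
As a rough way to see what a given container actually presents to systemd, one could inspect the mount table. The sketch below assumes the usual `/proc/mounts` layout (device, mount point, filesystem type, comma-separated options); the function name `mount_is_ro` and its optional second argument for an alternative mounts file are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed when MOUNTPOINT is listed with the "ro" option.
# Usage: mount_is_ro MOUNTPOINT [MOUNTS_FILE]   (MOUNTS_FILE defaults to /proc/mounts)
mount_is_ro() {
    awk -v target="$1" '
        $2 == target {
            # The last matching entry wins, since later mounts shadow earlier ones.
            found = 1
            ro = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == "ro") ro = 1
            }
        }
        END { exit ((found && ro) ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}
```

Inside a node container, `mount_is_ro /proc/sys` and `mount_is_ro /sys` would then report whether either path is mounted read-only. The check only matches exact mount points, so a path that is not a mount point of its own reports failure.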

Member Author

Yeah, the base image comments link to this page. We have other similar workarounds for similar reasons; we're really stretching "systemd in a container" when we go --privileged, but without it we can't get the "kubernetes 'in' a container" part. Previously, in practice, the key part was /sys mounted read-only (mentioned further down as a trigger for udev), which we do.


RUN echo "Ensuring /etc/kubernetes/manifests" \
&& mkdir -p /etc/kubernetes/manifests
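
Regarding the `systemctl mask systemd-binfmt.service` line in the hunk above: masking a unit persistently links its name under `/etc/systemd/system` to `/dev/null`, which is what prevents it from being started. A minimal sketch for verifying that in a built image follows; `unit_is_masked` and its optional root-prefix argument are hypothetical names used only for this example, and the check deliberately ignores runtime masks under `/run`.

```shell
#!/bin/sh
# Hypothetical helper: succeed when UNIT is masked under ROOT/etc/systemd/system.
# Usage: unit_is_masked UNIT [ROOT]   (ROOT defaults to the real filesystem root)
unit_is_masked() {
    unit_path="${2:-}/etc/systemd/system/$1"
    # A masked unit is a symlink whose target is exactly /dev/null.
    [ -L "$unit_path" ] && [ "$(readlink "$unit_path")" = /dev/null ]
}
```

For instance, `unit_is_masked systemd-binfmt.service && echo masked` could be run inside a node container as a quick sanity check after building the base image.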