Multiple containers in the same cgroup #3132
It seems that crun is equally affected (filed containers/crun#716).
kolyshkin added a commit to kolyshkin/runtime-spec that referenced this issue on Sep 27, 2021:
It makes sense for the runtime to reject a cgroup which is frozen (for both a new and an existing container), otherwise the runtime command (create/run/exec) may end up stuck.

It makes sense for the runtime to make sure the cgroup for a new container is empty (i.e. there are no processes in it), and reject it otherwise.

The scenario in which a non-empty cgroup is used for a new container has multiple problems, for example:

* If two or more containers share the same cgroup, and each container has its own limits configured, the order of container starts ultimately determines whose limits will be effectively applied.
* If two or more containers share the same cgroup, and one of the containers is paused/unpaused, all the others are paused, too.
* If cgroup.kill is used to forcefully kill the container, it will also kill other processes that are not part of this container but merely belong to the same cgroup.
* When a systemd cgroup manager is used, this becomes even worse: stopping (or even a failed start of) any container results in a stopTransientUnit command being sent to systemd, so (depending on unit properties) other containers can receive SIGTERM, be killed after a timeout, etc.
* Many other bad scenarios are possible, as the implicit assumption of a 1:1 container:cgroup mapping is broken.

opencontainers/runc#3132
containers/crun#716

Signed-off-by: Kir Kolyshkin <[email protected]>
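The checks described above amount to reading two files in the candidate cgroup directory before using it. Below is a minimal, hypothetical sketch of such checks against the cgroup v2 layout (`cgroup.freeze`, `cgroup.procs`); it is not runc's or crun's actual implementation, and the function name, paths, and error messages are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// checkCgroupUsable rejects a cgroup v2 directory that is frozen or, for a
// new container, already contains processes -- the two pre-flight checks
// suggested in the commit message above.
func checkCgroupUsable(cgroupPath string, newContainer bool) error {
	// A frozen cgroup would leave any process placed into it stuck.
	frozen, err := os.ReadFile(filepath.Join(cgroupPath, "cgroup.freeze"))
	if err != nil {
		return err
	}
	if strings.TrimSpace(string(frozen)) == "1" {
		return fmt.Errorf("cgroup %s is frozen", cgroupPath)
	}

	if newContainer {
		// A new container must get an empty cgroup, otherwise the
		// 1:1 container:cgroup mapping is broken from the start.
		procs, err := os.ReadFile(filepath.Join(cgroupPath, "cgroup.procs"))
		if err != nil {
			return err
		}
		if strings.TrimSpace(string(procs)) != "" {
			return fmt.Errorf("cgroup %s is not empty", cgroupPath)
		}
	}
	return nil
}

func main() {
	if err := checkCgroupUsable("/sys/fs/cgroup/mygroup", true); err != nil {
		fmt.Fprintln(os.Stderr, "refusing to use cgroup:", err)
		os.Exit(1)
	}
	fmt.Println("cgroup looks usable")
}
```

On cgroup v1 the freezer state lives in `freezer.state` under the freezer controller rather than in `cgroup.freeze`, so a real runtime would have to handle both hierarchies.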
Currently runc allows multiple containers to share the same cgroup. While such a shared configuration might be OK, there are some issues:

1. When each container has its own resource limits configured, the order in which the containers are started determines whose limits are effectively applied.
2. When one of the containers is paused or unpaused, all the others are paused, too.
3. When the shared cgroup is frozen, any runc command that needs to join it ends up with runc init stuck inside a frozen cgroup.
4. When the systemd cgroup manager is used, stopping (or even a failed start of) any container results in a stopTransientUnit command being sent to systemd, so (depending on unit properties) the other containers can receive SIGTERM, be killed after a timeout, etc.

All this may lead to various hard-to-debug situations in production (runc init stuck, cgroup removal errors, wrong resource limits, etc.).
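To make items 2 and 3 above concrete: on cgroup v2, pausing a container means freezing its cgroup, and the freezer acts on the cgroup as a whole, not on any particular container in it. A hypothetical sketch (the path is made up) of what the pause operation boils down to:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// freezeCgroup pauses every process in the given cgroup v2 directory.
// If several containers share this cgroup, all of them are paused at once,
// and any new process placed here afterwards hangs immediately.
func freezeCgroup(path string) error {
	return os.WriteFile(filepath.Join(path, "cgroup.freeze"), []byte("1"), 0644)
}

func main() {
	shared := "/sys/fs/cgroup/shared-group" // hypothetical cgroup shared by two containers
	if err := freezeCgroup(shared); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("all processes in", shared, "are now frozen")
}
```

Since runc init joins the container's cgroup before handing control to the container's workload, it is frozen on the spot if the cgroup is already frozen, which is exactly the stuck situation described above.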
I originally found issue 3 from the list above and tried to solve it in #3131, and this is how I found issue 4.
What can be done to avoid these bad situations?
The runtime can refuse a non-empty cgroup for a new container, and refuse a frozen cgroup so that runc init cannot end up stuck (this is what #3131, "runc run/create: refuse non-empty cgroup; runc exec: refuse frozen cgroup", does).

Surely, these measures might break some existing usage scenarios. This is why: