Docker does not catch container exit #2306
Comments
could be the same thing here
PLEG is going crazy because it cannot connect to a container that no longer exists
docker info
|
@Deshke You can make sure that it is the same problem by running:
If your terminal just hangs, you have the same problem. |
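A terminal that "just hangs" can also be checked without tying up a shell. The following is a minimal sketch that runs a command with a timeout and reports whether it returned in time. The helper name command_hangs and the 30-second default are assumptions for illustration, and the exact probe command from the comment above is not preserved here, so the caller has to supply it.

```python
import subprocess


def command_hangs(cmd, timeout_s=30.0):
    """Return True if cmd is still running after timeout_s seconds."""
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising,
        # so the hung client does not linger after the check.
        return True
    return False
```

For example, command_hangs(["docker", "run", "--rm", "ubuntu", "true"]) would probe with a hypothetical short-lived container. Only the docker client is killed on timeout, so a True result says nothing about whether the daemon recovers afterwards.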
For reference, I tried to reproduce the ubuntu-bash-exit hang on all current versions of docker across stable (17.09.0-ce), beta (17.09.01-ce) and alpha (17.11.0-ce), so far without luck, so there may be additional environmental factors triggering this (or increasing the chances of a race). However, the original report on Debian suggests that this is a generic upstream docker issue, which is better triaged on the moby tracker. If you have additional details, please follow up at moby/moby#33820. I'm keeping this ticket open to track future resolution status in CL channels. |
Thanks for having a look @lucab. I think the thing we have in common is running lots of docker containers on the same host (and using Kubernetes). Maybe the issue becomes more frequent in such an environment? Anyway, we have rolled back to 1.12.06 for now, and the issue disappeared with the same version of Container Linux. |
@lucab - I couldn't reproduce it manually either. CoreOS Beta is affected too. I'm going to put a few nodes on Alpha and will report back with the results. |
@Raffo can reproduce on an instance that already has a zombie container running. So far I can reproduce this with the current stable, beta and alpha images. On the alpha image it currently takes a day until docker becomes unresponsive. |
We are seeing this issue as well, which then causes PLEG issues and finally general k8s cluster instability. Going to revert to 1.12.06; hopefully that will solve the problem for now. |
Same problem here. Reverting to 1.12.06 solved the PLEG issue. |
A runc race was recently fixed via opencontainers/runc#1698, and the fix has been backported to docker 17.12.1, which we are currently shipping in the beta and stable channels. I suspect the race is related to this bug and that the backport therefore fixes it, but I have no way to verify that. It would be good if anybody previously affected by this could check whether the issue is still present with docker 17.12.1. |
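To see whether a given node already runs a build with the backport, the version string can be compared against 17.12.1. This is a minimal sketch: the helper names are hypothetical, and it assumes that every release from 17.12.1 onward carries the fix, which the comment above only states for 17.12.1 itself.

```python
import re


def parse_docker_version(text):
    """Extract (major, minor, patch) from strings such as '17.12.1-ce'."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised docker version: {text!r}")
    return tuple(int(part) for part in match.groups())


def has_runc_backport(version_text, first_fixed=(17, 12, 1)):
    """True if the version is at or after the first release with the backport.

    Assumption: later releases keep the fix. This does not say whether an
    older version is actually affected by the hang.
    """
    return parse_docker_version(version_text) >= first_fixed
```

The version string itself can be read with docker version --format '{{.Server.Version}}'. Note that 1.12.06, which several people rolled back to, returns False here even though it was reported as unaffected: the check only covers the presence of the backport.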
I agree with @lucab's assessment that this may be that runc issue that should be fixed in all current channels with docker-ce To confirm whether that's the bug you're encountering, once dockerd has hung, send it and containerd a If that's the bug, dockerd's stack trace should include ones similar to those here. @chrisferry since you saw an issue similar to this on |
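The signal name did not survive in the comment above. dockerd is documented to dump its goroutine stack traces when it receives SIGUSR1, and containerd handles the same signal in a similar way, so the sketch below assumes SIGUSR1. The helper names pids_of and request_stack_dump are hypothetical, and pids_of relies on the pidof utility being installed.

```python
import os
import signal
import subprocess


def pids_of(name):
    """Return the PIDs of processes called name, using the pidof utility."""
    result = subprocess.run(
        ["pidof", name], capture_output=True, text=True, check=False
    )
    # pidof prints nothing when no process matches, giving an empty list.
    return [int(token) for token in result.stdout.split()]


def request_stack_dump(pids, sig=signal.SIGUSR1):
    """Send sig to each PID and return the PIDs that could be signalled.

    Assumption: SIGUSR1 is the signal the original comment meant.
    """
    signalled = []
    for pid in pids:
        try:
            os.kill(pid, sig)
        except (ProcessLookupError, PermissionError):
            # Process already gone, or not enough privileges (run as root).
            continue
        signalled.append(pid)
    return signalled
```

A typical call would be request_stack_dump(pids_of("dockerd") + pids_of("containerd")), run as root. Where the traces end up (the daemon log or a separate dump file) depends on the docker version, so check the daemon log after sending the signal.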
Per my previous comment, I'm closing this with the hope it's fixed as of docker-ce 17.12.1, and thus fixed in all channels. If you still see this issue, let us know. |
Issue Report
Bug
Docker does not correctly catch the container exit.
The same issue is described in moby/moby#33820. It's unclear at this stage whether it is related to the docker build used in Container Linux, which is why I am opening this here.
Container Linux Version
Environment
AWS EC2 m4.large
Expected Behavior
Docker should correctly catch the container exit.
Actual Behavior
Docker does not catch the container exit: an exited container cannot be found in the process tree while it is still visible via docker ps.
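The symptom can be checked mechanically by comparing what docker reports as running with what the kernel knows about. The following is a minimal sketch, assuming a Linux host with /proc mounted and the docker CLI on the PATH. The function names and the 10-second timeout are hypothetical choices, not part of the original report.

```python
import os
import subprocess


def pid_exists(pid):
    """Linux-only check: True when /proc/<pid> is present."""
    return pid > 0 and os.path.isdir(f"/proc/{pid}")


def stale_containers(pid_by_container, exists=pid_exists):
    """Return IDs of containers whose recorded PID has no process behind it."""
    return sorted(
        cid for cid, pid in pid_by_container.items() if not exists(pid)
    )


def running_container_pids(timeout_s=10):
    """Map each container ID from 'docker ps -q' to its State.Pid.

    The timeout makes the calls fail with TimeoutExpired instead of
    blocking forever if the daemon is already unresponsive.
    """
    ids = subprocess.run(
        ["docker", "ps", "-q"],
        capture_output=True, text=True, check=True, timeout=timeout_s,
    ).stdout.split()
    pids = {}
    for cid in ids:
        out = subprocess.run(
            ["docker", "inspect", "--format", "{{.State.Pid}}", cid],
            capture_output=True, text=True, check=True, timeout=timeout_s,
        ).stdout
        pids[cid] = int(out.strip())
    return pids
```

Calling stale_containers(running_container_pids()) on an affected host would list the containers that docker still shows as running although their main process is gone. An empty list does not rule the bug out, since it only appears once an exit has been missed.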
Reproduction Steps
Once the problem happens:
Other Information
Docker version:
docker info: