runtime: Docker daemon on Windows using 1.7 beta2 can deadlock all goroutines #16286
Comments
/cc @randall77 |
Hi @jhowardmsft (or @alexbrainman, if you can repro), is it possible for you to get a traceback or some other form of debug dump from the deadlocked process? With that, this may be easy to track down; without it, it's going to be extremely hard. (BTW, I'm out of office this week, so I may be slow to respond.) |
Happy to help, but how can I get a traceback/debug dump? I can dump the process from Task Manager in Windows if that is sufficient. But if there are golang utilities for this and you can provide a pointer, I can run them. |
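For reference, one way a Go program can capture a goroutine traceback from inside the process is sketched below. It uses the standard library's runtime.Stack; the helper name dumpGoroutines and the 1 MiB buffer size are assumptions made for illustration, not anything used in docker or hcsshim. Note that this needs a goroutine that can still run, so it may not help once every goroutine is stuck.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// dumpGoroutines returns the stack traces of all goroutines as text.
// The name and the 1 MiB buffer are illustrative choices; traces longer
// than the buffer are truncated.
func dumpGoroutines() string {
	buf := make([]byte, 1<<20)
	n := runtime.Stack(buf, true) // true = include every goroutine, not just the caller
	return string(buf[:n])
}

func main() {
	// Write the dump to stderr so it does not mix with normal output.
	fmt.Fprint(os.Stderr, dumpGoroutines())
}
```

A daemon could call a helper like this from a debug endpoint or a signal handler, which is a design choice left to the application.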
How exactly are you calling into Windows C code, and how are you calling back from C code? Do you ever use |
@ianlancetaylor I don't believe we use The golang interface between docker and Windows is all through https://github.com/Microsoft/hcsshim. |
I see no use of |
I don't believe |
|
Thanks for all the info. Windows is different than Unix in that a call to Anyhow, back to this issue. Can you find out whether it fixes the problem if you set the environment variable |
@ianlancetaylor Yes, the deadlock does not seem to happen with
|
I think I figured it out. At least, I can recreate the same symptoms. I don't think it's Windows-specific and I don't think it has anything to do with callbacks. I think it's simply that shrinkstack doesn't correctly handle the case of a select statement with the same channel in multiple cases. Will send CL shortly. |
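To make the shape of code being described concrete, here is a minimal sketch of a select statement that names the same channel in more than one case. It only illustrates the program pattern; it is not a reproduction of the deadlock, which depends on runtime stack shrinking happening while the goroutine is parked. The function name recvEither is hypothetical.

```go
package main

import "fmt"

// recvEither waits on a select in which two cases receive from the same
// channel. This is legal Go; when both cases are ready, one is chosen
// at random, so the reported case name can differ between runs.
func recvEither(ch chan int) (value int, which string) {
	select {
	case v := <-ch:
		return v, "first"
	case v := <-ch:
		return v, "second"
	}
}

func main() {
	ch := make(chan int) // unbuffered, so the receiver may park in the select
	go func() { ch <- 42 }()
	v, which := recvEither(ch)
	fmt.Println("received", v, "via the", which, "case")
}
```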
@jhowardmsft can you try https://golang.org/cl/24815 to see if it fixes the problem? Thanks. |
CL https://golang.org/cl/24815 mentions this issue. |
It looks like 276b177 introduces a case where an application can become completely deadlocked. This was found through moby/moby#23235 in an attempt to verify that docker can be upgraded to golang 1.7 successfully.
@aclements @runcom @alexbrainman @jstarks
Please answer these questions before submitting your issue. Thanks!
go version: go 1.7 beta2, and, through git bisect, working back to commit 276b177.
go env:
If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.
I wish this were easier than it is, but the repro involves running docker CI against binaries built with the above versions of golang. This also requires Windows Server 2016 builds more recent than the public TP5; I was specifically running on build 14375. Newer builds are needed because TP5 does not support the newer APIs needed by docker, so on TP5 we use older APIs in Windows. Post TP5, we make extensive use of callback APIs from C code in Windows to golang, and we use golang channels for callbacks. This appears to line up with the changes in 276b177.
It was found by running the CLI test TestRestartContainerwithRestartPolicy, although I've seen it fail on other tests too. The most reliable way to repro was to start the test, kill the daemon 5 or 6 seconds after containers have been started, then cycle a few times through starting the daemon and seeing if it deadlocks; if not, kill it and restart it again.
Expected: no deadlock.
Actual: the Docker daemon completely locks up. Even an added goroutine which prints to the console every 100ms no longer makes forward progress.