Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
daemon: remove listener closing race
We use http.Server with a number of custom net.Listener instances. This patch makes some changes to the daemon.Stop(...) function. In order for the server instance to Shutdown() gracefully, the listener termination must be performed by the shutdown process itself. The server shutdown process is thread-safe, and tracks the listeners in use. Thread safety is important because the http.Server has two places where listeners can be terminated: 1. The thread hosting the Serve() method, using a deferred mutex protected closing function for all registered listeners. 2. The Shutdown() method (with mutex protection), which will close listeners async resuling in an error from the listener Accept(), but this special case is handled in Serve() and returns the ErrServerClosed error which means shutdown was done cleanly. However, in daemon.go we also pre-close all the listeners before the call to Shutdown(), which causes two kinds of errors: 1. accept tcp [::]:<port>: use of closed network connection This error is actually masked because of tomb.Kill() calls used for the termination of the daemon. See the daemon.Start() Serve() exit code. 2. close tcp [::]:<port>: use of closed network connection This is the actual race condition that can sometimes be seen in CI logs. This race is caused by the two closing mechanisms discussed above: The race is handled correctly by ensuring only one close() path can win (using sync.Do), but the problem is that the Serve() deferred Close() discards the error, while the Shutdown() Close() records the error. If The Shutdown() wins the Close(), we see an error. This race is only triggered because we manually close the listener first. A standalone simulation of the problem can be seen here: https://gist.github.com/flotter/483d233d38f0961a52f7ef7405887012 The following changes are introduced: - Remove the early manual listener closing code so its properly done by the Shutdown() call.
- Loading branch information