From 82dac98a9a46ed4ee6ace7afee00d246f21f7631 Mon Sep 17 00:00:00 2001
From: Dongsu Park
Date: Fri, 2 Dec 2016 15:17:30 +0100
Subject: [PATCH] engine: fix a bug in engine being unreachable

When gRPC is turned on, TestScheduleMachineOf fails sometimes, as the
engine becomes unreachable with the following error messages:

====
transport: http2Client.notifyError got notified that the client transport was broken EOF.
ERROR registrymux.go:166: Retry to connect to new engine: dial tcp 172.18.1.1:50059: getsockopt: connection refused
ERROR registrymux.go:166: Retry to connect to new engine: dial tcp 172.18.1.1:50059: getsockopt: connection refused
ERROR registrymux.go:166: Retry to connect to new engine: dial tcp 172.18.1.1:50059: getsockopt: connection refused
====

This must have been a regression from commit ecb121a ("registry/rpc: use
simpleBalancer instead of ClientConn.State()").

Remove the additional checking with IsRegistryReady, in order to avoid
the occasional case of engine being unreachable.

Fixes https://github.com/coreos/fleet/issues/1712
---
 engine/rpcengine.go | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/engine/rpcengine.go b/engine/rpcengine.go
index 269a82e4c..2e8504578 100644
--- a/engine/rpcengine.go
+++ b/engine/rpcengine.go
@@ -110,12 +110,7 @@ func rpcAcquireLeadership(reg registry.Registry, lManager lease.Manager, machID
 		return l
 	}
 
-	// If reg is not ready, we have to give it an opportunity to steal lease
-	// below. Otherwise it could be blocked forever by the existing engine leader,
-	// which could cause gRPC registry to always fail when a leader already exists.
-	// Thus we return the existing leader, only if reg.IsRegistryReady() == true.
-	// TODO(dpark): refactor the entire function for better readability. - 20160908
-	if (existing != nil && existing.Version() >= ver) && reg.IsRegistryReady() {
+	if existing != nil && existing.Version() >= ver {
 		log.Debugf("Lease already held by Machine(%s) operating at acceptable version %d", existing.MachineID(), existing.Version())
 		return existing
 	}