Fix for deadlock between stats() in serf and getBroadcasts() in memberlist #507
Stats() can cause a deadlock because it holds the read lock on memberLock while reading the lengths of the broadcast queues. At the same time, the gossip goroutine holds the queue lock inside GetBroadcasts() and waits for memberLock.RLock in NumNodes(). This is described in this issue.
This fix releases memberLock before querying the broadcast queues.
These are the stack traces of the two goroutines that are stuck in the deadlock:
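The locking pattern behind the fix can be sketched as follows. This is a minimal illustration, not the actual serf or memberlist code: the `queue` and `cluster` types, `newCluster`, `Join`, and the "one item per node" rule are hypothetical stand-ins, and only the names `memberLock`, `NumNodes`, `NumQueued`, `GetBroadcasts`, and `Stats` come from the traces. One assumption worth stating: two read-lock holders do not normally block each other, so the hang presumably also involves a goroutine waiting for the write lock on memberLock, since Go's `sync.RWMutex` blocks new readers once a writer is waiting.

```go
package main

import "sync"

// queue is a hypothetical stand-in for memberlist's TransmitLimitedQueue.
type queue struct {
	mu       sync.Mutex
	items    []string
	numNodes func() int // callback into the cluster, like Serf.NumNodes
}

// NumQueued takes the queue lock to read the queue length.
func (q *queue) NumQueued() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.items)
}

// GetBroadcasts holds the queue lock while calling numNodes, which needs the
// member read lock. Lock order here: queue lock first, member lock second.
func (q *queue) GetBroadcasts() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	limit := q.numNodes() // toy rule: hand out at most one item per node
	if len(q.items) < limit {
		return len(q.items)
	}
	return limit
}

// cluster is a hypothetical stand-in for the Serf struct.
type cluster struct {
	memberLock sync.RWMutex
	members    map[string]struct{}
	broadcasts *queue
}

func newCluster(names ...string) *cluster {
	c := &cluster{members: make(map[string]struct{})}
	c.broadcasts = &queue{numNodes: c.NumNodes}
	for _, n := range names {
		c.Join(n)
	}
	return c
}

// Join takes the write lock; a pending writer like this blocks new readers.
func (c *cluster) Join(name string) {
	c.memberLock.Lock()
	defer c.memberLock.Unlock()
	c.members[name] = struct{}{}
}

func (c *cluster) NumNodes() int {
	c.memberLock.RLock()
	defer c.memberLock.RUnlock()
	return len(c.members)
}

// Stats copies what it needs under the read lock, releases it, and only then
// touches the queue, so it never holds memberLock while waiting on q.mu.
func (c *cluster) Stats() map[string]int {
	c.memberLock.RLock()
	members := len(c.members)
	c.memberLock.RUnlock()

	queued := c.broadcasts.NumQueued() // memberLock is no longer held here
	return map[string]int{"members": members, "queued": queued}
}
```

With this ordering, `Stats` holds at most one of the two locks at any time, so it cannot take part in a lock-order cycle with `GetBroadcasts`. The trade-off is that the member count and the queue length are no longer read under a single lock, which is usually acceptable for a stats snapshot.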
Goroutine 1 (waiting for memberLock.RLock in serf.go)
0 0x000000000042d6ec in runtime.gopark
at /goroot/src/runtime/proc.go:288
1 0x000000000042d7de in runtime.goparkunlock
at /goroot/src/runtime/proc.go:293
2 0x000000000043ed74 in runtime.semacquire1
at /goroot/src/runtime/sema.go:144
3 0x000000000043e999 in sync.runtime_Semacquire
at /goroot/src/runtime/sema.go:56
4 0x0000000000473b79 in sync.(*RWMutex).RLock
at /goroot/src/sync/rwmutex.go:50
5 0x00000000008fd216 in github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/serf.(*Serf).NumNodes
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/serf/serf.go:1749
6 0x000000000090181a in github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/serf.(*Serf).NumNodes-fm
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/serf/serf.go:342
7 0x00000000008da16c in github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist.(*TransmitLimitedQueue).GetBroadcasts
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist/queue.go:80
8 0x00000000008e8df1 in github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/serf.(*delegate).GetBroadcasts
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/serf/delegate.go:124
9 0x00000000008c9c39 in github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist.(*Memberlist).getBroadcasts
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist/broadcast.go:88
10 0x00000000008de833 in github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist.(*Memberlist).gossip
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist/state.go:511
11 0x00000000008e56fa in github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist.(*Memberlist).(github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist.gossip)-fm
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist/state.go:111
12 0x00000000008dc18f in github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist.(*Memberlist).triggerFunc
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist/state.go:135
13 0x000000000045ce61 in runtime.goexit
at /goroot/src/runtime/asm_amd64.s:2337
Goroutine 2 (waiting for q.Lock in queue.go)
0 0x000000000042d6ec in runtime.gopark
at /goroot/src/runtime/proc.go:288
1 0x000000000042d7de in runtime.goparkunlock
at /goroot/src/runtime/proc.go:293
2 0x000000000043ed74 in runtime.semacquire1
at /goroot/src/runtime/sema.go:144
3 0x000000000043ea8d in sync.runtime_SemacquireMutex
at /goroot/src/runtime/sema.go:71
4 0x0000000000472c5e in sync.(*Mutex).Lock
at /goroot/src/sync/mutex.go:134
5 0x00000000008da5c3 in github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist.(*TransmitLimitedQueue).NumQueued
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/memberlist/queue.go:116
6 0x00000000008fc40b in github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/serf.(*Serf).Stats
at /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/serf/serf.go:1682
7 0x00000000009c7c2d in github.com/hashicorp/consul/agent/consul.(*Client).Stats
at /gopath/src/github.com/hashicorp/consul/agent/consul/client.go:346
8 0x0000000000f2ceaf in github.com/hashicorp/consul/agent.(*Agent).Stats
at /gopath/src/github.com/hashicorp/consul/agent/agent.go:2034
9 0x0000000000f3235d in github.com/hashicorp/consul/agent.(*HTTPServer).AgentSelf
at /gopath/src/github.com/hashicorp/consul/agent/agent_endpoint.go:75
10 0x0000000000f72c3f in github.com/hashicorp/consul/agent.(*HTTPServer).handler.func2
at /gopath/src/github.com/hashicorp/consul/agent/http.go:104