Skip to content

Commit

Permalink
Merge #63867
Browse files Browse the repository at this point in the history
63867: stop: break deadlock between Stopper.mu and Replica.mu r=irfansharif a=irfansharif

Fixes #63761. As of #61279, it was possible for us to deadlock due to
inconsistent lock orderings between Stopper.mu and Replica.mu. We were
previously holding onto Stopper.mu while executing all closers,
including those that may acquire other locks. Because closers can be
defined anywhere (and may consequently grab any in-scope lock), we
should order Stopper.mu to come after all other locks in the system.

The closer added in #61279 iterated over all non-destroyed replicas,
locking Replica.mu to check for the replica's destroy status (in
Store.VisitReplicas). This deadlocked with the lease acquisition code
path that first grabs an exclusive lock over Replica.mu (see
InitOrJoinRequest), and uses the stopper to kick off an async task
acquiring the lease. The stopper internally locks Stopper.mu to check
whether or not it was already stopped.

Release note: None

Co-authored-by: irfan sharif <[email protected]>
  • Loading branch information
craig[bot] and irfansharif committed Apr 20, 2021
2 parents f451dd0 + 79922e4 commit d094f02
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions pkg/util/stop/stopper.go
Original file line number Diff line number Diff line change
Expand Up @@ -472,8 +472,9 @@ func (s *Stopper) Stop(ctx context.Context) {

s.Quiesce(ctx)

s.mu.Lock()
defer s.mu.Unlock()
// Run the closers without holding s.mu. There's no concern around new
// closers being added; we've marked this stopper as `stopping` above, so
// any attempts to do so will be refused.
for _, c := range s.mu.closers {
c.Close()
}
Expand Down

0 comments on commit d094f02

Please sign in to comment.