Skip to content

Commit

Permalink
Merge #108401
Browse files Browse the repository at this point in the history
108401: server: avoid a race in the server controller r=lidorcarmel a=knz

Co-authored with `@lidorcarmel` 

Prior to this patch, the Go race detector was complaining about racy concurrent accesses to `serverStateUsingChannels.server`, via `(*serverStateUsingChannels) getServer()` and `(*channelOrchestrator) startControlledServer()`.

These racy accesses happened to be safe because the writes and reads to that field were correctly ordered around updates to an atomic bool (the `started` field). However, the race detector is not sufficiently sophisticated to detect this ordering and satisfy itself that the state transition can only happen once.

In order to silence the race detector (with no change in correctness), this patch replaces the atomic bool by a mutex, whose access semantics are properly understood by the race detector.

This change incidentally makes the code slightly easier to read and understand.

Supersedes #108371.
Fixes #107930.
Epic: CRDB-28893

Co-authored-by: Raphael 'kena' Poss <[email protected]>
  • Loading branch information
craig[bot] and knz committed Aug 9, 2023
2 parents 6aa4615 + 26d007b commit 01a6aab
Showing 1 changed file with 22 additions and 12 deletions.
34 changes: 22 additions & 12 deletions pkg/server/server_controller_channel_orchestrator.go
Original file line number Diff line number Diff line change
Expand Up @@ -74,8 +74,14 @@ type serverStateUsingChannels struct {
// features that label data with the current tenant name.
nc *roachpb.TenantNameContainer

// server is the server that is being controlled.
server orchestratedServer
startedMu struct {
syncutil.Mutex

// server is the server that is being controlled.
// This is only set when the corresponding orchestratedServer
// instance is ready.
server orchestratedServer
}

// startedOrStopped is closed when the server has either started or
// stopped. This can be used to wait for a server start.
Expand All @@ -85,10 +91,6 @@ type serverStateUsingChannels struct {
// during server creation if any.
startErr error

// started is marked true when the server has started. This can
// be used to observe the start state without waiting.
started syncutil.AtomicBool

// requestImmediateStop can be called to request a server to stop
// ungracefully.
// It can be called multiple times.
Expand All @@ -106,7 +108,9 @@ var _ serverState = (*serverStateUsingChannels)(nil)

// getServer is part of the serverState interface.
func (s *serverStateUsingChannels) getServer() (orchestratedServer, bool) {
return s.server, s.started.Get()
s.startedMu.Lock()
defer s.startedMu.Unlock()
return s.startedMu.server, s.startedMu.server != nil
}

// nameContainer is part of the serverState interface.
Expand Down Expand Up @@ -154,14 +158,13 @@ func (o *channelOrchestrator) makeServerStateForSystemTenant(
closeCtx, cancelFn := context.WithCancel(context.Background())
st := &serverStateUsingChannels{
nc: nc,
server: systemSrv,
startedOrStoppedCh: closedChan,
requestImmediateStop: cancelFn,
requestGracefulStop: cancelFn,
stoppedCh: closeCtx.Done(),
}

st.started.Set(true)
st.startedMu.server = systemSrv
return st
}

Expand Down Expand Up @@ -301,7 +304,11 @@ func (o *channelOrchestrator) startControlledServer(
// the goroutine above to call tenantStopper.Stop() and
// terminate.
state.requestImmediateStop()
state.started.Set(false)
func() {
state.startedMu.Lock()
defer state.startedMu.Unlock()
state.startedMu.server = nil
}()
close(stoppedCh)
if !startedOrStoppedChAlreadyClosed {
state.startErr = errors.New("server stop before successful start")
Expand Down Expand Up @@ -413,9 +420,12 @@ func (o *channelOrchestrator) startControlledServer(
}

// Indicate the server has started.
state.server = tenantServer
startedOrStoppedChAlreadyClosed = true
state.started.Set(true)
func() {
state.startedMu.Lock()
defer state.startedMu.Unlock()
state.startedMu.server = tenantServer
}()
close(startedOrStoppedCh)

// Wait for a request to shut down.
Expand Down

0 comments on commit 01a6aab

Please sign in to comment.