Skip to content

Commit

Permalink
rpc/nodedialer: reset conn breaker after succesful connection
Browse files Browse the repository at this point in the history
`Dialer.DialInternalClient` does not check the circuit breaker but
blindly attempts a connection and can succeed, leaving the system in a
state where there is a healthy connection to a node, but the circuit
breaker used for dialing is open. DistSQL checks for connection health
when scheduling processors, but the connection health check does not
examine the breaker. So DistSQL will proceed to schedule a processor on
a node but then be unable to use the connection to that node because
`Dialer.Dial` will return with a `breaker open` error. The code contains
a TODO to reconcile the handling of circuit breakers in the various
`Dialer` methods, but changing the handling is risky in the short
term. As a stop-gap, we reset the breaker after a connection is
successfully opened.

Fixes cockroachdb#29149

Release note: None
  • Loading branch information
petermattis committed Oct 5, 2018
1 parent 041f6bd commit 03cc365
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions pkg/rpc/nodedialer/nodedialer.go
Original file line number Diff line number Diff line change
Expand Up @@ -144,8 +144,13 @@ func (n *Dialer) DialInternalClient(
if err != nil {
return nil, nil, err
}
// TODO(bdarnell): Reconcile the different health checks and circuit
// breaker behavior in this file
// TODO(bdarnell): Reconcile the different health checks and circuit breaker
// behavior in this file. Note that this different behavior causes problems
// for higher-levels in the system. For example, DistSQL checks for
// ConnHealth when scheduling processors, but can then see attempts to send
// RPCs fail when dial fails due to an open breaker. Reset the breaker here
// as a stop-gap before the reconciliation occurs.
n.getBreaker(nodeID).Reset()
if err := grpcutil.ConnectionReady(conn); err != nil {
return nil, nil, err
}
Expand Down

0 comments on commit 03cc365

Please sign in to comment.