Skip to content

Commit

Permalink
rpc: use the loopback conn also for GRPCDialOptions
Browse files Browse the repository at this point in the history
For context, `rpc.GRPCDialOptions` is used in two cases:

- when connecting to other nodes as specified by the `--join` flag.
- in the grpc-gateway code, to route incoming HTTP requests to the RPC
  subsystem.

The first one nearly always targets remotes nodes. The second one
always targets the local node (it's a loopback connection).

Prior to this patch, the 2 callers to `rpc.GRPCDialOptions` would be
served the regular "remote network conn" dial options unconditionally,
including the backoff, only-once-dialer and other parameters suitable
to connect to other nodes remotely.

While this choice is suitable for the `--join` logic, it's not
suitable for the grpc-gateway loopback conn. In that case, we want to
avoid all the network intelligence and especially avoid the
only-once-dialer and circuit breaker.

This patch ensures that grpc-gateway receives the loopback parameters properly.

Release note (bug fix): A bug was fixed whereby under high CPU load,
HTTP requests to certain API endpoints (e.g. the health endpoint)
could start failing and then never succeed again until the node was
restarted. This bug had been introduced in v23.1.
  • Loading branch information
knz committed May 23, 2023
1 parent a1b8517 commit 877111d
Show file tree
Hide file tree
Showing 2 changed files with 23 additions and 1 deletion.
13 changes: 12 additions & 1 deletion pkg/rpc/context.go
Original file line number Diff line number Diff line change
Expand Up @@ -450,6 +450,12 @@ type Context struct {

// SetLoopbackDialer configures the loopback dialer function.
func (c *Context) SetLoopbackDialer(loopbackDialFn func(context.Context) (net.Conn, error)) {
if c.ContextOptions.Knobs.NoLoopbackDialer {
// A test has decided it is opting out of the special loopback
// dialing mechanism. Obey it. We already have defined
// loopbackDialFn in that case in NewContext().
return
}
c.loopbackDialFn = loopbackDialFn
}

Expand Down Expand Up @@ -1640,7 +1646,12 @@ const (
func (rpcCtx *Context) GRPCDialOptions(
ctx context.Context, target string, class ConnectionClass,
) ([]grpc.DialOption, error) {
return rpcCtx.grpcDialOptionsInternal(ctx, target, class, tcpTransport)
transport := tcpTransport
if rpcCtx.Config.AdvertiseAddr == target && !rpcCtx.ClientOnly {
// See the explanation on loopbackDialFn for an explanation about this.
transport = loopbackTransport
}
return rpcCtx.grpcDialOptionsInternal(ctx, target, class, transport)
}

// grpcDialOptions produces dial options suitable for connecting to the given target and class.
Expand Down
11 changes: 11 additions & 0 deletions pkg/server/connectivity_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -125,6 +125,17 @@ func TestClusterConnectivity(t *testing.T) {
// We're going to manually control initialization in this test.
NoAutoInitializeCluster: true,
StoreSpecs: []base.StoreSpec{{InMemory: true}},
Knobs: base.TestingKnobs{
Server: &server.TestingKnobs{
ContextTestingKnobs: rpc.ContextTestingKnobs{
// Disable the special RPC loopback dial logic.
// This is needed because the test starts to perform
// RPC dials before the PreStart() function is called,
// and that is where the loopback dial function is defined.
NoLoopbackDialer: true,
},
},
},
}

for i, test := range testConfigurations {
Expand Down

0 comments on commit 877111d

Please sign in to comment.