gossip: don't resolve addresses while holding mutex
This patch removes a DNS resolution call that was performed while holding
the gossip mutex. Holding the mutex across the lookup can cause severe
process stalls if the lookup is not immediate, since gossip read locks are
acquired in several performance-critical code paths, including Raft
processing. However, the DNS lookup was only done when validating a remote
forwarding address, which presumably happens fairly rarely. Removing it
should not cause any problems, since the address will necessarily be
validated later when attempting to connect to it.

Epic: none
Release note (bug fix): Fixed a bug where a DNS lookup was performed
during gossip remote forwarding while holding the gossip mutex. This
could cause processing stalls if the DNS server was slow to respond.
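
The stall described above can be illustrated with a small, self-contained Go sketch. It is not the gossip code: `registry`, `readersBlocked`, `forwardResolvingUnderLock`, and `forwardWithoutResolving` are hypothetical names, the addresses are placeholders, and the `resolve` callback only stands in for a DNS lookup. The sketch assumes Go 1.18 or later for `sync.RWMutex.TryRLock`, which it uses to check, without any timing, whether a reader would have to wait.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for state guarded by a shared
// read-write mutex. It is not the real Gossip type.
type registry struct {
	mu          sync.RWMutex
	forwardAddr string
}

// readersBlocked reports whether a reader would have to wait for the lock
// at this moment. TryRLock never blocks, so this is safe to call anywhere.
func (r *registry) readersBlocked() bool {
	if r.mu.TryRLock() {
		r.mu.RUnlock()
		return false
	}
	return true
}

// forwardResolvingUnderLock mirrors the shape of the removed code: the
// lookup runs while the write lock is held, so readers wait for as long
// as the lookup takes.
func (r *registry) forwardResolvingUnderLock(addr string, resolve func(string) error) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if err := resolve(addr); err != nil {
		return err
	}
	r.forwardAddr = addr
	return nil
}

// forwardWithoutResolving mirrors the shape after the patch: only the
// assignment happens under the lock, and validation is left to the later
// connection attempt.
func (r *registry) forwardWithoutResolving(addr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.forwardAddr = addr
}

func main() {
	r := &registry{}

	// The callback stands in for a DNS lookup and records whether readers
	// are locked out while it runs.
	blocked := false
	resolve := func(string) error {
		blocked = r.readersBlocked()
		return nil
	}
	if err := r.forwardResolvingUnderLock("n2.example:26257", resolve); err != nil {
		fmt.Println("resolve failed:", err)
	}
	fmt.Println("readers blocked during lookup:", blocked)

	r.forwardWithoutResolving("n3.example:26257")
	fmt.Println("readers blocked after assignment:", r.readersBlocked())
}
```

In this sketch readers are locked out for the full duration of `resolve`, whatever that turns out to be, while the second variant holds the lock only for a single assignment.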
erikgrinaker committed Jan 18, 2023
1 parent 4ec5a5f commit 0cda3fa
Showing 1 changed file with 0 additions and 7 deletions.
7 changes: 0 additions & 7 deletions pkg/gossip/client.go
@@ -259,13 +259,6 @@ func (c *client) handleResponse(ctx context.Context, g *Gossip, reply *Response)
 				"received forward from n%d to n%d (%s); already have active connection, skipping",
 				reply.NodeID, reply.AlternateNodeID, reply.AlternateAddr)
 		}
-		// We try to resolve the address, but don't actually use the result.
-		// The certificates (if any) may only be valid for the unresolved
-		// address.
-		if _, err := reply.AlternateAddr.Resolve(); err != nil {
-			return errors.Wrapf(err, "unable to resolve alternate address %s for n%d",
-				reply.AlternateAddr, reply.AlternateNodeID)
-		}
 		c.forwardAddr = reply.AlternateAddr
 		return errors.Errorf("received forward from n%d to n%d (%s)",
 			reply.NodeID, reply.AlternateNodeID, reply.AlternateAddr)
