base: reduce network timeouts
This patch reduces the network timeout from 3 seconds to 2 seconds. This
change also affects gRPC keepalive intervals/timeouts (3 to 2 seconds),
RPC heartbeats and timeouts (3 to 2 seconds), and the gRPC dial timeout
(6 to 4 seconds).

When a peer is unresponsive, these timeouts determine how quickly RPC
calls (and thus critical operations such as lease acquisitions) will be
retried against a different node. Reducing them therefore improves
recovery time during infrastructure outages.
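To illustrate why a shorter timeout shortens recovery, here is a minimal Go sketch of failing over between nodes with a per-attempt deadline. The names `tryNodes`, `callFn`, and `unresponsive` are hypothetical and are not part of the CockroachDB RPC layer, whose retry logic is considerably more involved. The point is only that each attempt against an unresponsive node costs one full timeout before the next node is tried.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// callFn stands in for an RPC to a single node. It is a hypothetical type
// used only for this sketch.
type callFn func(ctx context.Context, node string) error

// unresponsive returns a callFn in which the node named dead never answers,
// while every other node answers immediately.
func unresponsive(dead string) callFn {
	return func(ctx context.Context, node string) error {
		if node == dead {
			<-ctx.Done() // Block until the per-attempt deadline fires.
			return ctx.Err()
		}
		return nil
	}
}

// tryNodes calls each node in order, giving every attempt at most timeout.
// It returns the first node that answers, or an error if none do.
func tryNodes(nodes []string, timeout time.Duration, call callFn) (string, error) {
	for _, node := range nodes {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		err := call(ctx, node)
		cancel()
		if err == nil {
			return node, nil
		}
	}
	return "", errors.New("no node responded within the timeout")
}

func main() {
	start := time.Now()
	// A short timeout keeps the demo fast; the patch uses 2 seconds.
	node, err := tryNodes([]string{"n1", "n2"}, 50*time.Millisecond, unresponsive("n1"))
	fmt.Println("answered by:", node, "error:", err)
	fmt.Println("elapsed:", time.Since(start).Round(10*time.Millisecond))
}
```

In this sketch the failover delay grows linearly with the timeout and with the number of unresponsive nodes tried first, which is why lowering the timeout from 3 to 2 seconds reduces the delay by a third for each failed attempt.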

An environment variable `COCKROACH_NETWORK_TIMEOUT` has been introduced
to tweak this timeout if needed.
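As a rough sketch of how an environment override with a default can work, the Go program below reads a duration from `COCKROACH_NETWORK_TIMEOUT` and falls back to `2s`. The helper `durationOrDefault` and the `lookupFn` type are hypothetical and are not the `envutil` API used in the patch; in particular, silently falling back on an unparsable value is an assumption of this sketch, and the real helper may behave differently.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// lookupFn matches the signature of os.LookupEnv. Injecting it lets the
// sketch be exercised without touching the process environment.
type lookupFn func(key string) (string, bool)

// durationOrDefault is a hypothetical helper, not the real envutil API. It
// returns the duration stored under name, or def if the variable is unset or
// cannot be parsed (the silent fallback is an assumption of this sketch).
func durationOrDefault(lookup lookupFn, name string, def time.Duration) time.Duration {
	raw, ok := lookup(name)
	if !ok {
		return def
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return def
	}
	return d
}

func main() {
	networkTimeout := durationOrDefault(os.LookupEnv, "COCKROACH_NETWORK_TIMEOUT", 2*time.Second)
	fmt.Println("network timeout:", networkTimeout)
	// The dial timeout is derived from the network timeout, so an override
	// changes both.
	fmt.Println("dial timeout:", 2*networkTimeout)
}
```

Running this with `COCKROACH_NETWORK_TIMEOUT=3s` in the environment would restore the previous 3 second network timeout and the 6 second dial timeout derived from it.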

Release note (ops change): The network timeout for RPC connections
between cluster nodes has been reduced from 3 seconds to 2 seconds, with
a connection timeout of 4 seconds, in order to reduce unavailability and
tail latencies during infrastructure outages. This can now be changed
via the environment variable `COCKROACH_NETWORK_TIMEOUT` which defaults
to `2s`.
erikgrinaker committed Dec 4, 2022
1 parent 00f22fe commit 65a2bc3
Showing 2 changed files with 36 additions and 13 deletions.
46 changes: 36 additions & 10 deletions pkg/base/config.go
@@ -47,9 +47,6 @@ const (
 defaultSQLAddr = ":" + DefaultPort
 defaultHTTPAddr = ":" + DefaultHTTPPort

-// NetworkTimeout is the timeout used for network operations.
-NetworkTimeout = 3 * time.Second
-
 // defaultRaftTickInterval is the default resolution of the Raft timer.
 defaultRaftTickInterval = 200 * time.Millisecond

@@ -66,10 +63,6 @@ const (
 // each heartbeat.
 defaultRaftHeartbeatIntervalTicks = 5

-// defaultRPCHeartbeatIntervalAndTimeout is the default value of
-// RPCHeartbeatIntervalAndTimeout used by the rpc context.
-defaultRPCHeartbeatIntervalAndTimeout = 3 * time.Second
-
 // defaultRangeLeaseRenewalFraction specifies what fraction the range lease
 // renewal duration should be of the range lease active time. For example,
 // with a value of 0.2 and a lease duration of 10 seconds, leases would be
@@ -118,11 +111,44 @@ func DefaultHistogramWindowInterval() time.Duration {
 }

 var (
-// DialTimeout is the timeout used when dialing nodes. For gRPC, this is 3
-// roundtrips (TCP + TLS handshake), so we set it to twice the network
-// timeout.
+// NetworkTimeout is the timeout used for network operations that require a
+// single network round trip. It is conservatively defined as one maximum
+// network round trip time (RTT) plus one TCP packet retransmit (RTO), then
+// multiplied by 2 as a safety margin.
+//
+// The maximum RTT between cloud regions is roughly 350ms both in GCP
+// (asia-south2 to southamerica-west1) and AWS (af-south-1 to sa-east-1). It
+// can occasionally be up to 500ms, but 400ms is a reasonable upper bound
+// under nominal conditions.
+// https://datastudio.google.com/reporting/fc733b10-9744-4a72-a502-92290f608571/page/70YCB
+// https://www.cloudping.co/grid/p_99/timeframe/1W
+//
+// Linux has an RTT-dependent retransmission timeout (RTO) which we can
+// approximate as 1.5x RTT (smoothed RTT + 4x RTT variance), with a lower
+// bound of 200ms. Under nominal conditions, this is approximately 600ms.
+//
+// The maximum p99 RPC heartbeat latency in any Cockroach Cloud cluster over a
+// 90-day period was 557ms. This was a single-region US cluster, where the
+// high latency appeared to be due to CPU overload or throttling: the cluster
+// had 2 vCPU nodes running at 100%.
+//
+// The NetworkTimeout is thus set to 2 * (400ms + 600ms) = 2s.
+//
+// TODO(erikgrinaker): Consider reducing this to 1 second, which should be
+// sufficient but may be fragile under latency fluctuations.
+NetworkTimeout = envutil.EnvOrDefaultDuration("COCKROACH_NETWORK_TIMEOUT", 2*time.Second)
+
+// DialTimeout is the timeout used when dialing a node. gRPC connections take
+// up to 3 roundtrips for the TCP + TLS handshakes. Because NetworkTimeout
+// allows for both a network roundtrip (RTT) and a TCP retransmit (RTO), with
+// the RTO being greater than the RTT, and we don't need to tolerate more than
+// 1 retransmit per connection attempt, 2 * NetworkTimeout is sufficient.
 DialTimeout = 2 * NetworkTimeout
+
+// defaultRPCHeartbeatIntervalAndTimeout is the default value of
+// RPCHeartbeatIntervalAndTimeout used by the RPC context.
+defaultRPCHeartbeatIntervalAndTimeout = NetworkTimeout

 // defaultRaftElectionTimeoutTicks specifies the number of Raft Tick
 // invocations that must pass between elections.
 defaultRaftElectionTimeoutTicks = envutil.EnvOrDefaultInt(
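The arithmetic in the `NetworkTimeout` comment can be checked with a short Go sketch. The function names `approxRTO` and `networkTimeoutFor` are hypothetical and do not appear in the patch; the formula (RTO approximated as 1.5x RTT with a 200ms floor, and the timeout as twice the sum of RTT and RTO) is taken from the comment itself.

```go
package main

import (
	"fmt"
	"time"
)

// minRTO is the 200ms lower bound on the Linux retransmission timeout cited
// in the NetworkTimeout comment.
const minRTO = 200 * time.Millisecond

// approxRTO approximates the retransmission timeout as 1.5x RTT, clamped to
// the lower bound. This is the comment's approximation, not the kernel's
// exact computation.
func approxRTO(rtt time.Duration) time.Duration {
	rto := rtt * 3 / 2
	if rto < minRTO {
		return minRTO
	}
	return rto
}

// networkTimeoutFor applies the comment's formula: one RTT plus one RTO,
// multiplied by 2 as a safety margin.
func networkTimeoutFor(rtt time.Duration) time.Duration {
	return 2 * (rtt + approxRTO(rtt))
}

func main() {
	rtt := 400 * time.Millisecond // Upper bound on RTT under nominal conditions.
	timeout := networkTimeoutFor(rtt)
	fmt.Println("RTO:", approxRTO(rtt))     // RTO: 600ms
	fmt.Println("network timeout:", timeout) // network timeout: 2s
	fmt.Println("dial timeout:", 2*timeout)  // dial timeout: 4s
}
```

With the occasional worst-case RTT of 500ms, the same formula gives 2 * (500ms + 750ms) = 2.5s, which suggests the 2 second default leaves less than the full 2x margin in that case.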
3 changes: 0 additions & 3 deletions pkg/security/tls_settings.go
@@ -40,9 +40,6 @@ var ocspMode = settings.RegisterEnumSetting(
 "and in lax mode all certificates will be accepted.",
 "off", map[int64]string{ocspOff: "off", ocspLax: "lax", ocspStrict: "strict"}).WithPublic()

-// TODO(bdarnell): 3 seconds is the same as base.NetworkTimeout, but
-// we can't use it here due to import cycles. We need a real
-// no-dependencies base package for constants like this.
 var ocspTimeout = settings.RegisterDurationSetting(
 settings.TenantWritable, "security.ocsp.timeout",
 "timeout before considering the OCSP server unreachable",