Commit
This patch reduces the network timeout from 3 seconds to 2 seconds. This change also affects gRPC keepalive intervals/timeouts.

Furthermore, the RPC heartbeat interval is now reduced to half of the network timeout (from 3 seconds to 1 second), with a timeout equal to the network timeout (from 6 seconds to 2 seconds). The gRPC dial timeout is also reduced to the network timeout (from 5 seconds to 2 seconds).

When a peer is unresponsive, these timeouts determine how quickly RPC calls (and thus critical operations such as lease acquisitions) will be retried against a different node. Reducing them therefore improves recovery time during infrastructure outages.

An environment variable `COCKROACH_NETWORK_TIMEOUT` has been introduced to tweak this timeout if needed.

Release note (ops change): The network timeout for RPC connections between cluster nodes has been reduced from 3 seconds to 2 seconds, in order to reduce unavailability and tail latencies during infrastructure outages. This can now be changed via the environment variable `COCKROACH_NETWORK_TIMEOUT`, which defaults to `2s`.
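The relationship between the timeouts described above can be sketched as follows. This is a minimal illustration, not the code from this patch: the type `rpcTimeouts` and the functions `parseNetworkTimeout` and `deriveTimeouts` are hypothetical names, and falling back to the default on an empty or invalid value is an assumption. Only the environment variable name, the `2s` default, and the ratios come from the commit message. gRPC keepalive settings are left out because the message does not give their exact values.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// defaultNetworkTimeout mirrors the new default stated in the commit message.
const defaultNetworkTimeout = 2 * time.Second

// rpcTimeouts is a hypothetical container for the derived values.
type rpcTimeouts struct {
	Network           time.Duration
	HeartbeatInterval time.Duration
	HeartbeatTimeout  time.Duration
	DialTimeout       time.Duration
}

// parseNetworkTimeout parses a duration string such as "2s".
// Assumption: empty, malformed, or non-positive values fall back to the default.
func parseNetworkTimeout(raw string) time.Duration {
	if raw == "" {
		return defaultNetworkTimeout
	}
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return defaultNetworkTimeout
	}
	return d
}

// deriveTimeouts applies the ratios from the commit message:
// heartbeat interval = half the network timeout,
// heartbeat timeout and dial timeout = the network timeout.
func deriveTimeouts(network time.Duration) rpcTimeouts {
	return rpcTimeouts{
		Network:           network,
		HeartbeatInterval: network / 2,
		HeartbeatTimeout:  network,
		DialTimeout:       network,
	}
}

func main() {
	network := parseNetworkTimeout(os.Getenv("COCKROACH_NETWORK_TIMEOUT"))
	t := deriveTimeouts(network)
	fmt.Printf("network=%s heartbeat_interval=%s heartbeat_timeout=%s dial=%s\n",
		t.Network, t.HeartbeatInterval, t.HeartbeatTimeout, t.DialTimeout)
}
```

Under these assumptions, running the program with `COCKROACH_NETWORK_TIMEOUT=4s` would double every derived value, since all of them are expressed in terms of the single network timeout.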