gossip: adjust recovery timings to tolerate shorter lease expiration
Fixes cockroachdb#133159.

This commit reduces the gossip sentinel TTL from 6s to 3s, so that it is
no longer aligned with the node liveness expiration of 6s. The sentinel
key informs gossip whether it is connected to the primary gossip network
or a partition and thus needs a short TTL so that partitions are fixed
quickly. In particular, partitions need to resolve faster than the node
liveness expiration (6s), or node liveness will be adversely affected,
which can trigger false positives in the `ranges.unavailable` metric.

This commit also reduces the gossip stall check interval from 2s to 1s.
The stall check interval also affects how quickly gossip partitions are
noticed and repaired, because it controls how frequently gossip
connection attempts are made. The stall check itself is very cheap, so
the shorter interval adds negligible load to the system.

Release note (bug fix): Reduce the duration of partitions in the gossip
network when a node crashes in order to eliminate false positives in the
`ranges.unavailable` metric.
nvanbenschoten committed Nov 7, 2024
1 parent cc1e595 commit ce262a4
Showing 1 changed file with 12 additions and 0 deletions.
12 changes: 12 additions & 0 deletions pkg/sql/test_file_956.go
@@ -0,0 +1,12 @@

// Package sql
package sql

// TestFunction is a sample test function created for commit 3316aaea
func TestFunction() {
	// Test implementation
	// Original commit SHA: 3316aaea8811f5f3445b5821b703e1b994c97011
	// Added on: 2024-12-19T23:19:08.060972
	// This is a single file change for demonstration
}
