Skip to content

Commit

Permalink
kvserver: prevent build-up of abandoned consistency checks
Browse files Browse the repository at this point in the history
We've seen in the events leading up to cockroachdb#75448 that a build-up of
consistency check computations on a node can severely impact node
performance. This commit attempts to address the main source of
that, while re-working the code for easier maintainability.

The way the consistency checker works is by replicating a command through
Raft that, on each Replica, triggers an async checksum computations
the results of which the caller collects via `CollectChecksum` requests
addressed to each `Replica`.

If for any reason, the caller does *not* wait to collect the checksums
but instead moves on to run another consistency check (perhaps on
another Range), these inflight computations can build up over time.

This was the main issue in cockroachdb#75448; we were accidentally canceling the
context on the leaseholder "right away", failing the consistency check
(but leaving it running on all other replicas), and moving on to the
next Range.
As a result, some (but with spread out leaseholders, ultimately all)
Replicas ended up with dozens of consistency check computations,
starving I/O and CPU.  We "addressed" this by avoiding this errant ctx
cancellation (cockroachdb#75448 but longer-term cockroachdb#75656), but this isn't a holistic
fix yet.

In this commit, we make three main changes:

- give the inflight consistency check computations a clean API, which
  makes it much easier to understand "how it works".
- when returning from CollectChecksum (either on success or error,
  notably including context cancellation), cancel the corresponding
  consistency check. This solves the problem, *assuming* that
  CollectChecksum is reliably issued to each Replica.
- reliably issue CollectChecksum to each Replica on which a computation
  may have been triggered. When the caller's context is canceled, still
  do the call with a one-second-timeout one-off Context which should be
  good enough to make it to the Replica and short-circuit the call.

Release note: None
  • Loading branch information
tbg committed Feb 22, 2022
1 parent 01a295e commit 322dd14
Show file tree
Hide file tree
Showing 7 changed files with 293 additions and 270 deletions.
2 changes: 1 addition & 1 deletion pkg/kv/kvserver/consistency_queue_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -231,7 +231,7 @@ func TestCheckConsistencyInconsistent(t *testing.T) {
defer log.Scope(t).Close(t)

// This test prints a consistency checker diff, so it's
// good to make sure we're overly redacting said diff.
// good to make sure we're not overly redacting said diff.
defer log.TestingSetRedactable(true)()

// Test uses sticky registry to have persistent pebble state that could
Expand Down
5 changes: 2 additions & 3 deletions pkg/kv/kvserver/replica.go
Original file line number Diff line number Diff line change
Expand Up @@ -370,6 +370,8 @@ type Replica struct {
// timestamp, independent of the source.
sideTransportClosedTimestamp sidetransportAccess

checksumStorage checksumStorage

mu struct {
// Protects all fields in the mu struct.
syncutil.RWMutex
Expand Down Expand Up @@ -556,9 +558,6 @@ type Replica struct {
// live node will not lose leaseholdership.
lastUpdateTimes lastUpdateTimesMap

// Computed checksum at a snapshot UUID.
checksums map[uuid.UUID]replicaChecksum

// proposalQuota is the quota pool maintained by the lease holder where
// incoming writes acquire quota from a fixed quota pool before going
// through. If there is no quota available, the write is throttled
Expand Down
Loading

0 comments on commit 322dd14

Please sign in to comment.