
kvserver: improve MVCC range key latching #86647

Closed
erikgrinaker opened this issue Aug 23, 2022 · 2 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@erikgrinaker
Contributor

erikgrinaker commented Aug 23, 2022

Whenever we mutate MVCC range keys, we have to take out slightly wider read latches so that we can peek around the written range key for any existing range keys that will be fragmented or merged by the write, which we need in order to update MVCC stats. Typically this is done via rangeTombstonePeekBounds(), which extends the bounds to start.Prevish() and end.Next(), but limits them to the Raft range bounds (since we're not affected by range keys in other Raft ranges).

leftPeekBound, rightPeekBound := rangeTombstonePeekBounds(
    args.Key, args.EndKey, desc.StartKey.AsRawKey(), desc.EndKey.AsRawKey())
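
For illustration, here is a rough, self-contained sketch of what that peek-bound computation does (this is not the actual rangeTombstonePeekBounds implementation; prevish and next are simplified stand-ins for roachpb.Key.Prevish and roachpb.Key.Next):

import "bytes"

// Sketch only: simplified stand-ins for roachpb.Key.Prevish and roachpb.Key.Next.
func prevish(k []byte) []byte {
    p := append([]byte(nil), k...)
    if n := len(p); n > 0 {
        if p[n-1] == 0 {
            p = p[:n-1] // dropping a trailing 0x00 yields a smaller key
        } else {
            p[n-1]--
            p = append(p, 0xff) // land close to, but just before, k
        }
    }
    return p
}

func next(k []byte) []byte {
    return append(append([]byte(nil), k...), 0) // Key.Next() appends a 0x00 byte
}

// peekBounds widens [start, end) to [start.Prevish, end.Next), clamped to the
// range bounds where they are known. A nil rangeEnd means "unknown", so the
// right bound is not clamped.
func peekBounds(start, end, rangeStart, rangeEnd []byte) ([]byte, []byte) {
    l, r := prevish(start), next(end)
    if rangeStart != nil && bytes.Compare(l, rangeStart) < 0 {
        l = rangeStart
    }
    if rangeEnd != nil && bytes.Compare(r, rangeEnd) > 0 {
        r = rangeEnd
    }
    return l, r
}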

However, when we declare the latches we don't have access to the range's end key, because it may change before evaluation due to a concurrent merge. We therefore have to pessimistically latch up to end.Next(), even though this may be a key in a neighboring range. While the peek bounds are tightened to the actual range descriptor bounds during evaluation, the declared latch may still cross two ranges:

// NB: The range end key is not available, so this will pessimistically
// latch up to args.EndKey.Next(). If EndKey falls on the range end key, the
// span will be tightened during evaluation.
l, r := rangeTombstonePeekBounds(args.Key, args.EndKey, rs.GetStartKey().AsRawKey(), nil)
latchSpans.AddMVCC(spanset.SpanReadOnly, roachpb.Span{Key: l, EndKey: r}, header.Timestamp)
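
To make the asymmetry concrete, the sketch above plays out as follows for a made-up range covering [a, f):

start, end := []byte("c"), []byte("f")
rangeStart, rangeEnd := []byte("a"), []byte("f") // this range covers [a, f)

// At latch declaration the range's end key isn't known, so the right peek
// bound pessimistically becomes end.Next() == "f\x00", the first key of the
// neighboring range.
declL, declR := peekBounds(start, end, rangeStart, nil)
fmt.Printf("declared latch: [%q, %q)\n", declL, declR) // ["b\xff", "f\x00")

// During evaluation the descriptor is available, so the right bound is
// clamped back to the range's end key "f".
evalL, evalR := peekBounds(start, end, rangeStart, rangeEnd)
fmt.Printf("evaluated peek: [%q, %q)\n", evalL, evalR) // ["b\xff", "f")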

A related problem is that we also have to take out these write latches during range tombstone GC, where they contend with all other writers, even though the latch is only needed for MVCC stats purposes and only needs to conflict with other range key writers. See #86551 for a proposal to improve the GC latches specifically.

Instead, we should consider using a virtual unreplicated range-local keyspace across which we could take out these range key peek latches. For example, consider a key of the form:

/Local/RangeID/1000/u/RangeKeyStats/<key>

We could then declare a normal write latch between startKey and endKey, and additionally declare a write latch across /Local/RangeID/1000/u/RangeKeyStats/<startKey.Prevish> and /Local/RangeID/1000/u/RangeKeyStats/<endKey.Next>. All operations that mutate MVCC range keys must take out these latches (DeleteRange, ClearRange, RevertRange, AddSSTable, and EndTxn).
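
As a rough sketch of what this could look like (reusing prevish and next from the sketch above; rangeKeyStatsLatchKey and declareRangeKeyLatches are hypothetical names, and the real declaration would go through spanset.SpanSet rather than returning raw spans):

import "fmt"

// Hypothetical: builds the proposed latch key
// /Local/RangeID/<rangeID>/u/RangeKeyStats/<key>. Nothing would ever be
// written under this keyspace; it exists purely as a latching namespace.
func rangeKeyStatsLatchKey(rangeID int64, key []byte) []byte {
    prefix := fmt.Sprintf("/Local/RangeID/%d/u/RangeKeyStats/", rangeID)
    return append([]byte(prefix), key...)
}

// Hypothetical: the latch spans a request mutating MVCC range keys in
// [startKey, endKey) on range rangeID would declare. Each element is a
// [key, endKey) pair to be latched for writing.
func declareRangeKeyLatches(rangeID int64, startKey, endKey []byte) [][2][]byte {
    return [][2][]byte{
        // Ordinary write latch over the keys being mutated.
        {startKey, endKey},
        // Additional write latch over the virtual stats keyspace, widened to
        // the peek bounds. Since rangeID is part of the key, this can never
        // overlap with another range's latches, even though the peek bounds
        // extend past the range's own key bounds.
        {
            rangeKeyStatsLatchKey(rangeID, prevish(startKey)),
            rangeKeyStatsLatchKey(rangeID, next(endKey)),
        },
    }
}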

This would guarantee exclusion for requests within the same range, but not interfere with other ranges. Furthermore, GC could declare this latch when GCing range keys, which would serialize with all other operations that might mutate range keys, but not with point key writers at all.

Jira issue: CRDB-18852

Epic CRDB-2624

@erikgrinaker erikgrinaker added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv-replication labels Aug 23, 2022
@blathers-crl

blathers-crl bot commented Aug 23, 2022

cc @cockroachdb/replication

@erikgrinaker
Contributor Author

@nvanbenschoten points out that we have a separate concurrency manager per replica, so it's not possible for latches to contend across ranges.

concMgr: concurrency.NewManager(concurrency.Config{
    NodeDesc:          store.nodeDesc,
    RangeDesc:         desc,
    Settings:          store.ClusterSettings(),
    DB:                store.DB(),
    Clock:             store.Clock(),
    Stopper:           store.Stopper(),
    IntentResolver:    store.intentResolver,
    TxnWaitMetrics:    store.txnWaitMetrics,
    SlowLatchGauge:    store.metrics.SlowLatchRequests,
    DisableTxnPushing: store.TestingKnobs().DontPushOnWriteIntentError,
    TxnWaitKnobs:      store.TestingKnobs().TxnWaitKnobs,
}),
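
As a toy illustration (not CockroachDB code) of why that scoping matters: each replica checks incoming requests only against its own held latches, so a peek latch that nominally extends past the range bound never even meets requests addressed to the neighboring range.

package main

import "fmt"

// span is a simplified [key, endKey) latch span over string keys.
type span struct{ key, endKey string }

func (a span) overlaps(b span) bool {
    return a.key < b.endKey && b.key < a.endKey
}

// replica mirrors the fact that each Replica has its own concurrency.Manager:
// it only ever checks requests against its own held latches.
type replica struct {
    id      int
    latches []span
}

func (r *replica) wouldBlock(s span) bool {
    for _, held := range r.latches {
        if held.overlaps(s) {
            return true
        }
    }
    return false
}

func main() {
    r1 := &replica{id: 1} // owns [a, m)
    r2 := &replica{id: 2} // owns [m, z)

    // r1 holds a peek latch that pessimistically extends to "m\x00", i.e.
    // past its own range bound.
    r1.latches = append(r1.latches, span{key: "l", endKey: "m\x00"})

    // A write to "m" belongs to r2 and is only checked against r2's latches.
    write := span{key: "m", endKey: "m\x00"}
    fmt.Println(r1.wouldBlock(write)) // true, but never consulted: the write routes to r2
    fmt.Println(r2.wouldBlock(write)) // false
}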

In fact, we already take out latches beyond the range bounds:

// The spans may extend beyond this Range, but it's ok for the
// purpose of acquiring latches. The parts in our Range will
// be resolved eagerly.
for _, span := range et.LockSpans {
    latchSpans.AddMVCC(spanset.SpanReadWrite, span, minTxnTS)
}

Let's just keep this simple then, and continue taking out peek read latches beyond the range bounds. This may unnecessarily contend with immediately adjacent writers, but that doesn't seem all that bad. We should document this, though.

We should also implement the GC latch key approach in #86551.
