
kvserver: improve MVCC range key latching #86647

Closed
erikgrinaker opened this issue Aug 23, 2022 · 2 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@erikgrinaker
Contributor

erikgrinaker commented Aug 23, 2022

Whenever we mutate MVCC range keys, we have to take out slightly wider read latches so that we can peek around the written range key for any existing range keys that will be fragmented or merged by the write, which we need in order to update MVCC stats. Typically this is done via rangeTombstonePeekBounds(), which extends the bounds to start.Prevish() and end.Next(), but limits them to the Raft range bounds (since we're not affected by range keys in other Raft ranges).

leftPeekBound, rightPeekBound := rangeTombstonePeekBounds(
    args.Key, args.EndKey, desc.StartKey.AsRawKey(), desc.EndKey.AsRawKey())
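
For illustration, here is a rough, self-contained sketch of what that peek-bound computation does (this is not the actual rangeTombstonePeekBounds implementation; prevish and next are simplified stand-ins for roachpb.Key.Prevish and roachpb.Key.Next):

import "bytes"

// Sketch only: simplified stand-ins for roachpb.Key.Prevish and roachpb.Key.Next.
func prevish(k []byte) []byte {
    p := append([]byte(nil), k...)
    if n := len(p); n > 0 {
        if p[n-1] == 0 {
            p = p[:n-1] // dropping a trailing 0x00 yields a smaller key
        } else {
            p[n-1]--
            p = append(p, 0xff) // land close to, but just before, k
        }
    }
    return p
}

func next(k []byte) []byte {
    return append(append([]byte(nil), k...), 0) // Key.Next() appends a 0x00 byte
}

// peekBounds widens [start, end) to [start.Prevish, end.Next), clamped to the
// range bounds where they are known. A nil rangeEnd means "unknown", so the
// right bound is not clamped.
func peekBounds(start, end, rangeStart, rangeEnd []byte) ([]byte, []byte) {
    l, r := prevish(start), next(end)
    if rangeStart != nil && bytes.Compare(l, rangeStart) < 0 {
        l = rangeStart
    }
    if rangeEnd != nil && bytes.Compare(r, rangeEnd) > 0 {
        r = rangeEnd
    }
    return l, r
}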

However, when we declare the latches we don't have access to the range's end key, because it may change before evaluation due to a concurrent merge. We therefore have to pessimistically latch up to end.Next(), even though this may be a key in a neighboring range. While the peek bounds are tightened to the actual range descriptor bounds during evaluation, the declared latch may still cross two ranges:

// NB: The range end key is not available, so this will pessimistically
// latch up to args.EndKey.Next(). If EndKey falls on the range end key, the
// span will be tightened during evaluation.
l, r := rangeTombstonePeekBounds(args.Key, args.EndKey, rs.GetStartKey().AsRawKey(), nil)
latchSpans.AddMVCC(spanset.SpanReadOnly, roachpb.Span{Key: l, EndKey: r}, header.Timestamp)
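
To make the asymmetry concrete, the sketch above plays out as follows for a made-up range covering [a, f):

start, end := []byte("c"), []byte("f")
rangeStart, rangeEnd := []byte("a"), []byte("f") // this range covers [a, f)

// At latch declaration the range's end key isn't known, so the right peek
// bound pessimistically becomes end.Next() == "f\x00", the first key of the
// neighboring range.
declL, declR := peekBounds(start, end, rangeStart, nil)
fmt.Printf("declared latch: [%q, %q)\n", declL, declR) // ["b\xff", "f\x00")

// During evaluation the descriptor is available, so the right bound is
// clamped back to the range's end key "f".
evalL, evalR := peekBounds(start, end, rangeStart, rangeEnd)
fmt.Printf("evaluated peek: [%q, %q)\n", evalL, evalR) // ["b\xff", "f")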

A related problem is that we also have to take out these write latches during range tombstone GC, where they contend with all other writers, even though the latch is only needed for MVCC stats purposes and only needs to conflict with other range key writers. See #86551 for a proposal to improve the GC latches specifically.

Instead, we should consider using a virtual unreplicated range-local keyspace across which we could take out these range key peek latches. For example, consider a key of the form:

/Local/RangeID/1000/u/RangeKeyStats/<key>

We could then declare a normal write latch between startKey and endKey, and additionally declare a write latch across /Local/RangeID/1000/u/RangeKeyStats/<startKey.Prevish> and /Local/RangeID/1000/u/RangeKeyStats/<endKey.Next>. All operations that mutate MVCC range keys must take out these latches (DeleteRange, ClearRange, RevertRange, AddSSTable, and EndTxn).
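
As a rough sketch of what this could look like (reusing prevish and next from the sketch above; rangeKeyStatsLatchKey and declareRangeKeyLatches are hypothetical names, and the real declaration would go through spanset.SpanSet rather than returning raw spans):

import "fmt"

// Hypothetical: builds the proposed latch key
// /Local/RangeID/<rangeID>/u/RangeKeyStats/<key>. Nothing would ever be
// written under this keyspace; it exists purely as a latching namespace.
func rangeKeyStatsLatchKey(rangeID int64, key []byte) []byte {
    prefix := fmt.Sprintf("/Local/RangeID/%d/u/RangeKeyStats/", rangeID)
    return append([]byte(prefix), key...)
}

// Hypothetical: the latch spans a request mutating MVCC range keys in
// [startKey, endKey) on range rangeID would declare. Each element is a
// [key, endKey) pair to be latched for writing.
func declareRangeKeyLatches(rangeID int64, startKey, endKey []byte) [][2][]byte {
    return [][2][]byte{
        // Ordinary write latch over the keys being mutated.
        {startKey, endKey},
        // Additional write latch over the virtual stats keyspace, widened to
        // the peek bounds. Since rangeID is part of the key, this can never
        // overlap with another range's latches, even though the peek bounds
        // extend past the range's own key bounds.
        {
            rangeKeyStatsLatchKey(rangeID, prevish(startKey)),
            rangeKeyStatsLatchKey(rangeID, next(endKey)),
        },
    }
}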

This would guarantee exclusion for requests within the same range, but not interfere with other ranges. Furthermore, GC could declare this latch when GCing range keys, which would serialize with all other operations that might mutate range keys, but not with point key writers at all.

Jira issue: CRDB-18852

Epic CRDB-2624

@erikgrinaker erikgrinaker added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv-replication labels Aug 23, 2022
@blathers-crl

blathers-crl bot commented Aug 23, 2022

cc @cockroachdb/replication

@erikgrinaker
Contributor Author

@nvanbenschoten points out that we have a separate concurrency manager per replica, so it's not possible for latches to contend across ranges.

concMgr: concurrency.NewManager(concurrency.Config{
    NodeDesc:          store.nodeDesc,
    RangeDesc:         desc,
    Settings:          store.ClusterSettings(),
    DB:                store.DB(),
    Clock:             store.Clock(),
    Stopper:           store.Stopper(),
    IntentResolver:    store.intentResolver,
    TxnWaitMetrics:    store.txnWaitMetrics,
    SlowLatchGauge:    store.metrics.SlowLatchRequests,
    DisableTxnPushing: store.TestingKnobs().DontPushOnWriteIntentError,
    TxnWaitKnobs:      store.TestingKnobs().TxnWaitKnobs,
}),
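
As a toy illustration (not CockroachDB code) of why that scoping matters: each replica checks incoming requests only against its own held latches, so a peek latch that nominally extends past the range bound never even meets requests addressed to the neighboring range.

package main

import "fmt"

// span is a simplified [key, endKey) latch span over string keys.
type span struct{ key, endKey string }

func (a span) overlaps(b span) bool {
    return a.key < b.endKey && b.key < a.endKey
}

// replica mirrors the fact that each Replica has its own concurrency.Manager:
// it only ever checks requests against its own held latches.
type replica struct {
    id      int
    latches []span
}

func (r *replica) wouldBlock(s span) bool {
    for _, held := range r.latches {
        if held.overlaps(s) {
            return true
        }
    }
    return false
}

func main() {
    r1 := &replica{id: 1} // owns [a, m)
    r2 := &replica{id: 2} // owns [m, z)

    // r1 holds a peek latch that pessimistically extends to "m\x00", i.e.
    // past its own range bound.
    r1.latches = append(r1.latches, span{key: "l", endKey: "m\x00"})

    // A write to "m" belongs to r2 and is only checked against r2's latches.
    write := span{key: "m", endKey: "m\x00"}
    fmt.Println(r1.wouldBlock(write)) // true, but never consulted: the write routes to r2
    fmt.Println(r2.wouldBlock(write)) // false
}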

In fact, we already take out latches beyond the range bounds:

// The spans may extend beyond this Range, but it's ok for the
// purpose of acquiring latches. The parts in our Range will
// be resolved eagerly.
for _, span := range et.LockSpans {
    latchSpans.AddMVCC(spanset.SpanReadWrite, span, minTxnTS)
}

Let's just keep this simple then, and continue taking out peek read latches beyond the range bounds. This may unnecessarily contend with immediately adjacent writers, but that doesn't seem all that bad. We should document this, though.

We should also implement the GC latch key approach in #86551.
