kvserver: make blind AddSSTable
writes safe
#73047
Labels
A-kv-server
Relating to the KV-level RPC server
C-performance
Perf of queries or internals. Solution not expected to change functional behavior.
T-kv
KV Team
In #72085, we made
AddSSTable
enforce all CockroachDB invariants (e.g. MVCC, timestamp cache, closed timestamp) if the following parameters are given:WriteAtRequestTimestamp
: rewrites SST's MVCC timestamp to the request timestamp, to comply with timestamp cache and closed timestamp.DisallowConflicts
: scans for MVCC conflicts (keys with timestamps at or above the written key).However, the
DisallowConflicts
parameter is particularly expensive, since it requires doing a scan across the key span we're writing into. We should try to makeAddSSTable
enforce these invariants even without it.Since
AddSSTable
cannot be used in transactions, all of its writes are blind (i.e. the transaction hasn't observed any keys in the span before writing). This means that, givenWriteAtRequestTimestamp
respects the timestamp cache, it is not problematic to write below existing MVCC keys' timestamps (if those writers had read anything at relevant timestamps, it would have bumped the tscache). Thus, the only reason we need the scan is to avoid collisions with existing keys, where we replace the key at the exact same timestamp.If we had a way to obtain an MVCC timestamp that was guaranteed to be unused and that was at or in the near future of the request timestamp, then we could do these blind writes and guarantee safety without incurring the cost of the
DisallowConflicts
scan.Note that we cannot e.g. use the current HLC time, because intent resolution does not advance the HLC clock and can be resolved to future timestamps. There are also challenges with future writes due to non-blocking transactions.
Perhaps we could use a Pebble property collector or something to keep track of this.
Jira issue: CRDB-11395
The text was updated successfully, but these errors were encountered: