Skip to content

Commit

Permalink
Merge #72085
Browse files Browse the repository at this point in the history
72085: kvserver: add MVCC-compliant `AddSSTable` variant r=dt,tbg,sumeerbhola,nvanbenschoten a=erikgrinaker

**storage: tweak `ScanIntents()`**

This renames `ScanIntents` from `ScanSeparatedIntents`, and adds a
context parameter. Callers have been updated to pass
`storage.mvcc.max_intents_per_error` for the intent limit, as is done
elsewhere, and `targetBytes` is passed as 0 for now (no limit) since
other intent collectors currently don't use a byte limit.

Touches #72563.

Release note: None

**roachpb: assert request flag combinations**

Request flags have implicit dependencies and incompatibilities (e.g.
`isLocking` implies `isTxn`). However, these were never checked and
developers were expected to satisfy them manually, which is error-prone.

This patch adds `TestFlagCombinations` that checks these dependencies
and incompatibilities, based on `flagDependencies` and `flagExclusions`
maps which encodes them.

It also adds a new `flag` type for flags, renames `skipLeaseCheck` to
`skipsLeaseCheck`, and adds `isAlone` for `CheckConsistencyRequest`.

Release note: None

**roachpb: add `appliesTSCache` request flag**

Previously, `isIntentWrite` determined whether a request checked the
timestamp cache and possibly pushed its timestamp. However, some
requests may want to check the timestamp cache without writing intents,
notably an MVCC-compliant `AddSSTable` request.

This patch introduces a new flag `appliesTSCache`, and uses it as a
condition for applying the timestamp cache to the request.

Release note: None

**kvserver: add MVCC-compliant `AddSSTable` variant**

`AddSSTable` does not comply with MVCC, the timestamp cache, nor the
closed timestamp, since the SST MVCC timestamps are written exactly as
given and thus can rewrite history.

This patch adds three new parameters that can make `AddSSTable` fully
MVCC-compliant, with a corresponding `MVCCAddSSTable` version gate:

* `WriteAtRequestTimestamp`: rewrites the MVCC timestamps to the request
  timestamp, complying with the timestamp cache and closed timestamp.

* `DisallowConflicts`: checks for any conflicting keys and intents at or
  above the SST's MVCC timestamps, complying with MVCC.

* `DisallowShadowingBelow`: implies `DisallowConflicts`, and also errors
  if shadowing visible keys (but not tombstones). Unlike the existing
  `DisallowShadowing`, this allows shadowing existing keys above the given
  timestamp if the new key has the same value as the existing one, and
  also allows idempotent writes at or above the given timestamp.

The existing `DisallowShadowing` parameter implies `DisallowConflicts`,
and also errors on any existing visible keys below the SST key's
timestamp (but not tombstones). It no longer allows replacing a
tombstone with a value at the exact same timestamp.

Additionally, even blind `AddSSTable` requests that do not check for
conflicts now take out lock spans and scan for existing intents,
returning a `WriteIntentError` to resolve them. This should be cheap
in the common case, since the caller is expected to ensure there are no
concurrent writes over the span, and so there should be no or few
intents.

The `WriteAtRequestTimestamp` SST rewrite implementation here is correct
but slow. Optimizations will be explored later.

Resolves #70422.

Release note: None

Co-authored-by: Erik Grinaker <[email protected]>
  • Loading branch information
craig[bot] and erikgrinaker committed Nov 25, 2021
2 parents 3749427 + 54f5746 commit ebd9bed
Show file tree
Hide file tree
Showing 43 changed files with 2,845 additions and 2,150 deletions.
2 changes: 1 addition & 1 deletion docs/generated/settings/settings-for-tenants.txt
Original file line number Diff line number Diff line change
Expand Up @@ -167,4 +167,4 @@ trace.debug.enable boolean false if set, traces for recent requests can be seen
trace.jaeger.agent string the address of a Jaeger agent to receive traces using the Jaeger UDP Thrift protocol, as <host>:<port>. If no port is specified, 6381 will be used.
trace.opentelemetry.collector string address of an OpenTelemetry trace collector to receive traces using the otel gRPC protocol, as <host>:<port>. If no port is specified, 4317 will be used.
trace.zipkin.collector string the address of a Zipkin instance to receive traces, as <host>:<port>. If no port is specified, 9411 will be used.
version version 21.2-14 set the active cluster version in the format '<major>.<minor>'
version version 21.2-16 set the active cluster version in the format '<major>.<minor>'
2 changes: 1 addition & 1 deletion docs/generated/settings/settings.html
Original file line number Diff line number Diff line change
Expand Up @@ -172,6 +172,6 @@
<tr><td><code>trace.jaeger.agent</code></td><td>string</td><td><code></code></td><td>the address of a Jaeger agent to receive traces using the Jaeger UDP Thrift protocol, as <host>:<port>. If no port is specified, 6381 will be used.</td></tr>
<tr><td><code>trace.opentelemetry.collector</code></td><td>string</td><td><code></code></td><td>address of an OpenTelemetry trace collector to receive traces using the otel gRPC protocol, as <host>:<port>. If no port is specified, 4317 will be used.</td></tr>
<tr><td><code>trace.zipkin.collector</code></td><td>string</td><td><code></code></td><td>the address of a Zipkin instance to receive traces, as <host>:<port>. If no port is specified, 9411 will be used.</td></tr>
<tr><td><code>version</code></td><td>version</td><td><code>21.2-14</code></td><td>set the active cluster version in the format '<major>.<minor>'</td></tr>
<tr><td><code>version</code></td><td>version</td><td><code>21.2-16</code></td><td>set the active cluster version in the format '<major>.<minor>'</td></tr>
</tbody>
</table>
6 changes: 4 additions & 2 deletions pkg/ccl/importccl/bench_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -135,8 +135,10 @@ func benchmarkAddSSTable(b *testing.B, dir string, tables []tableSSTable) {
b.StartTimer()
for _, t := range tables {
totalBytes += int64(len(t.sstData))
require.NoError(b, kvDB.AddSSTable(
ctx, t.span.Key, t.span.EndKey, t.sstData, true /* disallowShadowing */, nil /* stats */, false /*ingestAsWrites */, hlc.Timestamp{},
require.NoError(b, kvDB.AddSSTable(ctx, t.span.Key, t.span.EndKey, t.sstData,
false /* disallowConflicts */, true, /* disallowShadowing */
hlc.Timestamp{} /* disallowShadowingBelow */, nil, /* stats */
false /*ingestAsWrites */, hlc.Timestamp{}, false, /* writeAtBatchTS */
))
}
b.StopTimer()
Expand Down
7 changes: 7 additions & 0 deletions pkg/clusterversion/cockroach_versions.go
Original file line number Diff line number Diff line change
Expand Up @@ -288,6 +288,9 @@ const (
// system.statement_diagnostics_requests table to support collecting stmt
// bundles when the query latency exceeds the user provided threshold.
AlterSystemStmtDiagReqs
// MVCCAddSSTable supports MVCC-compliant AddSSTable requests via the new
// WriteAtRequestTimestamp and DisallowConflicts parameters.
MVCCAddSSTable

// *************************************************
// Step (1): Add new versions here.
Expand Down Expand Up @@ -497,6 +500,10 @@ var versionsSingleton = keyedVersions{
Key: AlterSystemStmtDiagReqs,
Version: roachpb.Version{Major: 21, Minor: 2, Internal: 14},
},
{
Key: MVCCAddSSTable,
Version: roachpb.Version{Major: 21, Minor: 2, Internal: 16},
},

// *************************************************
// Step (2): Add new versions here.
Expand Down
5 changes: 3 additions & 2 deletions pkg/clusterversion/key_string.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

14 changes: 10 additions & 4 deletions pkg/kv/batch.go
Original file line number Diff line number Diff line change
Expand Up @@ -764,9 +764,12 @@ func (b *Batch) adminRelocateRange(
func (b *Batch) addSSTable(
s, e interface{},
data []byte,
disallowConflicts bool,
disallowShadowing bool,
disallowShadowingBelow hlc.Timestamp,
stats *enginepb.MVCCStats,
ingestAsWrites bool,
writeAtRequestTimestamp bool,
) {
begin, err := marshalKey(s)
if err != nil {
Expand All @@ -783,10 +786,13 @@ func (b *Batch) addSSTable(
Key: begin,
EndKey: end,
},
Data: data,
DisallowShadowing: disallowShadowing,
MVCCStats: stats,
IngestAsWrites: ingestAsWrites,
Data: data,
DisallowConflicts: disallowConflicts,
DisallowShadowing: disallowShadowing,
DisallowShadowingBelow: disallowShadowingBelow,
MVCCStats: stats,
IngestAsWrites: ingestAsWrites,
WriteAtRequestTimestamp: writeAtRequestTimestamp,
}
b.appendReqs(req)
b.initResult(1, 0, notRaw, nil)
Expand Down
7 changes: 6 additions & 1 deletion pkg/kv/bulk/sst_batcher.go
Original file line number Diff line number Diff line change
Expand Up @@ -385,10 +385,13 @@ type SSTSender interface {
ctx context.Context,
begin, end interface{},
data []byte,
disallowConflicts bool,
disallowShadowing bool,
disallowShadowingBelow hlc.Timestamp,
stats *enginepb.MVCCStats,
ingestAsWrites bool,
batchTs hlc.Timestamp,
writeAtBatchTs bool,
) error
SplitAndScatter(ctx context.Context, key roachpb.Key, expirationTime hlc.Timestamp) error
}
Expand Down Expand Up @@ -457,7 +460,9 @@ func AddSSTable(
ingestAsWriteBatch = true
}
// This will fail if the range has split but we'll check for that below.
err = db.AddSSTable(ctx, item.start, item.end, item.sstBytes, item.disallowShadowing, &item.stats, ingestAsWriteBatch, batchTs)
err = db.AddSSTable(ctx, item.start, item.end, item.sstBytes, false, /* disallowConflicts */
item.disallowShadowing, hlc.Timestamp{} /* disallowShadowingBelow */, &item.stats,
ingestAsWriteBatch, batchTs, false /* writeAtBatchTs */)
if err == nil {
log.VEventf(ctx, 3, "adding %s AddSSTable [%s,%s) took %v", sz(len(item.sstBytes)), item.start, item.end, timeutil.Since(before))
return nil
Expand Down
3 changes: 3 additions & 0 deletions pkg/kv/bulk/sst_batcher_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -260,10 +260,13 @@ func (m mockSender) AddSSTable(
ctx context.Context,
begin, end interface{},
data []byte,
disallowConflicts bool,
disallowShadowing bool,
disallowShadowingBelow hlc.Timestamp,
_ *enginepb.MVCCStats,
ingestAsWrites bool,
batchTS hlc.Timestamp,
writeAtBatchTS bool,
) error {
return m(roachpb.Span{Key: begin.(roachpb.Key), EndKey: end.(roachpb.Key)})
}
Expand Down
9 changes: 8 additions & 1 deletion pkg/kv/db.go
Original file line number Diff line number Diff line change
Expand Up @@ -664,17 +664,24 @@ func (db *DB) AdminRelocateRange(

// AddSSTable links a file into the RocksDB log-structured merge-tree. Existing
// data in the range is cleared.
//
// The disallowConflicts, disallowShadowingBelow, and writeAtBatchTs parameters
// require the MVCCAddSSTable version gate, as they are new in 22.1.
func (db *DB) AddSSTable(
ctx context.Context,
begin, end interface{},
data []byte,
disallowConflicts bool,
disallowShadowing bool,
disallowShadowingBelow hlc.Timestamp,
stats *enginepb.MVCCStats,
ingestAsWrites bool,
batchTs hlc.Timestamp,
writeAtBatchTs bool,
) error {
b := &Batch{Header: roachpb.Header{Timestamp: batchTs}}
b.addSSTable(begin, end, data, disallowShadowing, stats, ingestAsWrites)
b.addSSTable(begin, end, data, disallowConflicts, disallowShadowing, disallowShadowingBelow,
stats, ingestAsWrites, writeAtBatchTs)
return getOneErr(db.Run(ctx, b), b)
}

Expand Down
1 change: 0 additions & 1 deletion pkg/kv/kvserver/batcheval/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -146,7 +146,6 @@ go_test(
"//pkg/util/uint128",
"//pkg/util/uuid",
"@com_github_cockroachdb_errors//:errors",
"@com_github_kr_pretty//:pretty",
"@com_github_stretchr_testify//assert",
"@com_github_stretchr_testify//require",
],
Expand Down
Loading

0 comments on commit ebd9bed

Please sign in to comment.