83213: kvserver: make MVCC GC less disruptive to foreground traffic r=aayushshah15 a=aayushshah15

This commit changes GC requests to no longer declare exclusive latches
at their BatchRequest's timestamp. That behavior was already incorrect, as
explained in #55293.

>The first use is broken because we acquire write latches at the batch
header's timestamp, which is set to time.Now(), so we're only
serializing with reads in the future and all other writes [1]. So we're
disruptive to everyone except who we want to serialize with – reads in
the past!

This commit makes GC requests declare only a non-MVCC exclusive latch
over the `RangeGCThresholdKey`. This is correct because:
```

// 1. We define "correctness" to be the property that a reader reading at /
// around the GC threshold will either see the correct results or receive an
// error.
// 2. Readers perform their command evaluation over a stable snapshot of the
// storage engine. This means that the reader will not see the effects of a
// subsequent GC run as long as it created a Pebble iterator before the GC
// request.
// 3. A reader checks the in-memory GC threshold of a Replica after it has
// created this snapshot (i.e. after a Pebble iterator has been created).
// 4. If the in-memory GC threshold is above the timestamp of the read, the
// reader receives an error. Otherwise, the reader is guaranteed to see a
// state of the storage engine that hasn't been affected by the GC request [5].
// 5. GC requests bump the in-memory GC threshold of a Replica as a pre-apply
// side effect. This means that if a reader checks the in-memory GC threshold
// after it has created a Pebble iterator, it is impossible for the iterator
// to point to a storage engine state that has been affected by the GC
// request.

```
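
As an illustration of that ordering, here is a minimal sketch (with hypothetical names and simplified timestamps; this is not the actual Replica code) of a reader that creates its storage snapshot before checking the in-memory GC threshold:

```
package kvserversketch

import "fmt"

// iterator and replicaSketch are hypothetical stand-ins for the real Pebble
// iterator and Replica types.
type iterator interface{ Close() }

type replicaSketch struct {
	newSnapshotIter func() iterator // creates a stable engine snapshot (step 2)
	gcThreshold     func() int64    // in-memory GC threshold, simplified to a wall time
}

// evalRead follows the ordering above: snapshot first, threshold check second.
// Because GC bumps the in-memory threshold as a pre-apply side effect (step 5),
// any GC run missed by the check must have started after the iterator was
// created, so the iterator still observes pre-GC state.
func evalRead(r *replicaSketch, readTS int64) error {
	iter := r.newSnapshotIter()
	defer iter.Close()

	if threshold := r.gcThreshold(); readTS <= threshold { // steps 3-4
		return fmt.Errorf("read at %d is below the GC threshold %d", readTS, threshold)
	}
	// Safe to evaluate the read over iter; a concurrent GC cannot affect it.
	return nil
}
```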

As a result, GC requests should now be much less disruptive to
foreground traffic since they're no longer redundantly declaring
exclusive latches over global keys.
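
For reference, a rough sketch of what the narrower declaration could look like; the function and package names are illustrative, and `spanset.AddNonMVCC` / `keys.RangeGCThresholdKey` are used on the assumption that they match the real helpers:

```
package gclatchsketch

import (
	"github.com/cockroachdb/cockroach/pkg/keys"
	"github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset"
	"github.com/cockroachdb/cockroach/pkg/roachpb"
)

// declareGCThresholdLatch sketches the new behavior: GC declares a single
// non-MVCC exclusive latch over the range-local GC threshold key instead of
// write latches over the global keyspace at the batch timestamp, so it only
// serializes with requests that read or bump the threshold.
func declareGCThresholdLatch(rangeID roachpb.RangeID, latchSpans *spanset.SpanSet) {
	latchSpans.AddNonMVCC(spanset.SpanReadWrite, roachpb.Span{
		Key: keys.RangeGCThresholdKey(rangeID),
	})
}
```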

Resolves #55293

Release note (performance improvement): MVCC garbage collection should
now be much less disruptive to foreground traffic than before.


85231: backupccl: add RESTORE with schema_only r=dt a=msbutler

Fixes #83470

Release note (sql change): This PR adds the schema_only flag to RESTORE,
allowing a user to run a normal RESTORE without restoring any user table data.
This can be used to quickly validate that a given backup is restorable. A
schema_only restore's runtime is O(# of descriptors), which is a fraction of a
regular restore's runtime of O(# of table rows).

Note that during a cluster-level schema_only restore, the system tables are
read from S3 and written to disk, as this provides important validation
coverage without much runtime cost (system tables should not be large).

After running a successful schema_only RESTORE, the user can revert the cluster
to its pre-restore state by simply dropping the descriptors the schema_only
restore added (e.g. if the user restored a database, they can drop the database
after the restore completes). Note that in the cluster-level case, the restored
system data cannot be reverted; this shouldn't matter, as the cluster was empty
beforehand.

For the backup validation use case, RESTORE with schema_only provides near-total
validation coverage. In other words, if a user's schema_only RESTORE works, they
can be quite confident that a real RESTORE will work. There's one notable place
where schema_only RESTORE lacks coverage:

It doesn't read (or write) any of the SSTs that store backed-up user table
data. To ensure a backup's SSTs are where the RESTORE command would expect them
to be, a user should run SHOW BACKUP ... with check_files. Further, an upcoming
patch will introduce another flag for RESTORE validation -- the
verify_backup_table_data flag -- which extends schema_only functionality to read
the table data from S3 and conduct checksums on it. As with the schema_only
flag, no table data will be ingested into the cluster.
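
As a rough sketch of the validation flow described above (the connection string, backup URI, and database name are illustrative assumptions, not values from this PR):

```
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // CockroachDB speaks the Postgres wire protocol.
)

func main() {
	// Illustrative connection string; adjust for your cluster.
	db, err := sql.Open("postgres",
		"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	stmts := []string{
		// Check that the backup's SST files are where RESTORE expects them.
		`SHOW BACKUP 'nodelocal://0/backup' WITH check_files`,
		// Validate restorability without ingesting any user table data.
		`RESTORE DATABASE bank FROM 'nodelocal://0/backup' WITH schema_only`,
		// Revert to the pre-restore state by dropping what the restore added.
		`DROP DATABASE bank`,
	}
	for _, stmt := range stmts {
		if _, err := db.Exec(stmt); err != nil {
			log.Fatalf("%s: %v", stmt, err)
		}
	}
}
```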

85695: colexec: add support for ILIKE and NOT ILIKE r=yuzefovich a=yuzefovich

**colexec: clean up NOT LIKE operator generation**

This commit cleans up the way we generate operators for NOT LIKE.
Previously, NOT LIKE got its own copy of the operator that was identical to
the LIKE one except for a single line; now the same underlying operator
handles both LIKE and NOT LIKE, and the result of the comparison just needs
to be negated. The performance hit of this extra boolean comparison is
negligible, yet it lets us remove some of the duplicated generated code.

```
name                                     old time/op    new time/op    delta
LikeOps/selPrefixBytesBytesConstOp-24      17.8µs ± 1%    16.9µs ± 0%   -4.93%  (p=0.000 n=10+10)
LikeOps/selSuffixBytesBytesConstOp-24      18.5µs ± 0%    18.7µs ± 0%   +1.37%  (p=0.000 n=10+10)
LikeOps/selContainsBytesBytesConstOp-24    27.8µs ± 0%    28.0µs ± 0%   +1.02%  (p=0.000 n=9+10)
LikeOps/selRegexpBytesBytesConstOp-24       479µs ± 1%     484µs ± 0%   +1.10%  (p=0.000 n=10+10)
LikeOps/selSkeletonBytesBytesConstOp-24    39.9µs ± 0%    40.3µs ± 0%   +0.85%  (p=0.000 n=10+10)
LikeOps/selRegexpSkeleton-24                871µs ± 2%     871µs ± 0%     ~     (p=1.000 n=10+10)
```
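
Conceptually, the shared operator looks something like this sketch (illustrative types only, not the generated colexec code):

```
package colexecsketch

import "bytes"

// selPrefixOp stands in for a generated selection operator: a single
// implementation serves both LIKE 'foo%' and NOT LIKE 'foo%' by negating
// the comparison result when requested.
type selPrefixOp struct {
	prefix []byte
	negate bool // true for NOT LIKE
}

func (o *selPrefixOp) matches(arg []byte) bool {
	res := bytes.HasPrefix(arg, o.prefix)
	if o.negate {
		return !res
	}
	return res
}
```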

Release note: None

**colexec: add support for ILIKE and NOT ILIKE**

This commit adds native vectorized support for ILIKE and NOT ILIKE
comparisons. The idea is simple: convert both the argument and the pattern
to upper case. This required minor changes to the templates to add a
"prelude" step for that conversion, as well as converting the pattern to
upper case during planning.

Initially, I generated separate operators for the case-insensitive cases,
but the benchmarks showed that the performance impact of a single
conditional inside the `for` loop is barely noticeable given that the
branch prediction will always be right, so I refactored the existing
operators to support case insensitivity.

```
name                                     old time/op    new time/op    delta
LikeOps/selPrefixBytesBytesConstOp-24      16.8µs ± 0%    17.7µs ± 0%  +5.30%  (p=0.000 n=10+10)
LikeOps/selSuffixBytesBytesConstOp-24      18.7µs ± 0%    19.2µs ± 0%  +2.99%  (p=0.000 n=10+10)
LikeOps/selContainsBytesBytesConstOp-24    28.0µs ± 0%    27.8µs ± 0%  -0.73%  (p=0.000 n=10+10)
LikeOps/selRegexpBytesBytesConstOp-24       479µs ± 0%     480µs ± 0%  +0.33%  (p=0.008 n=9+10)
LikeOps/selSkeletonBytesBytesConstOp-24    40.2µs ± 0%    41.4µs ± 0%  +3.20%  (p=0.000 n=9+10)
LikeOps/selRegexpSkeleton-24                860µs ± 0%     857µs ± 0%  -0.36%  (p=0.023 n=10+10)
```
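
A rough sketch of that approach (again illustrative, not the real template output): the pattern prefix is upper-cased once during planning, and a per-row prelude upper-cases the argument before the existing comparison runs.

```
package colexecsketch

import "bytes"

// selCaseInsensitivePrefixOp sketches ILIKE handling on top of the existing
// prefix operator: the pattern prefix is already upper-cased at planning
// time, and each argument is upper-cased as a prelude step when the operator
// runs in case-insensitive mode.
type selCaseInsensitivePrefixOp struct {
	upperPrefix     []byte // upper-cased during planning
	caseInsensitive bool   // true for ILIKE / NOT ILIKE
}

func (o *selCaseInsensitivePrefixOp) matches(arg []byte) bool {
	if o.caseInsensitive {
		arg = bytes.ToUpper(arg) // prelude conversion, done per row
	}
	return bytes.HasPrefix(arg, o.upperPrefix)
}
```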

Addresses: #49781.

Release note (performance improvement): ILIKE and NOT ILIKE filters can
now be evaluated more efficiently in some cases.

85731: rowexec: allow ordered joinReader to stream matches to the first row r=DrewKimball a=DrewKimball

Currently the `joinReaderOrderingStrategy` implementation buffers all looked-up
rows before matching them with input rows and emitting them. This is necessary
because the looked-up rows may not be received in input order (which must be
maintained). However, rows that match the first input row can be emitted
immediately. When many rows match the first input row, this can decrease the
overhead of the buffer. Additionally, this change can allow a limit to be
satisfied earlier, which can significantly decrease latency. This is especially
advantageous when there is only one input row, since all lookups can then be
rendered and returned in a streaming fashion.
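
A simplified sketch of the idea (hypothetical types; the real `joinReaderOrderingStrategy` tracks considerably more state):

```
package rowexecsketch

// lookedUpRow is a hypothetical stand-in for a row returned by the lookup.
type lookedUpRow struct {
	inputIdx int      // index of the input row this lookup matched
	data     []string // rendered output columns
}

// orderingEmitter sketches the optimization: matches for the first
// not-yet-emitted input row are streamed downstream immediately (nothing can
// precede them in the required ordering), while matches for later input rows
// are still buffered until all earlier input rows have been fully emitted.
type orderingEmitter struct {
	firstUnemitted int                   // first input row still being emitted
	buffered       map[int][]lookedUpRow // out-of-order matches, keyed by input row
	out            chan<- lookedUpRow
}

func newOrderingEmitter(out chan<- lookedUpRow) *orderingEmitter {
	return &orderingEmitter{buffered: make(map[int][]lookedUpRow), out: out}
}

func (e *orderingEmitter) onLookedUpRow(r lookedUpRow) {
	if r.inputIdx == e.firstUnemitted {
		// Streaming path: can satisfy a LIMIT early, especially when there is
		// a single input row, where every lookup takes this branch.
		e.out <- r
		return
	}
	e.buffered[r.inputIdx] = append(e.buffered[r.inputIdx], r)
}
```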

Release note (performance improvement): The execution engine can now
short-circuit execution of lookup joins in more cases, which can decrease
latency for queries with limits.

85809: ui: fix time window selection with mouse on Metrics charts r=koorosh a=koorosh

This patch fixes an issue that prevented proper time range selection
with the mouse on Metrics charts. The root cause was that the updated
time scale object didn't include the correct value of `windowSize`,
which is used to calculate the `start` position of the time range.

Release note (ui change): Fixed an issue with the incorrect start time
position of the selected time range on the Metrics page.

Resolves: #84001

Co-authored-by: Aayush Shah <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: DrewKimball <[email protected]>
Co-authored-by: Andrii Vorobiov <[email protected]>
6 people committed Aug 9, 2022
6 parents 6fe360e + 3e0270a + b811739 + 02b32a6 + d413fe7 + d46ba25 commit 7c18668
Showing 40 changed files with 1,646 additions and 1,506 deletions.
2 changes: 2 additions & 0 deletions docs/generated/sql/bnf/stmt_block.bnf
@@ -1266,6 +1266,7 @@ unreserved_keyword ::=
| 'RUNNING'
| 'SCHEDULE'
| 'SCHEDULES'
| 'SCHEMA_ONLY'
| 'SCROLL'
| 'SETTING'
| 'SETTINGS'
@@ -2492,6 +2493,7 @@ restore_options ::=
| 'NEW_DB_NAME' '=' string_or_placeholder
| 'INCREMENTAL_LOCATION' '=' string_or_placeholder_opt_list
| 'TENANT' '=' string_or_placeholder
| 'SCHEMA_ONLY'

scrub_option_list ::=
( scrub_option ) ( ( ',' scrub_option ) )*
4 changes: 4 additions & 0 deletions pkg/ccl/backupccl/backup_telemetry.go
@@ -65,6 +65,7 @@ const (
telemetryOptionSkipMissingSequenceOwners = "skip_missing_sequence_owners"
telemetryOptionSkipMissingViews = "skip_missing_views"
telemetryOptionSkipLocalitiesCheck = "skip_localities_check"
telemetryOptionSchemaOnly = "schema_only"
)

// logBackupTelemetry publishes an eventpb.RecoveryEvent about a manually
@@ -397,6 +398,9 @@ func logRestoreTelemetry(
if opts.Detached {
options = append(options, telemetryOptionDetached)
}
if opts.SchemaOnly {
options = append(options, telemetryOptionSchemaOnly)
}
sort.Strings(options)

event := &eventpb.RecoveryEvent{
168 changes: 89 additions & 79 deletions pkg/ccl/backupccl/backup_test.go
@@ -2674,99 +2674,109 @@ func TestBackupRestoreDuringUserDefinedTypeChange(t *testing.T) {
}

for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
// Protects numTypeChangesStarted and numTypeChangesFinished.
var mu syncutil.Mutex
numTypeChangesStarted := 0
numTypeChangesFinished := 0
typeChangesStarted := make(chan struct{})
waitForBackup := make(chan struct{})
typeChangesFinished := make(chan struct{})
_, sqlDB, _, cleanupFn := backupRestoreTestSetupWithParams(t, singleNode, 0, InitManualReplication, base.TestClusterArgs{
ServerArgs: base.TestServerArgs{
Knobs: base.TestingKnobs{
SQLTypeSchemaChanger: &sql.TypeSchemaChangerTestingKnobs{
RunBeforeEnumMemberPromotion: func(context.Context) error {
mu.Lock()
if numTypeChangesStarted < len(tc.queries) {
numTypeChangesStarted++
if numTypeChangesStarted == len(tc.queries) {
close(typeChangesStarted)
for _, isSchemaOnly := range []bool{true, false} {
suffix := ""
if isSchemaOnly {
suffix = "-schema-only"
}
t.Run(tc.name+suffix, func(t *testing.T) {
// Protects numTypeChangesStarted and numTypeChangesFinished.
var mu syncutil.Mutex
numTypeChangesStarted := 0
numTypeChangesFinished := 0
typeChangesStarted := make(chan struct{})
waitForBackup := make(chan struct{})
typeChangesFinished := make(chan struct{})
_, sqlDB, _, cleanupFn := backupRestoreTestSetupWithParams(t, singleNode, 0, InitManualReplication, base.TestClusterArgs{
ServerArgs: base.TestServerArgs{
Knobs: base.TestingKnobs{
SQLTypeSchemaChanger: &sql.TypeSchemaChangerTestingKnobs{
RunBeforeEnumMemberPromotion: func(context.Context) error {
mu.Lock()
if numTypeChangesStarted < len(tc.queries) {
numTypeChangesStarted++
if numTypeChangesStarted == len(tc.queries) {
close(typeChangesStarted)
}
mu.Unlock()
<-waitForBackup
} else {
mu.Unlock()
}
mu.Unlock()
<-waitForBackup
} else {
mu.Unlock()
}
return nil
return nil
},
},
},
},
},
})
defer cleanupFn()
})
defer cleanupFn()

// Create a database with a type.
sqlDB.Exec(t, `
// Create a database with a type.
sqlDB.Exec(t, `
CREATE DATABASE d;
CREATE TYPE d.greeting AS ENUM ('hello', 'howdy', 'hi');
`)

// Start ALTER TYPE statement(s) that will block.
for _, query := range tc.queries {
go func(query string, totalQueries int) {
// Note we don't use sqlDB.Exec here because we can't Fatal from within a goroutine.
if _, err := sqlDB.DB.ExecContext(context.Background(), query); err != nil {
t.Error(err)
}
mu.Lock()
numTypeChangesFinished++
if numTypeChangesFinished == totalQueries {
close(typeChangesFinished)
}
mu.Unlock()
}(query, len(tc.queries))
}

// Wait on the type changes to start.
<-typeChangesStarted
// Start ALTER TYPE statement(s) that will block.
for _, query := range tc.queries {
go func(query string, totalQueries int) {
// Note we don't use sqlDB.Exec here because we can't Fatal from within a goroutine.
if _, err := sqlDB.DB.ExecContext(context.Background(), query); err != nil {
t.Error(err)
}
mu.Lock()
numTypeChangesFinished++
if numTypeChangesFinished == totalQueries {
close(typeChangesFinished)
}
mu.Unlock()
}(query, len(tc.queries))
}

// Now create a backup while the type change job is blocked so that
// greeting is backed up with some enum members in READ_ONLY state.
sqlDB.Exec(t, `BACKUP DATABASE d TO 'nodelocal://0/test/'`)
// Wait on the type changes to start.
<-typeChangesStarted

// Let the type change finish.
close(waitForBackup)
<-typeChangesFinished
// Now create a backup while the type change job is blocked so that
// greeting is backed up with some enum members in READ_ONLY state.
sqlDB.Exec(t, `BACKUP DATABASE d TO 'nodelocal://0/test/'`)

// Now drop the database and restore.
sqlDB.Exec(t, `DROP DATABASE d`)
sqlDB.Exec(t, `RESTORE DATABASE d FROM 'nodelocal://0/test/'`)
// Let the type change finish.
close(waitForBackup)
<-typeChangesFinished

// The type change job should be scheduled and finish. Note that we can't use
// sqlDB.CheckQueryResultsRetry as it Fatal's upon an error. The case below
// will error until the job completes.
for i, query := range tc.succeedAfter {
testutils.SucceedsSoon(t, func() error {
_, err := sqlDB.DB.ExecContext(context.Background(), query)
return err
})
sqlDB.CheckQueryResults(t, query, [][]string{{tc.expectedSuccess[i]}})
}
// Now drop the database and restore.
sqlDB.Exec(t, `DROP DATABASE d`)
restoreQuery := `RESTORE DATABASE d FROM 'nodelocal://0/test/'`
if isSchemaOnly {
restoreQuery = restoreQuery + " with schema_only"
}
sqlDB.Exec(t, restoreQuery)

// The type change job should be scheduled and finish. Note that we can't use
// sqlDB.CheckQueryResultsRetry as it Fatal's upon an error. The case below
// will error until the job completes.
for i, query := range tc.succeedAfter {
testutils.SucceedsSoon(t, func() error {
_, err := sqlDB.DB.ExecContext(context.Background(), query)
return err
})
sqlDB.CheckQueryResults(t, query, [][]string{{tc.expectedSuccess[i]}})
}

for i, query := range tc.errorAfter {
testutils.SucceedsSoon(t, func() error {
_, err := sqlDB.DB.ExecContext(context.Background(), query)
if err == nil {
return errors.New("expected error, found none")
}
if !testutils.IsError(err, tc.expectedError[i]) {
return errors.Newf("expected error %q, found %v", tc.expectedError[i], pgerror.FullError(err))
}
return nil
})
}
})
for i, query := range tc.errorAfter {
testutils.SucceedsSoon(t, func() error {
_, err := sqlDB.DB.ExecContext(context.Background(), query)
if err == nil {
return errors.New("expected error, found none")
}
if !testutils.IsError(err, tc.expectedError[i]) {
return errors.Newf("expected error %q, found %v", tc.expectedError[i], pgerror.FullError(err))
}
return nil
})
}
})
}
}
}

36 changes: 25 additions & 11 deletions pkg/ccl/backupccl/backuprand/backup_rand_test.go
@@ -32,7 +32,8 @@ import (
// TestBackupRestoreRandomDataRoundtrips conducts backup/restore roundtrips on
// randomly generated tables and verifies their data and schema are preserved.
// It tests that full database backup as well as all subsets of per-table backup
// roundtrip properly.
// roundtrip properly. 50% of the time, the test runs the restore with the
// schema_only parameter, which does not restore any rows from user tables.
func TestBackupRestoreRandomDataRoundtrips(t *testing.T) {
defer leaktest.AfterTest(t)()
defer log.Scope(t).Close(t)
@@ -66,6 +67,11 @@ func TestBackupRestoreRandomDataRoundtrips(t *testing.T) {
}
numInserts := 20

runSchemaOnlyExtension := ""
if rng.Intn(10)%2 == 0 {
runSchemaOnlyExtension = ", schema_only"
}

tables := sqlDB.Query(t, `SELECT name FROM crdb_internal.tables WHERE
database_name = 'rand' AND schema_name = 'public'`)
var tableNames []string
@@ -87,7 +93,9 @@ database_name = 'rand' AND schema_name = 'public'`)
expectedData := make(map[string][][]string)
for _, tableName := range tableNames {
expectedCreateTableStmt[tableName] = sqlDB.QueryStr(t, fmt.Sprintf(`SELECT create_statement FROM [SHOW CREATE TABLE %s]`, tableName))[0][0]
expectedData[tableName] = sqlDB.QueryStr(t, fmt.Sprintf(`SELECT * FROM %s`, tableName))
if runSchemaOnlyExtension == "" {
expectedData[tableName] = sqlDB.QueryStr(t, fmt.Sprintf(`SELECT * FROM %s`, tableName))
}
}

// Now that we've created our random tables, backup and restore the whole DB
@@ -97,12 +105,12 @@ database_name = 'rand' AND schema_name = 'public'`)
tablesBackup := localFoo + "alltables"
dbBackups := []string{dbBackup, tablesBackup}
if err := backuputils.VerifyBackupRestoreStatementResult(
t, sqlDB, "BACKUP DATABASE rand TO $1", dbBackup,
t, sqlDB, "BACKUP DATABASE rand INTO $1", dbBackup,
); err != nil {
t.Fatal(err)
}
if err := backuputils.VerifyBackupRestoreStatementResult(
t, sqlDB, "BACKUP TABLE rand.* TO $1", tablesBackup,
t, sqlDB, "BACKUP TABLE rand.* INTO $1", tablesBackup,
); err != nil {
t.Fatal(err)
}
@@ -118,7 +126,12 @@ database_name = 'rand' AND schema_name = 'public'`)
fmt.Sprintf(`SELECT create_statement FROM [SHOW CREATE TABLE %s]`, restoreTable))[0][0]
assert.Equal(t, expectedCreateTableStmt[tableName], createStmt,
"SHOW CREATE %s not equal after RESTORE", tableName)
sqlDB.CheckQueryResults(t, fmt.Sprintf(`SELECT * FROM %s`, tableName), expectedData[tableName])
if runSchemaOnlyExtension == "" {
sqlDB.CheckQueryResults(t, fmt.Sprintf(`SELECT * FROM %s`, restoreTable), expectedData[tableName])
} else {
sqlDB.CheckQueryResults(t, fmt.Sprintf(`SELECT count(*) FROM %s`, restoreTable),
[][]string{{"0"}})
}
}
}

@@ -128,17 +141,17 @@ database_name = 'rand' AND schema_name = 'public'`)
for _, backup := range dbBackups {
sqlDB.Exec(t, "DROP DATABASE IF EXISTS restoredb")
sqlDB.Exec(t, "CREATE DATABASE restoredb")
tableQuery := fmt.Sprintf("RESTORE rand.* FROM LATEST IN $1 WITH OPTIONS (into_db='restoredb'%s)", runSchemaOnlyExtension)
if err := backuputils.VerifyBackupRestoreStatementResult(
t, sqlDB, "RESTORE rand.* FROM $1 WITH OPTIONS (into_db='restoredb')", backup,
t, sqlDB, tableQuery, backup,
); err != nil {
t.Fatal(err)
}
verifyTables(t, tableNames)
sqlDB.Exec(t, "DROP DATABASE IF EXISTS restoredb")

if err := backuputils.VerifyBackupRestoreStatementResult(
t, sqlDB, "RESTORE DATABASE rand FROM $1 WITH OPTIONS (new_db_name='restoredb')", backup,
); err != nil {
dbQuery := fmt.Sprintf("RESTORE DATABASE rand FROM LATEST IN $1 WITH OPTIONS (new_db_name='restoredb'%s)", runSchemaOnlyExtension)
if err := backuputils.VerifyBackupRestoreStatementResult(t, sqlDB, dbQuery, backup); err != nil {
t.Fatal(err)
}
verifyTables(t, tableNames)
@@ -155,8 +168,9 @@ database_name = 'rand' AND schema_name = 'public'`)
}
tables := strings.Join(combo, ", ")
t.Logf("Testing subset backup/restore %s", tables)
sqlDB.Exec(t, fmt.Sprintf(`BACKUP TABLE %s TO $1`, tables), backupTarget)
_, err := tc.Conns[0].Exec(fmt.Sprintf("RESTORE TABLE %s FROM $1 WITH OPTIONS (into_db='restoredb')", tables),
sqlDB.Exec(t, fmt.Sprintf(`BACKUP TABLE %s INTO $1`, tables), backupTarget)
_, err := tc.Conns[0].Exec(
fmt.Sprintf("RESTORE TABLE %s FROM LATEST IN $1 WITH OPTIONS (into_db='restoredb' %s)", tables, runSchemaOnlyExtension),
backupTarget)
if err != nil {
if strings.Contains(err.Error(), "skip_missing_foreign_keys") {
5 changes: 0 additions & 5 deletions pkg/ccl/backupccl/restore_data_processor.go
@@ -17,7 +17,6 @@ import (
"github.com/cockroachdb/cockroach/pkg/ccl/backupccl/backuppb"
"github.com/cockroachdb/cockroach/pkg/ccl/storageccl"
"github.com/cockroachdb/cockroach/pkg/cloud"
"github.com/cockroachdb/cockroach/pkg/jobs/jobspb"
"github.com/cockroachdb/cockroach/pkg/keys"
"github.com/cockroachdb/cockroach/pkg/kv/bulk"
"github.com/cockroachdb/cockroach/pkg/roachpb"
@@ -138,10 +137,6 @@ func newRestoreDataProcessor(
) (execinfra.Processor, error) {
sv := &flowCtx.Cfg.Settings.SV

if spec.Validation != jobspb.RestoreValidation_DefaultRestore {
return nil, errors.New("Restore Data Processor does not support validation yet")
}

rd := &restoreDataProcessor{
flowCtx: flowCtx,
input: input,