Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: migrate existing databases to the rbr system table #94843

Closed
jeffswenson opened this issue Jan 6, 2023 · 1 comment
Closed

server: migrate existing databases to the rbr system table #94843

jeffswenson opened this issue Jan 6, 2023 · 1 comment
Assignees
Labels
A-multiregion Related to multi-region C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-shared-systems Shared Systems Team

Comments

@jeffswenson
Copy link
Collaborator

jeffswenson commented Jan 6, 2023

#85736 added alternative implementations of system.sqlliveness, system.sql_instances, and system.lease that have a primary key index compatible with regional by row (RBR) tables. Using the RBR compatible tables requires setting the COCKROACH_MR_SYSTEM_DATABASE environment variable.

We should implement a migration that converts the legacy system tables into this new format. That allows us to get rid of the environment variable and migrate tenants from single region hosts into multi-region hosts.

Epic: CRDB-18596

Jira issue: CRDB-23180

@jeffswenson jeffswenson added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-multiregion Related to multi-region T-shared-systems Shared Systems Team labels Jan 6, 2023
@jeffswenson jeffswenson self-assigned this Jan 6, 2023
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Feb 22, 2023
The SessionID encodes a region prefix and a UUID. The initial
implementation of RBR sqlliveness stored the crdb_region and the
session_uuid. Now, sqlliveness stores the crdb_region and value of the
SessionID.

```
SELECT * FROM system.sqlliveness;
                 session_id                |           expiration           | crdb_region
-------------------------------------------+--------------------------------+--------------
  \x01018028b91818a7584268992bb246f9755b37 | 1676502803528637593.0000000000 | \x80
(1 row)
```

Storing the complete SessionID takes an extra 3 bytes of storage but it
makes it simpler to inspect the state of the system and join with tables
that refer to the session_id. For example, it's possible to write this
query:

```
SELECT *
FROM system.sqlliveness AS s
LEFT JOIN ON system.jobs AS j
WHERE s.session_id = j.session_id
AND s.crdb_region = 'us-central-1';
```

Part of cockroachdb#94843

Release note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Feb 28, 2023
The SessionID encodes a region prefix and a UUID. The initial
implementation of RBR sqlliveness stored the crdb_region and the
session_uuid. Now, sqlliveness stores the crdb_region and value of the
SessionID.

```
SELECT * FROM system.sqlliveness;
                 session_id                |           expiration           | crdb_region
-------------------------------------------+--------------------------------+--------------
  \x01018028b91818a7584268992bb246f9755b37 | 1676502803528637593.0000000000 | \x80
(1 row)
```

Storing the complete SessionID takes an extra 3 bytes of storage but it
makes it simpler to inspect the state of the system and join with tables
that refer to the session_id. For example, it's possible to write this
query:

```
SELECT *
FROM system.sqlliveness AS s
LEFT JOIN ON system.jobs AS j
WHERE s.session_id = j.session_id
AND s.crdb_region = 'us-central-1';
```

Part of cockroachdb#94843

Release note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 2, 2023
Add the settingswatcher.VersionGuard utility. VersionGuard makes it
possible to check a cluster version in the context of a transaction.
Until a version gate is reached, VersionGuard will look up the version
set in the settings table. Once a version is reached, the guard can skip
the KV read because versions never go backwards.

The version guard will be used to migrate the sqlinstance, sql_liveness,
lease system tables to a regional by row compatible index. The in-memory
IsActive check is insufficient for the migration, because the setting
can change in between when the transaction starts and when the
transaction commits.

Part of cockroachdb#94843
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 3, 2023
Add the settingswatcher.VersionGuard utility. VersionGuard makes it
possible to check a cluster version in the context of a transaction.
Until a version gate is reached, VersionGuard will look up the version
set in the settings table. Once a version is reached, the guard can skip
the KV read because versions never go backwards.

The version guard will be used to migrate the sqlinstance, sql_liveness,
lease system tables to a regional by row compatible index. The in-memory
IsActive check is insufficient for the migration, because the setting
can change in between when the transaction starts and when the
transaction commits.

Part of cockroachdb#94843
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 3, 2023
Add the settingswatcher.VersionGuard utility. VersionGuard makes it
possible to check a cluster version in the context of a transaction.
Until a version gate is reached, VersionGuard will look up the version
set in the settings table. Once a version is reached, the guard can skip
the KV read because versions never go backwards.

The version guard will be used to migrate the sqlinstance, sql_liveness,
lease system tables to a regional by row compatible index. The in-memory
IsActive check is insufficient for the migration, because the setting
can change in between when the transaction starts and when the
transaction commits.

Part of cockroachdb#94843
craig bot pushed a commit that referenced this issue Mar 3, 2023
97935: settingswatcher: add version guard utility r=JeffSwenson a=JeffSwenson

Add the settingswatcher.VersionGuard utility. VersionGuard makes it possible to check a cluster version in the context of a transaction. Until a version gate is reached, VersionGuard will look up the version set in the settings table. Once a version is reached, the guard can skip the KV read because versions never go backwards.

The version guard will be used to migrate the sqlinstance, sql_liveness, lease system tables to a regional by row compatible index. The in-memory IsActive check is insufficient for the migration, because the setting can change in between when the transaction starts and when the transaction commits.

Part of #94843

Co-authored-by: Jeff <[email protected]>
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 7, 2023
Previously, if the rangfeedcache.Watcher ran into an auth error, it was
never reported to the caller. This was detected as part of cockroachdb#97991.
Moving the settingswatcher to be the first component initialized in the
sql server caused the TestTenantUnauthenticatedAccess test to fail. Now,
the error is reported to the caller and rebasing cockroachdb#97991 on top of this
change allows the test to pass.

Part of cockroachdb#94843

Release note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 7, 2023
Previously, the settingswatcher was one of the last services initialized
during sql server start up. Now, it is one of the first services to
start up. The settings do not have a well defined value until after
settingwatcher initializes. Some settings, like cluster version, must
have valid values during initialization.

Part of cockroachdb#94843
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 7, 2023
Previously, if the rangfeedcache.Watcher ran into an auth error, it was
never reported to the caller. This was detected as part of cockroachdb#97991.
Moving the settingswatcher to be the first component initialized in the
sql server caused the TestTenantUnauthenticatedAccess test to fail. Now,
the error is reported to the caller and rebasing cockroachdb#97991 on top of this
change allows the test to pass.

Part of cockroachdb#94843

Release note: None
craig bot pushed a commit that referenced this issue Mar 7, 2023
97805: settings: add COCKROACH_IGNORE_CLUSTER_SETTINGS env var r=dt a=dt

This provides an override usable in siturations where the persisted values for cluster settings may be causing a problem such that one cannot even start and interact with the the cluster enough to update them, such as if a given persisted setting causes nodes to crash on start or reject all SQL connections. In such cases, this env var provides a mechanism to start a node which will ignore the persisted values for all cluster settings and operate with just the default values instead, allowing alterations of the persisted values (though those alterations will similarly be ignored).

Release note (ops change): The enviornment variable COCKROACH_IGNORE_CLUSTER_SETTINGS can be used to start a node that ignores all stored cluster setting values in emergencies.
Epic: none.

97948: opt: don't fold sub-operators with null operands during type check r=DrewKimball a=DrewKimball

This commit prevents type-checking from replacing expressions like `NULL = ANY(...)` with `NULL`. This is necessary because in the case when the right operand is an empty array, the result of the expression is `False` instead of `NULL`. It is not always possible to know what the right operand will evaluate to, and constant folding can be handled during normalization anyway.

Fixes #37793

Release note (bug fix): Fixed a long-standing and rare bug in evaluation of `ANY`, `SOME`, and `ALL` sub-operators that would cause expressions like `NULL = ANY(ARRAY[]::INT[])` to return `NULL` instead of `False`.

98111: rangefeedcache: handle auth error in initial scan r=JeffSwenson a=JeffSwenson

Previously, if the rangfeedcache.Watcher ran into an auth error, it was never reported to the caller. This was detected as part of #97991. Moving the settingswatcher to be the first component initialized in the sql server caused the TestTenantUnauthenticatedAccess test to fail. Now, the error is reported to the caller and rebasing #97991 on top of this change allows the test to pass.

Part of #94843

Release note: None

98143: ui: remove timeScale update from fingerprint details link from insight details r=ericharmeling a=ericharmeling

Part of #97952.

This PR removes the setTimeScale call from the StatementDetailsLink
and TransactionDetailsLink functions. Navigating to the Fingerprint Details
pages from the Insights Details pages will no longer set the global time scale.

Loom (CC Console): https://www.loom.com/share/ca388c8c0068446d960f89cc6c1f6af5
Loom (DB Console): https://www.loom.com/share/209f2d21a1784d1d86adffabddb9b47f

Epic: none

Release note: None

Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Drew Kimball <[email protected]>
Co-authored-by: Jeff <[email protected]>
Co-authored-by: Eric Harmeling <[email protected]>
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 13, 2023
Migration sql_instance to a regional by row compatible idnex. The
version gates are intended to follow the protocol discussed in the
comment at the top of upgrades/system_rbr_indexes.go

The crdb_region column ID was changed from 5 to 6 in order to match the
logical order in which the sql_addr and crdb_region columns were added.
The exact ID doesn't really matter in this case since the sql_addr
column was added in v23.1.

Most of the rbr migration work is the same for sqlliveness, lease, and
sql_instances. The main exception to that is the migration cache used by
the sql instance reader. The cache is backed by a range feed and we need
to switch implementations when the Version setting changes.

Part of cockroachdb#94843

Relase note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 13, 2023
Migration sql_instance to a regional by row compatible idnex. The
version gates are intended to follow the protocol discussed in the
comment at the top of upgrades/system_rbr_indexes.go

The crdb_region column ID was changed from 5 to 6 in order to match the
logical order in which the sql_addr and crdb_region columns were added.
The exact ID doesn't really matter in this case since the sql_addr
column was added in v23.1.

Most of the rbr migration work is the same for sqlliveness, lease, and
sql_instances. The main exception to that is the migration cache used by
the sql instance reader. The cache is backed by a range feed and we need
to switch implementations when the Version setting changes.

Part of cockroachdb#94843

Relase note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 13, 2023
Migrate sqlliveness to the regional by row compatible index. The version
gates are intended to follow the protocol discussed in the comment at
the top of upgrades/system_rbr_indexes.go.

As part of this change, the legacy session id encoding is treated as
belonging to the enum.One region. Treating the legacy encoding as
belonging to enum.One simplifies the migration as it is not necessary to
wait for old sessions to expire or clear old sessions from the jobs
table.

Part of cockroachdb#94843

Release note: none
craig bot pushed a commit that referenced this issue Mar 14, 2023
98463: sqlliveness: migrate to rbr compatible index r=JeffSwenson a=JeffSwenson

Migrate sqlliveness to the regional by row compatible index. The version
gates are intended to follow the protocol discussed in the comment at
the top of upgrades/system_rbr_indexes.go.

As part of this change, the legacy session id encoding is treated as
belonging to the enum.One region. Treating the legacy encoding as
belonging to enum.One simplifies the migration as it is not necessary to
wait for old sessions to expire or clear old sessions from the jobs
table.

Part of #94843

Release note: none

98500: sql: disallow creating partial stats by default r=mgartner a=mgartner

This commit adds the `enable_create_stats_using_extremes` session
setting which allows users to run `CREATE STATISTICS .. USING EXTREMES`
when enabled. It is disabled by default.

This commit also presents a "feature not supported" error when a user
tries to create partial statistics with a `WHERE` clause. This feature
has not yet been implemented.

Epic: CRDB-19449

Fixes #95233 

Release note: None


98525: build: update version.txt with the next release r=lidorcarmel a=lidorcarmel

releasing v23.1.0-alpha.6, updating version.txt to alpha.7.

Epic: none

Release note: None

Co-authored-by: Jeff <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Lidor Carmel <[email protected]>
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 14, 2023
Migration sql_instance to a regional by row compatible idnex. The
version gates are intended to follow the protocol discussed in the
comment at the top of upgrades/system_rbr_indexes.go

The crdb_region column ID was changed from 5 to 6 in order to match the
logical order in which the sql_addr and crdb_region columns were added.
The exact ID doesn't really matter in this case since the sql_addr
column was added in v23.1.

Most of the rbr migration work is the same for sqlliveness, lease, and
sql_instances. The main exception to that is the migration cache used by
the sql instance reader. The cache is backed by a range feed and we need
to switch implementations when the Version setting changes.

Part of cockroachdb#94843

Relase note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 14, 2023
Migrate sql_instance to a regional by row compatible index. The version
gates are intended to follow the protocol discussed in the comment at
the top of upgrades/system_rbr_indexes.go

The crdb_region column ID was changed from 5 to 6 in order to match the
logical order in which the sql_addr and crdb_region columns were added.
The exact ID doesn't really matter in this case since the sql_addr
column was added in v23.1.

Most of the rbr migration work is the same for sqlliveness, lease, and
sql_instances. The main exception to that is the migration cache used by
the sql instance reader. The cache is backed by a range feed and we need
to switch implementations when the version setting changes.

Part of cockroachdb#94843

Relase note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 14, 2023
Migrate the system.lease table to an index that is byte compatible with
a regional by row index. The version gates are intended to follow the
protocol discussed in the comment at the top of
upgrades/system_rbr_indexes.go

Writing to the lease table always occurs throught he kv API. The table
is read via the SQL api. The SQL uses of the system.lease table are
forwards compatible as long as they do not need to read the crdb_region.
Queries that interact with individual leases need to retrieve the
crdb_region and must consult the version gate.

Part of cockroachdb#94843

Release Note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 14, 2023
Migrate sql_instance to a regional by row compatible index. The version
gates are intended to follow the protocol discussed in the comment at
the top of upgrades/system_rbr_indexes.go

The crdb_region column ID was changed from 5 to 6 in order to match the
logical order in which the sql_addr and crdb_region columns were added.
The exact ID doesn't really matter in this case since the sql_addr
column was added in v23.1.

Most of the rbr migration work is the same for sqlliveness, lease, and
sql_instances. The main exception to that is the migration cache used by
the sql instance reader. The cache is backed by a range feed and we need
to switch implementations when the version setting changes.

Part of cockroachdb#94843

Relase note: None
craig bot pushed a commit that referenced this issue Mar 14, 2023
98295: c2c: refactor roachtest driver to run and stream arbitrary workloads r=stevendanna a=msbutler

This patch refactors the roachtest driver such that:
1) the streamingWorkload interface can run a custom workload with arbitrary sql
queries.
2) to reduce helper function signature bloat, many helper functions are now
replicationTestSpec methods.
3) the test writer can specity an `additionalDuration` of 0, which allows the
workload to terminate on its own.
4) a health monitor will fail the test if it cannot connect to a node

This patch also adds two new roachtests:
- c2c/BulkOps: runs the backup/mvcc-range-tombstones roachtest on
  the source cluster (without the backup-restore roundtrips for now), and
  streams it to the destination.
- c2c/UnitTest: is quick roachtest that can be used to debug the c2c roachtest
  infrastructure.

Informs #89176

Release note: None

98445: sql_instance: migrate to rbr compatible index r=JeffSwenson a=JeffSwenson

Migrate sql_instance to a regional by row compatible index. The version
gates are intended to follow the protocol discussed in the comment at
the top of upgrades/system_rbr_indexes.go

The crdb_region column ID was changed from 5 to 6 in order to match the
logical order in which the sql_addr and crdb_region columns were added.
The exact ID doesn't really matter in this case since the sql_addr
column was added in v23.1.

Most of the rbr migration work is the same for sqlliveness, lease, and
sql_instances. The main exception to that is the migration cache used by
the sql instance reader. The cache is backed by a range feed and we need
to switch implementations when the version setting changes.

Part of #94843

Relase note: None

Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Jeff <[email protected]>
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 15, 2023
Previously, the v22.2-v23.1 mixed version test was using the v22.2
version gate with a v23.1 schema. This caused the alter_table test to
fail on PR cockroachdb#98446. The lease protocol only works if the version gates
and the schema are in sync.

As part of this change I simplified the version configuration in the
logic tests so that the only version key is the bootstrap version. It
doesn't really make sense to specify a binary version different than the
build version. The binary will always have the build version's behavior.

This change also deletes the local-v1.1-at-v1.0-noupgrade test. The
version can't be described by a version key and the tests do not
represent a situation that occurs in real world deployments.

Part of cockroachdb#94843

Release note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 15, 2023
Migrate the system.lease table to an index that is byte compatible with
a regional by row index. The version gates are intended to follow the
protocol discussed in the comment at the top of
upgrades/system_rbr_indexes.go

Writing to the lease table always occurs throught he kv API. The table
is read via the SQL api. The SQL uses of the system.lease table are
forwards compatible as long as they do not need to read the crdb_region.
Queries that interact with individual leases need to retrieve the
crdb_region and must consult the version gate.

Part of cockroachdb#94843

Release Note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 15, 2023
Previously, the v22.2-v23.1 mixed version test was using the v22.2
version gate with a v23.1 schema. This caused the alter_table test to
fail on PR cockroachdb#98446. The lease protocol only works if the version gates
and the schema are in sync.

As part of this change I simplified the version configuration in the
logic tests so that the only version key is the bootstrap version. It
doesn't really make sense to specify a binary version different than the
build version. The binary will always have the build version's behavior.

This change also deletes the local-v1.1-at-v1.0-noupgrade test. The
version can't be described by a version key and the tests do not
represent a situation that occurs in real world deployments.

Part of cockroachdb#94843

Release note: None
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 15, 2023
Previously, the v22.2-v23.1 mixed version test was using the v22.2
version gate with a v23.1 schema. This caused the alter_table test to
fail on PR cockroachdb#98446. The lease protocol only works if the version gates
and the schema are in sync.

As part of this change I simplified the version configuration in the
logic tests so that the only version key is the bootstrap version. It
doesn't really make sense to specify a binary version different than the
build version. The binary will always have the build version's behavior.

This change also deletes the local-v1.1-at-v1.0-noupgrade test. The
version can't be described by a version key and the tests do not
represent a situation that occurs in real world deployments.

Part of cockroachdb#94843

Release note: None
craig bot pushed a commit that referenced this issue Mar 15, 2023
98695: logictest: use v22_2 schema in mixed version test r=JeffSwenson a=JeffSwenson

Previously, the v22.2-v23.1 mixed version test was using the v22.2 version gate with a v23.1 schema. This caused the alter_table test to fail on PR #98446. The lease protocol only works if the version gates and the schema are in sync.

As part of this change I simplified the version configuration in the logic tests so that the only version key is the bootstrap version. It doesn't really make sense to specify a binary version different than the build version. The binary will always have the build version's behavior.

This change also deletes the local-v1.1-at-v1.0-noupgrade test. The version can't be described by a version key and the tests do not represent a situation that occurs in real world deployments.

Part of #94843

Release note: None

98697: sql: make sure mixed version logic tests are enabled for declarative schema changer enabled DDL in 22.2 r=chengxiong-ruan a=chengxiong-ruan

Informs: #98637

This PR makes sure we enable mixed version logic tests for ddl statements enabled in declarative schema changer in 22.2.

We already have mixed version test coverage for `CREATE INDEX`, `DROP INDEX`, `ADD COLUMN`, `DROP COLUMN` and `ADD CONSTRAINT`.

Each commit turns on mixed version logic test for each of `DROP OWNED BY`, `COMMENT ON` and `ALTER PRIMARY KEY`.

98711: storage,kv: reenable rewrite concurrency and add format error checking r=jbowens a=sumeerbhola

Epic: none

Fixes: #97076

Release note: None

Co-authored-by: Jeff <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Co-authored-by: sumeerbhola <[email protected]>
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Mar 16, 2023
Migrate the system.lease table to an index that is byte compatible with
a regional by row index. The version gates are intended to follow the
protocol discussed in the comment at the top of
upgrades/system_rbr_indexes.go

Writing to the lease table always occurs throught he kv API. The table
is read via the SQL api. The SQL uses of the system.lease table are
forwards compatible as long as they do not need to read the crdb_region.
Queries that interact with individual leases need to retrieve the
crdb_region and must consult the version gate.

Part of cockroachdb#94843

Release Note: None
craig bot pushed a commit that referenced this issue Mar 16, 2023
98446: lease: migrate lease to rbr compatible index r=JeffSwenson a=JeffSwenson

Migrate the system.lease table to an index that is byte compatible with
a regional by row index. The version gates are intended to follow the
protocol discussed in the comment at the top of
upgrades/system_rbr_indexes.go

Writing to the lease table always occurs throught he kv API. The table
is read via the SQL api. The SQL uses of the system.lease table are
forwards compatible as long as they do not need to read the crdb_region.
Queries that interact with individual leases need to retrieve the
crdb_region and must consult the version gate.

Part of #94843

Release Note: None

Co-authored-by: Jeff <[email protected]>
@jeffswenson
Copy link
Collaborator Author

This is complete as of #98446.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multiregion Related to multi-region C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-shared-systems Shared Systems Team
Projects
None yet
Development

No branches or pull requests

1 participant