Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: thread a tenant ID #47903

Closed
tbg opened this issue Apr 22, 2020 · 0 comments · Fixed by #48190
Closed

sql: thread a tenant ID #47903

tbg opened this issue Apr 22, 2020 · 0 comments · Fixed by #48190
Assignees
Labels
A-multitenancy Related to multi-tenancy

Comments

@tbg
Copy link
Member

tbg commented Apr 22, 2020

Thread a tenant ID around according to the requirements here.

@tbg tbg added A-multitenancy Related to multi-tenancy A-multitenancy-blocker labels Apr 22, 2020
craig bot pushed a commit that referenced this issue Apr 28, 2020
47999: keys: introduce tenant keyspaces, generate SQL keys through tenant-bound key generator r=nvanbenschoten a=nvanbenschoten

First step in addressing #47903.

This PR introduces a few new important concepts to the CockroachDB codebase:
- Tenants
- Tenant IDs
- Tenant keyspaces
- The "System" tenant

A TenantID is a unique ID associated with a tenant in a multi-tenant cluster. Each tenant is granted exclusive access to a portion of the keyspace and a collection of SQL tables in that keyspace which comprise a logical cluster. The "system" tenant is the system's internal tenant in a multi-tenant cluster and the only tenant in a single-tenant cluster.

The system tenant differs from all other tenants in three important ways:
1. the system tenant's keyspace is not prefixed with a tenant specifier.
2. the system tenant is created by default during cluster initialization.
2. the system tenant had the ability to create and destroy other tenants.

The PR then combines the concepts introduced in the previous commit with a new tenant-bound key generator structure called the `TenantIDKeyGen`. This key generator is bound to a TenantID on creation and is only capable of producing SQL keys for that tenant.

The change then removes all other forms of SQL key generation. Notably, it removes all free functions in `pkg/keys` that generate SQL keys and redirects all key generation in `pkg/sql/sqlbase/index_encoding.go` through `pkg/keys`. I audited all uses of `encoding.EncodeUvarintAscending` (and `DecodeUvarint64` for C++) in the codebase to get to a point where I feel relatively confident about this. Still, it will be nice to verify this further in the future when we test SQL tenant servers and reject all cross-tenant KV traffic.

With all SQL key generation flowing through the tenant-bound key generators, the change then introduces two global key generators:
- `SystemTenantKeyGen`: a SQL key generator for the system tenant
- `TODOTenantKeyGen`: equivalent to `SystemTenantKeyGen`, but should be used when it is unclear which tenant should be referenced by the surrounding context.

These two key generators are used appropriately. Most production code uses the `TODOTenantKeyGen` for now because it will later be replaced with a real tenant-bound key generator that hangs off of its SQL server. Most testing code uses the `SystemTenantKeyGen`. Once most of the obvious uses of `TODOTenantKeyGen` are addressed in the next PR, I might put it behind a function that takes an "issueNo" to make sure that all remaining uses of it are tracked in Github.

Co-authored-by: Nathan VanBenschoten <[email protected]>
craig bot pushed a commit that referenced this issue Apr 29, 2020
46992: sql: Add Logical Column ID field to ColumnDescriptor r=rohany a=RichardJCai

The LogicalColumnID field mimics the ColumnID field however LogicalColumnID may be swapped
between two columns whereas ColumnID cannot. LogicalColumnID is referenced for virtual tables
(pg_catalog, information_schema) and most notably affects column ordering for SHOW COLUMNS.

This LogicalColumnID field support swapping the order of two columns - currently only used for
ALTER COLUMN TYPE when a shadow column is created and swapped with it's original column.

Does not affect existing behaviour.

Release note: None

47449: cli: add --cert-principal-map to client commands r=petermattis a=petermattis

Add support for the `--cert-principal-map` flag to the certs and client
commands. Anywhere we were accepting the `--certs-dir` flag, we now also
accept the `--cert-principal-map` flag.

Fixes #47300

Release note (cli change): Support the `--cert-principal-map` flag in
the `cert *` and "client" commands such as `sql`.

48138: keys: support splitting Ranges on tenant-id prefixed keys r=nvanbenschoten a=nvanbenschoten

Fixes #48122.
Relates to #47903.
Relates to #48123.

This PR contains a series of small commits that work towards the introduction of tenant-id prefixed keyspaces and begin the removal of some `keys.TODOSQLCodec` instances. This should be the only time we need to touch C++ throughout this work.

48160: storage,libroach: Check for MaxKeys when reading from intent history r=itsbilal a=itsbilal

We weren't checking for MaxKeys (or TargetBytes) being reached
in the case where we read from intent history in the MVCC scanner.
All other cases go through addAndAdvance(), which had these checks.

Almost certainly fixes #46652. Would be very surprised if it was
something else.

Release note (bug fix): Fixes a bug where a read operation in a transaction
would error out for exceeding the maximum count of results returned.

48162: opt: add rule to eliminate Exists when input has zero rows r=rytaft a=rytaft

This commit adds a new rule, `EliminateExistsZeroRows`, which
converts an `Exists` subquery to False when it's known
that the input produces zero rows.

Informs #47058

Release note (performance improvement): The optimizer can now
detect when an Exists subquery can be eliminated because the input
has zero rows. This leads to better plans in some cases.

Co-authored-by: richardjcai <[email protected]>
Co-authored-by: Peter Mattis <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Rebecca Taft <[email protected]>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Apr 30, 2020
Fixes cockroachdb#47903.

Also known as "the grand plumbing", this commit replaces a few instances
of `TODOSQLCodec` in `pkg/sql/sqlbase/index_encoding.go` and watches the
house of cards fall apart. It then glues the world back together, this
time using a properly injected tenant-bound SQLCodec to encode and
decode all SQL table keys.

A tenant ID field is added to `sqlServerArgs`. This is used to construct
a tenant-bound `keys.SQLCodec` during server creation. This codec
morally lives on the `sql.ExecutorConfig`. In practice, it is also
copied onto `tree.EvalContext` and `execinfra.ServerConfig` to help
carry it around. SQL code is adapted to use this codec whenever it
needs to encode or decode keys.

If all tests pass after this refactor, there is a good chance it got
things right. This is because any use of an uninitialized SQLCodec will
panic immediately when the codec is first used. This was helpful in
ensuring that it was properly plumbed everywhere.
craig bot pushed a commit that referenced this issue Apr 30, 2020
48190: sql: inject tenant ID in sqlServerArgs, pass through ExecutorConfig r=nvanbenschoten a=nvanbenschoten

Fixes #47903.
Informs #48123.

Also known as "the grand plumbing", this change replaces a few instances of `TODOSQLCodec` in `pkg/sql/sqlbase/index_encoding.go` and watches the house of cards fall apart. It then glues the world back together, this time using a properly injected tenant-bound SQLCodec to encode and decode all SQL table keys.

A tenant ID field is added to `sqlServerArgs`. This is used to construct a tenant-bound `keys.SQLCodec` during server creation. This codec morally lives on the `sql.ExecutorConfig`. In practice, it is also copied onto `tree.EvalContext` and `execinfra.ServerConfig` to help carry it around. SQL code is adapted to use this codec whenever it needs to encode or decode keys.

If all tests pass after this refactor, there is a good chance it got things right. This is because any use of an uninitialized SQLCodec will panic immediately when the codec is first used. This was helpful in ensuring that it was properly plumbed everywhere.

48245: kv: declare write access to AbortSpan on all aborting EndTxn reqs r=nvanbenschoten a=nvanbenschoten

Fixes #43707.
Fixes #48046.
Fixes #48189.

Part of the change made by #42765 was to clear AbortSpan entries on
non-poisoning, aborting EndTxn requests. Specifically, this change was
made in 1328787. The change forgot to update the corresponding span
declaration logic to reflect the fact that we were now writing to the
AbortSpan in cases where we previously weren't.

This was triggering an assertion in race builds that tried to catch this
kind of undeclared span access. The assertion failure was very rare
because it required the following conditions to all be met:
1. running a test with the race detector enabled
2. a txn (A) must have been aborted by another txn (B)
3. txn B must have cleared an intent on txn A's transaction record range
4. txn A must have noticed and issued a non-poisoning EndTxn(ABORT)

We should backport this when we get a change (once v20.1.0 has
stabilized), but I don't expect that this could actually cause any
issues. The AbortSpan update was strictly a matter of performance and we
should never be racing with another request that is trying to read the
same AbortSpan entry.

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 164f82b Apr 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants