Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: avoid tenant ID reuse #100613

Closed
wants to merge 2 commits into from

Conversation

knz
Copy link
Contributor

@knz knz commented Apr 4, 2023

Prior to this patch, DROP TENANT would delete rows from system.tenants and CREATE TENANT would reuse IDs not already in the table.

Tenant ID reuse is known to incur correctness problems in the following two cases:

  • we do not yet wait for server shutdown when the service mode is changed to 'NONE'. This makes it possible to start dropping a tenant and then reusing its ID before servers from the previous record have fully shut down. This creates a possibility for a past server to start serving traffic for a freshly created tenant, which is unacceptable.

  • we have a cache of tenant capabilites on every node. We do not yet implement an invalidation protocol to clear that cache upon tenant deletion. This makes it possible for requests from a new tenant to be authorized using the capabilities of a previous tenant, which is also unacceptable.

Until we implement the necessary synchronization, we can avoid the correctness issues by preventing ID reuse.

This patch achieves this -- by preventing the deletion of rows in system.tenants.

Release note: None
Epic: CRDB-23559
Fixes #100615.

@knz knz requested review from dt and ecwall April 4, 2023 17:56
@knz knz requested a review from a team as a code owner April 4, 2023 17:56
@blathers-crl
Copy link

blathers-crl bot commented Apr 4, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @knz)


pkg/multitenant/mtinfopb/info.go line 100 at r1 (raw file):

	"ready":   DataStateReady,
	"drop":    DataStateDrop,
	"deleted": DataStateDeleted,

Are drop(ped) and delete(d) synonyms here? I think the name is confusing if DataStateDrop means that the drop/delete process started and DataStateDeleted means that the drop/delete process finished.

Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @ecwall)


pkg/multitenant/mtinfopb/info.go line 100 at r1 (raw file):

Previously, ecwall (Evan Wall) wrote…

Are drop(ped) and delete(d) synonyms here? I think the name is confusing if DataStateDrop means that the drop/delete process started and DataStateDeleted means that the drop/delete process finished.

yes "drop" means the data is in the process of being deleted; whereas "deleted" means this has completed. Feel free to propose better names.

Copy link
Contributor

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @knz)


pkg/multitenant/mtinfopb/info.go line 100 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

yes "drop" means the data is in the process of being deleted; whereas "deleted" means this has completed. Feel free to propose better names.

I don't care if we go with drop or deleted, but I think both should use the same term and the difference between the two should indicate if it is started or completed:

"dropping":    DataStateDropping,
"dropped":    DataStateDropped

or more explicitly

"drop started":    DataStateDropStarted,
"drop completed":    DataStateDropCompleted,

This specific data state for secondary tenants indicate that the
tenant keyspace is queued for asynchronous deletion by a GC jobs;
however, it may not have been deleted yet.

Therefore, the visual representation of that state is best named
"dropping" instead of "drop".

Release note: None
Prior to this patch, `DROP TENANT` would delete rows from
`system.tenants` and `CREATE TENANT` would reuse IDs not already in
the table.

Tenant ID reuse is known to incur correctness problems in the
following two cases:

- we do not yet wait for server shutdown when the service mode
  is changed to 'NONE'. This makes it possible to start
  dropping a tenant and then reusing its ID before servers
  from the previous record have fully shut down. This creates
  a possibility for a past server to start serving traffic
  for a freshly created tenant, which is unacceptable.

- we have a cache of tenant capabilites on every node. We do not yet
  implement an invalidation protocol to clear that cache upon tenant
  deletion. This makes it possible for requests from a new tenant
  to be authorized using the capabilities of a previous tenant, which
  is also unacceptable.

Until we implement the necessary synchronization, we can avoid the
correctness issues by preventing ID reuse.

This patch achieves this -- by preventing the deletion of rows in
system.tenants.

Release note: None
@knz knz force-pushed the 20230404-disable-drop-tenant branch from db8ff21 to 15e46cd Compare April 12, 2023 12:59
@knz
Copy link
Contributor Author

knz commented Apr 12, 2023

Replaced with #101322

@knz knz closed this Apr 12, 2023
craig bot pushed a commit that referenced this pull request Apr 12, 2023
101266: sqlsmith: add params and return types to udfs r=rharding6373 a=rharding6373

This PR adds up to 9 parameters of any type. It also adds return types besides records and ensures that the last statement in the UDF body returns that type.

It also disables CTEs while generating the statements for the UDF body, since CTEs are not currently supported.

Epic: CRDB-20370
Informs: #90782

Release note: None

101303: multitenant: display the dropping state as "dropping" not "drop" r=ecwall a=knz

Requested by `@ecwall` [here](#100613 (review)).
Needed for #100613.

This specific data state for secondary tenants indicate that the tenant keyspace is queued for asynchronous deletion by a GC jobs; however, it may not have been deleted yet.

Therefore, the visual representation of that state is best named "dropping" instead of "drop".

Release note: None
Epic: CRDB-23559

Co-authored-by: rharding6373 <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
craig bot pushed a commit that referenced this pull request Apr 19, 2023
101322: sql: avoid tenant ID reuse r=stevendanna a=knz

As per [internal conversation](https://cockroachlabs.slack.com/archives/C02HWA24541/p1680631136839209).

We really want to use different tenant IDs every time, to avoid tenant
ID reuse. For this, we use a new sequence `system.tenant_id_seq`.

However, there are two obstacles that prevent us from using only the
sequence.

The first is that the sequence is added in a migration and at the
point this function is called the migration may not have been
run yet.

Separately, we also have the function
`crdb_internal.create_tenant()` which can define an arbitrary tenant
ID.

So we proceed as follows:
- we find the maximum tenant ID that has been used so far. This covers
  cases where the migration has not been run yet and also arbitrary
  ID selection by create_tenant().
- we also find the next value of the sequence, if it exists already.
- we take the maximum of the two values (plus one) as the next ID.
- we also update the sequence with the new value we've chosen.


Fixes #100615.
Replaces #100613 if successful.
Epic: CRDB-23559

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: avoid tenant ID reuse
3 participants