-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: Add API to allow for split/scatter behavior in multi-tenant environments #74389
Labels
A-kv-distribution
Relating to rebalancing and leasing.
A-multitenancy
Related to multi-tenancy
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Comments
ajstorm
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
A-kv-distribution
Relating to rebalancing and leasing.
A-multitenancy
Related to multi-tenancy
labels
Jan 3, 2022
shralex
added a commit
to shralex/cockroach
that referenced
this issue
Jan 16, 2022
This adds support for AdminSplit and AdminScatter for secondary tenants. This API allows indicating to KV that more data will be ingested, and so the range should be split and re-distributed across the cluster. This API will not be exposed through SQL, and in the future we might change it to give KV more control over whether and how to deal with expected ingest load. More discussion can be found in the github issue: cockroachdb#74389 and Epic: https://cockroachlabs.atlassian.net/browse/CRDB-10720 Release Note: None Release note (<category, see below>): <what> <show> <why>
shralex
added a commit
to shralex/cockroach
that referenced
this issue
Jan 18, 2022
This adds support for AdminSplit and AdminScatter for secondary tenants. This API allows indicating to KV that more data will be ingested, and so the range should be split and re-distributed across the cluster. This API will not be exposed through SQL, and in the future we might change it to give KV more control over whether and how to deal with expected ingest load. More discussion can be found in the github issue: cockroachdb#74389 and Epic: https://cockroachlabs.atlassian.net/browse/CRDB-10720 Release note: None
craig bot
pushed a commit
that referenced
this issue
Jan 19, 2022
74706: sql: add NOSQLLOGIN role which restricts SQL CLI only r=rafiss,kylepatron-cockroachlabs a=dhartunian Previously, in order to restrict user login ability, the `LOGIN` and `NOLOGIN` role options were available which would restrict both SQL and DB Console login ability. This change adds the `NOSQLLOGIN` (and its inverse: `SQLLOGIN`) role option in order to provide the ability to disable SQL CLI logins from users while retaining DB Console login ability. Resolves #74482 Release note (sql change): A new role option is now available, `NOSQLLOGIN` (and its inverse `SQLLOGIN`), which restricts SQL CLI login ability for a user while retaining their ability to login to the DB Console (as opposed to `NOLOGIN` which restricts both SQL and DB Console). Without any role options all login behavior remains permitted as it does today. OIDC logins to the DB Console continue to be permitted with `NOSQLLOGIN` set. 74835: kvserver: add AdminSplit and AdminScatter to secondary tenants API r=shralex a=shralex This adds support for AdminSplit and AdminScatter for secondary tenants. This API allows indicating to KV that more data will be ingested, and so the range should be split and re-distributed across the cluster. This API will not be exposed through SQL, and in the future we might change it to give KV more control over whether and how to deal with expected ingest load. More discussion can be found in the github issue: #74389 and Epic: https://cockroachlabs.atlassian.net/browse/CRDB-10720 Release Note: None 74922: sql: clean up mutable not-null columns hack r=RaduBerinde a=RaduBerinde Mutation columns in some cases need to be scanned even if they haven't been backfilled yet, which means that we may retrieve NULL values even if they are marked as not-nullable. We currently have a hack in the table descriptor which changes the nullable flags in the column descriptors when `ReadableColumns()` is used. It is very surprising that we can get different descriptors for a given ColumnID depending if we look for it in `ReadableColumns()` or in `AllColumns()` (e.g. via FindColumnWithID). This commit cleans this up, changing the scanning code to check for `Public()` instead. Release note: None Co-authored-by: David Hartunian <[email protected]> Co-authored-by: shralex <[email protected]> Co-authored-by: Radu Berinde <[email protected]>
gtr
pushed a commit
to gtr/cockroach
that referenced
this issue
Jan 24, 2022
This adds support for AdminSplit and AdminScatter for secondary tenants. This API allows indicating to KV that more data will be ingested, and so the range should be split and re-distributed across the cluster. This API will not be exposed through SQL, and in the future we might change it to give KV more control over whether and how to deal with expected ingest load. More discussion can be found in the github issue: cockroachdb#74389 and Epic: https://cockroachlabs.atlassian.net/browse/CRDB-10720 Release note: None
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-kv-distribution
Relating to rebalancing and leasing.
A-multitenancy
Related to multi-tenancy
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Bulk jobs currently send AdminSplit and AdminScatter requests to KV when it thinks it has sent a lot of data to one range, before it keeps sending, to give kv a chance to make a new empty range and move it somewhere. Perhaps most importantly, it gives kv a chance to split that new range off and move it to a new node before the bulk job proceeds to go and send it a bunch more data, that would then all need to be rebalanced over if we instead waited for the range / allocator to decide it is over-full and split and move on its own. If we can get it split and moved before we fill it, it's much cheaper.
But we don't do any of this when a tenant is the one generating and sending data. Of course, deciding a range is too full and should be split, or a node has too much on it and should rebalance replicas away, is obviously kv/allocator work, but the bulk job has information that the range and allocator doesn't here, which is that it is going to be sending more data so it might be advantageous to split and rebalance it now, while it is cheap.
One idea that was proposed was to create an advisory request that bulk could send to KV to indicate when a proactive split may be beneficial. The thinking is that if bulk could send some indication that it's about to send a large number of sequential keys to a given sst, KV could use that info to determine if/when to split. See parent Epic for the full discussion.
Initially, for 22.1, however we want to just open the existing AdminSplit and AdminScatter APIs to tenants, while still blocking them at the SQL statement level, and add monitoring on number of spans per tenant, so existing code will work with party between tenant and dedicated clusters.
Epic: CRDB-10720
The text was updated successfully, but these errors were encountered: