Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql,kv: the builtin list_sql_keys_in_range emits requests outside of the tenant keyspace #95006

Closed
knz opened this issue Jan 10, 2023 · 0 comments · Fixed by #95100
Closed

sql,kv: the builtin list_sql_keys_in_range emits requests outside of the tenant keyspace #95006

knz opened this issue Jan 10, 2023 · 0 comments · Fixed by #95100
Assignees
Labels
A-kv-observability A-multitenancy Related to multi-tenancy A-sql-builtins SQL built-in functions and semantics thereof. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@knz
Copy link
Contributor

knz commented Jan 10, 2023

Describe the problem

The builtin function crdb_internal.list_sql_keys_in_range() currently errors out in secondary tenans

ERROR: RangeIterator failed to seek to /Meta2/"\x00": rpc error: code = Unauthenticated desc = requested key /Meta2/"\x00" not fully contained in tenant keyspace /Tenant/1{2-3}

Expected behavior

  • for ranges that contain the requesting tenant's data, the builtin should work and return results without errors.
  • for ranges that don't contain the requesting tenant's data, we should have a check:
    • if the tenant is the system tenant, allow the request to go through
    • otherwise return a clear error

Jira issue: CRDB-23269

Epic CRDB-23344

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-builtins SQL built-in functions and semantics thereof. A-multitenancy Related to multi-tenancy A-kv-observability T-kv-observability labels Jan 10, 2023
@ecwall ecwall self-assigned this Jan 11, 2023
@ecwall ecwall added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Jan 11, 2023
craig bot pushed a commit that referenced this issue Jan 19, 2023
94355: pgwire: support decoding binary regXXX types r=rafiss a=jordanlewis

Fixes #94353

Epic: None

Release note (bug fix): support receiving regXXX type values in binary extended protocol

94372: roachtest: change fingerprinting builtin in c2c roachtest r=msbutler,stevendanna a=adityamaru

If no rangekeys are encountered during
fingerprinting the underlying SSTWriter does not write
any data to the SST, but still produces a non-empty
SST file. This file contains an empty data block, properties and
a footer that amount to 795 bytes. Since the file does
not contain any rangekeys it is going to be thrown away
on the client side, and so ferrying this file in the
ExportResponse is wasteful.

Experiments on the c2c/kv0 roachtest showed that fingerprinting
a single range ~250MB would result in 600k+ empty files. This magnitude
of pagination was all attributable to the elastic CPU limiter that
preempts an ExportRequest if it is consuming more CPU than it
was allotted. 600k+ empty files meant that we were buffering close
to 500MB of ExportResponse_Files (600k * 790bytes) on the node
serving the ExportRequest. This was then shipped over grpc to the
client where we were opening a multi-mem iterator over these 600k+ empty
files causing a node with 15GiB RAM to OOM.

Since fingerprinting does not set a `TargetBytes` on the export requests
it sends, in a cluster with more than one ranage there could be several
requests concurrently buffering these empty files, resulting in an even
more pronounced memory blowup.

This change ensures that the MemFile is nil'ed out if there
are no rangekeys encountered during the fingerprinting.

This change also returns nil values for the `BulkOpSummary`
and `Span` field of the `ExportResponse` when generated as
part of fingerprinting. These fields are not used and show
up in memory profiles of the c2c/kv0 roachtest.

Release note: None

Informs: #93078

95100: multitenant: support crdb_internal.list_sql_keys_in_range() for secondary tenants r=knz,arulajmani a=ecwall

Fixes #95006

The crdb_internal.list_sql_keys_in_range builtin now uses the RangeDescIterator
and scopes the range span by the tenant span to prevent these errors:
```
ERROR: RangeIterator failed to seek to /Meta2/"\x00": rpc error: code = Unauthenticated desc = requested key /Meta2/"\x00" not fully contained in tenant keyspace /Tenant/1{2-3}
```
    
----

Fixes #92072

The last range of the last tenant's end key is `/Max` which is outside of the
tenant's key space.

This is fixed by adding a split at the end of the tenant's key space when the
tenant is created. For example, when tenant 10 is created, a split is added at
/tenant/11.

Release note: None

95276: *: stop swallowing errors from privilege checks r=ecwall a=rafiss

sql: add HasPrivilege and HasAnyPrivilege to AuthorizationAccessor
*: stop swallowing errors from CheckPrivilegeForUser
*: stop swallowing errors from CheckAnyPrivilege
*: stop swallowing errors from CheckPrivilege
server: don't swallow error in hasGlobalPrivilege

Instead of swallowing errors, we use the new Has*Privilege functions. If
there was an error that causes the transaction to abort, it's important
to propagate it.

Epic: None
Release note: None

95482: sql: fix a race with finishing a span and executing RPCs r=yuzefovich a=yuzefovich

This commit fixes a race that was introduced when we fully parallelized the execution of the setup of the remote flows. In particular, it became possible for a `setup-flow-async` "runner" tracing span be finished before a concurrent `SetupFlow` RPC is issued, which uses that span as a parent for the RPC span. The issue is only present in the case that we didn't spin up a "listener" goroutine for whatever reason (most likely because the server is quescing), in which case we didn't block to wait for the concurrent RPCs to be evaluated. We canceled the context of those RPCs, but it appears that the context cancellation is not immediately noticed by the RPC call (or at least it is noticed _after_ the gRPC interceptor code attempt to create a new tracing span). This is now fixed by blocking the cleanup until all concurrently-issued RPCs are completed.

Fixes: #92809.

Release note: None

Co-authored-by: Jordan Lewis <[email protected]>
Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig craig bot closed this as completed in 5cce084 Jan 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-observability A-multitenancy Related to multi-tenancy A-sql-builtins SQL built-in functions and semantics thereof. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants