Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Browse files
Browse the repository at this point in the history
110870: sql: anonymize function names for error reporting and telemetry r=rharding6373 a=rharding6373 Previously, we did not anonymize/redact function names because we only supported built-in, CRDB-named functions. However, since adding support for UDFs, function names may be user-provided. We need to omit user-provided names in error reporting and telemetry. We fix this oversight in this PR by no longer overriding the anonymizing format flag and redaction flag for FuncExprs. Unfortunately, since any functions in a query are not always resolved when the error occurs, we may also redact built-in function names. Epic: None Fixes: None Release note (sql change): Anonymizes function names in error reporting and telemetry so that user-provided names for UDFs are not included. 111447: kvcoord: provide local node ID to DistSender in shared-process mode r=yuzefovich a=yuzefovich This commit makes it so that the DistSender of SQL pods running in shared-process mode have access to the node ID of the local KV node. The node ID is used to preferentially route requests to replicas available on the local node. Previously, this optimization was only available in single-tenant mode, and now it'll be also available in shared-process multi-tenant. The optimization doesn't exist in separate-process multi-tenant, so in that mode we continue to return `0` as the node ID which makes it so that replicas are picked randomly. For testing, this commit extends existing `TestSecondaryTenantFollowerReadsRouting` to also have a shared-process test configuration. There were a few gotchas in that adjustment: - for simplicity, we need to create a single region cluster so that latency and locality don't come into consideration when choosing the replica (we want to make sure that the local replica is used if available) - starting shared-process virtual cluster on a multi node cluster can be racy if we want to provide testing knobs, so we have to start the service on the gateway node first. (If we don't start with the gateway, then the server controller on the gateway could start the tenant server with empty knobs _before_ we explicitly start the service there.) - previously the test relied on retrieving the follower reads metric counters from all nodes and asserting only for `n2` it increased. However, in shared-process config it is now possible for an internal query (e.g. issued by auto stats collection) to be served via follower reads too, which would increment the counter confusing the test. As a result, this commit adjusts the test to audit the trace of the query and verify that the single follower read was served specifically by `n2`. Fixes: #109522. Release note: None 111798: cli,server: allow user to specify port range for secondary VCs r=knz a=stevendanna Previously, the user could set a port range offset that would instruct the tenant controller to choose a gRPC port starting at the KV server's RPC port + the offset + an index. While this provides the user some predictability, there is no upper bound on the port that could be picked. Here, we add a new option --secondary-tenant-port-range That allows the user to specify an explicit range. Fixes #111848. Epic: CRDB-26691 Release note: None 111822: sql: fast path uniqueness checks for single-row insert r=mgartner a=msirek This adds support for building and executing simple INSERT statements with a single-row VALUES clause where any required uniqueness constraint checks can be handled via a constrained scan on an index. This includes INSERT cases such as: - a single-row VALUES clause into a REGIONAL BY ROW table with a PRIMARY KEY which has a UUID column generated by default, ie. `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`, where the crdb_region column is not specified in the VALUES clause; either the gateway region is used or it is computed based on other column values. - a single-row VALUES clause into a REGIONAL BY ROW table with a hash sharded unique index where the crdb_region column is not specified in the VALUES clause In optbuild, when creating a uniqueness check for rows which are added to a table, a fast path index check relation is also built when the mutation source is a single-row values expression or WithScan from a single-row values expression. That relation is a filtered Select of a Scan from the target table, where the filters equate all of the unique check columns with their corresponding constants or placeholders from the Values expression. If there is a uniqueness check with a partial index predicate, fast path is disallowed. A new exploration rule called InsertFastPath is added to walk the memo group members created during exploration in `FastPathUniqueChecks` of the `InsertExpr`, to find any which have been rewritten as a constrained `ScanExpr`. If found, that means that Scan fully represents the lookup needed to check for duplicate entries and the Scan constraint can be used to identify the constants to use in a KV lookup on the scanned index in a fast path check. Function CanUseUniqueChecksForInsertFastPath walks the expressions generated during exploration of the `FastPathUniqueChecks.Check` relation. If a constrained scan is found, it is used to build elements of the `FastPathUniqueChecksItemPrivate` structure to communicate to the execbuilder the table and index to use for the check, and the column ids in the insert row to use for building the fast path KV request. In addition, a new `DatumsFromConstraint` field is added, consisting of a ScalarListExpr of TupleExprs specifying the index key, which allows an index key column to be matched with multiple Datums. One insert row may result in more than one KV lookup for a given uniqueness constraint. These items are used to build the `InsertFastPathFKUniqCheck` structure in the execbuilder. The new `FastPathUniqueChecksItemPrivate` is built into a new the corresponding `FastPathUniqueChecksItem`s of a new `FastPathUniqueChecksExpr` and communicated to the caller via return value `newFastPathUniqueChecks`. A small adjustment is made in the coster to make the fast path unique constraint slightly cheaper, so it should always be chosen over the original non-fast path check. Epic: CRDB-26290 Fixes: #58047 Release note (performance improvement): This patch adds support for insert fast-path uniqueness checks on REGIONAL BY ROW tables where the source is a VALUES clause with a single row. This results in a reduction in latency for single-row inserts to REGIONAL BY ROW tables and hash-sharded REGIONAL BY ROW tables with unique indexes. ---- optbuilder: build insert fast-path uniq checks only if all checks can be built This commit avoids unnecessary processing of insert fast path uniqueness checks by modifying the optbuilder to avoid building a `FastPathUniqueChecksExpr` in the Insert expression if one or more of the required `FastPathUniqueChecksItem`s could not be built. Epic: CRDB-26290 Informs: #58047 Release note: None 112051: bulk: disable as-we-fill scatters for PCR r=stevendanna a=adityamaru In #111952 we saw log lines indicating that the SST batcher was attempting to scatter ranges that were bigger than our "max scatter size". This is likely because of the out-of-order nature of PCR's ingestion which means that the batcher is not guaranteed to have an empty RHS when it splits a soon-to-be overfull range. This change disable the as-we-fill scatters for PCR similar to how we disable them for restore. This is okay since the process already pre-splits and scatters the ranges into which the processors will ingest data. Informs: #111952 Release note: None Co-authored-by: rharding6373 <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Steven Danna <[email protected]> Co-authored-by: Mark Sirek <[email protected]> Co-authored-by: adityamaru <[email protected]>
- Loading branch information