83213: kvserver: make MVCC GC less disruptive to foreground traffic r=aayushshah15 a=aayushshah15

This commit changes GC requests to no longer declare exclusive latches at their BatchRequest's timestamp. This was already incorrect as explained in #55293.

> The first use is broken because we acquire write latches at the batch header's timestamp, which is set to time.Now(), so we're only serializing with reads in the future and all other writes [1]. So we're disruptive to everyone except who we want to serialize with – reads in the past!

This commit makes GC requests only declare a non-mvcc exclusive latch over the `RangeGCThresholdKey`. This is correct because:

```
// 1. We define "correctness" to be the property that a reader reading at /
// around the GC threshold will either see the correct results or receive an
// error.
// 2. Readers perform their command evaluation over a stable snapshot of the
// storage engine. This means that the reader will not see the effects of a
// subsequent GC run as long as it created a Pebble iterator before the GC
// request.
// 3. A reader checks the in-memory GC threshold of a Replica after it has
// created this snapshot (i.e. after a Pebble iterator has been created).
// 4. If the in-memory GC threshold is above the timestamp of the read, the
// reader receives an error. Otherwise, the reader is guaranteed to see a
// state of the storage engine that hasn't been affected by the GC request [5].
// 5. GC requests bump the in-memory GC threshold of a Replica as a pre-apply
// side effect. This means that if a reader checks the in-memory GC threshold
// after it has created a Pebble iterator, it is impossible for the iterator
// to point to a storage engine state that has been affected by the GC
// request.
```

As a result, GC requests should now be much less disruptive to foreground traffic since they're no longer redundantly declaring exclusive latches over global keys.
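
The ordering argument in points 2 through 5 can be illustrated with a toy model. This is only a sketch under stated assumptions: `replica`, `reader`, `newReader`, `checkThreshold`, `gc`, and `errBelowGCThreshold` are hypothetical names invented for illustration, a copied map stands in for a Pebble iterator, and integer timestamps stand in for HLC timestamps. It is not the kvserver implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errBelowGCThreshold is a hypothetical error for reads below the threshold.
var errBelowGCThreshold = errors.New("read timestamp is below the GC threshold")

// replica is a toy model of one key's MVCC history plus the in-memory GC
// threshold. It is not CockroachDB's Replica type.
type replica struct {
	gcThreshold int64
	versions    map[int64]string // timestamp -> value
}

// reader holds a stable copy of the engine state, standing in for a Pebble
// iterator created before the threshold check.
type reader struct {
	rep  *replica
	snap map[int64]string
}

// newReader snapshots the engine state first (points 2 and 3).
func (r *replica) newReader() *reader {
	snap := make(map[int64]string, len(r.versions))
	for ts, v := range r.versions {
		snap[ts] = v
	}
	return &reader{rep: r, snap: snap}
}

// checkThreshold runs only after the snapshot exists (points 3 and 4).
func (rd *reader) checkThreshold(ts int64) error {
	if rd.rep.gcThreshold > ts {
		return errBelowGCThreshold
	}
	return nil
}

// get returns the newest snapshot version at or below ts.
func (rd *reader) get(ts int64) (string, bool) {
	best, found := int64(0), false
	for vts := range rd.snap {
		if vts <= ts && (!found || vts > best) {
			best, found = vts, true
		}
	}
	if !found {
		return "", false
	}
	return rd.snap[best], true
}

// gc bumps the threshold before touching data (point 5), then drops the
// versions that no read at or above the threshold can observe.
func (r *replica) gc(threshold int64) {
	r.gcThreshold = threshold
	newest, found := int64(0), false
	for ts := range r.versions {
		if ts <= threshold && (!found || ts > newest) {
			newest, found = ts, true
		}
	}
	for ts := range r.versions {
		if found && ts < newest {
			delete(r.versions, ts)
		}
	}
}

func main() {
	rep := &replica{versions: map[int64]string{1: "a", 5: "b", 9: "c"}}

	early := rep.newReader()             // snapshot and check both precede GC
	fmt.Println(early.checkThreshold(3)) // <nil>
	late := rep.newReader()              // snapshot precedes GC, check follows it

	rep.gc(6)

	v, _ := early.get(3)
	fmt.Println(v)                      // a (read from the pre-GC snapshot)
	fmt.Println(late.checkThreshold(3)) // read timestamp is below the GC threshold
}
```

In this model a reader either passes the check and reads from a snapshot taken before the GC ran, or fails the check and gets an error. Neither outcome requires the reader and the GC request to serialize on the keys being collected.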

Resolves #55293

Release note (performance improvement): MVCC garbage collection should now be much less disruptive to foreground traffic than before.

85231: backupccl: add RESTORE with schema_only r=dt a=msbutler

Fixes #83470

Release note (sql change): This PR adds the schema_only flag to RESTORE, allowing a user to run a normal RESTORE without restoring any user table data. This can be used to quickly validate that a given backup is restorable. A schema_only restore's runtime is O(# of descriptors), which is a fraction of a regular restore's runtime, O(# of table rows). Note that during a cluster-level schema_only restore, the system tables are read from S3 and written to disk, as this provides important validation coverage without much runtime cost (system tables should not be large).

After running a successful schema_only RESTORE, the user can revert the cluster to its pre-restore state by simply dropping the descriptors the schema_only restore added (e.g. if the user restored a database, they can drop the database after the restore completes). Note that in the cluster-level case, the restored system data cannot be reverted; this shouldn't matter, as the cluster was empty beforehand.

For the backup validation use case, RESTORE with schema_only provides near-total validation coverage. In other words, if a user's schema_only RESTORE works, they can be quite confident that a real RESTORE will work. There's one notable place where schema_only RESTORE lacks coverage: it doesn't read (or write) from any of the SSTs that store backed-up user table data. To ensure a backup's SSTs are where the RESTORE command would expect them to be, a user should run SHOW BACKUP ... with check_files.

Further, in an upcoming patch, another flag for RESTORE validation will be introduced -- the verify_backup_table_data flag -- which extends schema_only functionality to read the table data from S3 and conduct checksums on it.
As with the schema_only flag, no table data will be ingested into the cluster.

85695: colexec: add support for ILIKE and NOT ILIKE r=yuzefovich a=yuzefovich

**colexec: clean up NOT LIKE operator generation**

This commit cleans up the way we generate operators for NOT LIKE. Previously, they would get their own copy, which was exactly the same as for LIKE except for a single line. Now the same underlying operator handles both LIKE and NOT LIKE: the result of the comparison just needs to be negated. The performance hit of this extra boolean comparison is negligible, yet we can remove some of the duplicated generated code.

```
name                                     old time/op  new time/op  delta
LikeOps/selPrefixBytesBytesConstOp-24    17.8µs ± 1%  16.9µs ± 0%  -4.93%  (p=0.000 n=10+10)
LikeOps/selSuffixBytesBytesConstOp-24    18.5µs ± 0%  18.7µs ± 0%  +1.37%  (p=0.000 n=10+10)
LikeOps/selContainsBytesBytesConstOp-24  27.8µs ± 0%  28.0µs ± 0%  +1.02%  (p=0.000 n=9+10)
LikeOps/selRegexpBytesBytesConstOp-24     479µs ± 1%   484µs ± 0%  +1.10%  (p=0.000 n=10+10)
LikeOps/selSkeletonBytesBytesConstOp-24  39.9µs ± 0%  40.3µs ± 0%  +0.85%  (p=0.000 n=10+10)
LikeOps/selRegexpSkeleton-24              871µs ± 2%   871µs ± 0%    ~     (p=1.000 n=10+10)
```

Release note: None

**colexec: add support for ILIKE and NOT ILIKE**

This commit adds native vectorized support for ILIKE and NOT ILIKE comparisons. The idea is simple: convert both the argument and the pattern to upper case. This required minor changes to the templates to add a "prelude" step for that conversion, as well as conversion of the pattern to upper case during planning. Initially, I generated separate operators for the case-insensitive cases, but the benchmarks showed that the performance impact of a single conditional inside of the `for` loop is barely noticeable given that the branch prediction will always be right, so I refactored the existing operators to support case insensitivity.
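
The two ideas above (one operator with a negation flag for LIKE and NOT LIKE, and upper-casing both sides for ILIKE) can be sketched as follows. This is a minimal illustration for the prefix-pattern case only, assuming hypothetical names (`prefixLikeOp`, `newPrefixLikeOp`, `selectIdx`). The actual operators are generated from templates and work on vectorized batches, which this sketch does not attempt to reproduce.

```go
package main

import (
	"bytes"
	"fmt"
)

// prefixLikeOp is a hypothetical, simplified selection operator for patterns
// of the form 'prefix%'. It is not the generated colexec operator.
type prefixLikeOp struct {
	prefix          []byte
	negate          bool // true for NOT LIKE and NOT ILIKE
	caseInsensitive bool // true for ILIKE and NOT ILIKE
}

// newPrefixLikeOp upper-cases the pattern once, standing in for the
// conversion done during planning.
func newPrefixLikeOp(prefix string, negate, caseInsensitive bool) prefixLikeOp {
	p := []byte(prefix)
	if caseInsensitive {
		p = bytes.ToUpper(p)
	}
	return prefixLikeOp{prefix: p, negate: negate, caseInsensitive: caseInsensitive}
}

// selectIdx returns the indices of the rows that pass the filter.
func (o prefixLikeOp) selectIdx(col [][]byte) []int {
	var sel []int
	for i, arg := range col {
		if o.caseInsensitive {
			// "Prelude" step: the same conditional is taken for every row,
			// so it should be cheap for the branch predictor.
			arg = bytes.ToUpper(arg)
		}
		// A single comparison serves both forms: the match result is
		// negated when the operator was built for the NOT variant.
		if bytes.HasPrefix(arg, o.prefix) != o.negate {
			sel = append(sel, i)
		}
	}
	return sel
}

func main() {
	col := [][]byte{[]byte("apple"), []byte("Apricot"), []byte("banana")}

	fmt.Println(newPrefixLikeOp("ap", false, false).selectIdx(col)) // LIKE 'ap%':      [0]
	fmt.Println(newPrefixLikeOp("ap", true, false).selectIdx(col))  // NOT LIKE 'ap%':  [1 2]
	fmt.Println(newPrefixLikeOp("ap", false, true).selectIdx(col))  // ILIKE 'ap%':     [0 1]
	fmt.Println(newPrefixLikeOp("ap", true, true).selectIdx(col))   // NOT ILIKE 'ap%': [2]
}
```

Note that `bytes.ToUpper` allocates a copy per row in this sketch; whether the real operators avoid that allocation is not stated in the commit message.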

```
name                                     old time/op  new time/op  delta
LikeOps/selPrefixBytesBytesConstOp-24    16.8µs ± 0%  17.7µs ± 0%  +5.30%  (p=0.000 n=10+10)
LikeOps/selSuffixBytesBytesConstOp-24    18.7µs ± 0%  19.2µs ± 0%  +2.99%  (p=0.000 n=10+10)
LikeOps/selContainsBytesBytesConstOp-24  28.0µs ± 0%  27.8µs ± 0%  -0.73%  (p=0.000 n=10+10)
LikeOps/selRegexpBytesBytesConstOp-24     479µs ± 0%   480µs ± 0%  +0.33%  (p=0.008 n=9+10)
LikeOps/selSkeletonBytesBytesConstOp-24  40.2µs ± 0%  41.4µs ± 0%  +3.20%  (p=0.000 n=9+10)
LikeOps/selRegexpSkeleton-24              860µs ± 0%   857µs ± 0%  -0.36%  (p=0.023 n=10+10)
```

Addresses: #49781.

Release note (performance improvement): ILIKE and NOT ILIKE filters can now be evaluated more efficiently in some cases.

85731: rowexec: allow ordered joinReader to stream matches to the first row r=DrewKimball a=DrewKimball

Currently the `joinReaderOrderingStrategy` implementation buffers all looked up rows before matching them with input rows and emitting them. This is necessary because the looked up rows may not be received in input order (which must be maintained). However, rows that match the first input row can be emitted immediately. When many rows match the first input row, this can decrease the overhead of the buffer. Additionally, this change can allow a limit to be satisfied earlier, which can significantly decrease latency. This is especially advantageous when there is only one input row, since all lookups can then be rendered and returned in streaming fashion.

Release note (performance improvement): The execution engine can now short-circuit execution of lookup joins in more cases, which can decrease latency for queries with limits.

85809: ui: fix time window selection with mouse on Metrics charts r=koorosh a=koorosh

This patch fixes an issue that prevents proper time selection with the mouse on Metrics charts. The root cause is that the updated time scale object didn't include the correct value of `windowSize`, which is used to calculate the `start` position of the time range.

Release note (ui change): Fixed an issue with the incorrect start time position of the selected time range on the Metrics page.

Resolves: #84001

Co-authored-by: Aayush Shah <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: DrewKimball <[email protected]>
Co-authored-by: Andrii Vorobiov <[email protected]>