Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: hardcode descriptors in rbr migration #99074

Closed
jeffswenson opened this issue Mar 20, 2023 · 0 comments · Fixed by #99172
Closed

server: hardcode descriptors in rbr migration #99074

jeffswenson opened this issue Mar 20, 2023 · 0 comments · Fixed by #99172
Assignees
Labels
A-multiregion Related to multi-region C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-shared-systems Shared Systems Team

Comments

@jeffswenson
Copy link
Collaborator

jeffswenson commented Mar 20, 2023

Once the beta version branch is cut, we need to hardcode the system descriptors in the system_rbr_indexes migration. Otherwise the behavior of the migrations may change if the system.sqlliveness, system.sql_instances, or system.lease tables are modified.

The migrations were added as part of ##94843.

Epic: CRDB-18596

Jira issue: CRDB-25690

@jeffswenson jeffswenson added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 20, 2023
@jeffswenson jeffswenson self-assigned this Mar 20, 2023
@jeffswenson jeffswenson added A-multiregion Related to multi-region T-shared-systems Shared Systems Team labels Mar 20, 2023
craig bot pushed a commit that referenced this issue Mar 22, 2023
…99239 #99263 #99278

98980: kvcoord: Add metric to keep track of restarted ranges in rangefeed r=miretskiy a=miretskiy

Add a `distsender.rangefeed.restart_ranges` metric to keep track of the number of ranges restarted due to transient error.

Epic: CRDB-25044
Release note: None

99069: storage/cloud: correct the flag name in implicit credentials error message r=rhu713 a=taroface

When `--external-io-disable-implicit-credentials` is set and the user issues a command with `AUTH=implicit`, the resulting error message has the wrong flag name (`disable` is left out). Searching for that flag name in the docs doesn't return any results. The flag name is corrected in this PR.

Release note: none
Release justification: CLI bug

99077: changefeedccl: Allow timeout override r=miretskiy a=miretskiy

Add timeout URL parameter for schema registry URIs. Prior to this change, all schema registry calls used default time out of 3 seconds.  This PR increases the timeout to 30 seconds, and allows timeout to be specified via `timeout=T` URL parameter.

Informs https://github.com/cockroachlabs/support/issues/2173

Release note (enterprise change): AVRO schema registry URI allow additional `timeout=T` query parameter to change the default timeout for contacting schema registry.

99141: storage: CheckSSTConflict fix for nexting over overlapping points r=jbowens a=itsbilal

Previously, the nexting logic around both iterators being at a range key and not a point key was flawed in that we'd miss ext points that were in between
the current and next sst keys, when we'd next both of them. This change addresses that.

It also addresses other miscellaneous corner cases around stats calculations with overlapping sst/engine range keys and point keys. All these bugs were found with the upcoming CheckSSTConflicts randomized test in #98408.

Epic: none

Release note: None

99146: opt: speed up lookup constraint builder r=mgartner a=mgartner

#### opt: add benchmark with many lookup joins

This commit adds an optimizer benchmark that explores many lookup joins.
It explores many potential lookup joins that do not ultimately get added
to the memo, as well as many lookup joins that do get added to the memo.

Release note: None

#### opt: split HasSingleColumnConstValues into two functions

This commit splits HasSingleColumnConstValues into two functions - one
that returns a boolean if a constraint set constrains a single column to
a set of constant, non-null values, and another function that returns
the constant values. The former is more efficient when the only the
boolean is needed.

Release note: None

#### opt: simplify lookup join constraint builder

This commit reduces computation and allocations when attempting to
build lookup join constraints by performing a simple column ID equality
before more complex computations and allocations.

Release note: None

#### opt: reduce allocations when building lookup join constraints

During the construction of lookup join constraints, two allocations of a
`opt.ColList` have been combined into a single allocation, and
allocation of a `memo.FiltersExpr` to store remaining filters is now
only performed if necessary.

Release note: None

These changes offer a nice speedup for the newly added benchmark:

```
name                         old time/op    new time/op    delta
SlowQueries/slow-query-1-10    15.8ms ± 1%    15.7ms ± 1%     ~     (p=0.690 n=5+5)
SlowQueries/slow-query-2-10     220ms ± 0%     219ms ± 0%     ~     (p=0.095 n=5+5)
SlowQueries/slow-query-3-10    63.0ms ± 1%    62.4ms ± 0%   -0.98%  (p=0.008 n=5+5)
SlowQueries/slow-query-4-10     1.70s ± 1%     1.38s ± 0%  -19.22%  (p=0.008 n=5+5)

name                         old alloc/op   new alloc/op   delta
SlowQueries/slow-query-1-10    7.04MB ± 0%    6.98MB ± 0%   -0.79%  (p=0.008 n=5+5)
SlowQueries/slow-query-2-10    48.7MB ± 0%    48.7MB ± 0%   -0.11%  (p=0.008 n=5+5)
SlowQueries/slow-query-3-10    45.1MB ± 0%    44.9MB ± 0%   -0.55%  (p=0.008 n=5+5)
SlowQueries/slow-query-4-10     878MB ± 0%     737MB ± 0%  -16.03%  (p=0.008 n=5+5)

name                         old allocs/op  new allocs/op  delta
SlowQueries/slow-query-1-10     76.1k ± 0%     75.8k ± 0%   -0.38%  (p=0.008 n=5+5)
SlowQueries/slow-query-2-10      401k ± 0%      400k ± 0%   -0.25%  (p=0.008 n=5+5)
SlowQueries/slow-query-3-10      390k ± 0%      389k ± 0%   -0.21%  (p=0.008 n=5+5)
SlowQueries/slow-query-4-10     18.2M ± 0%     17.4M ± 0%   -4.44%  (p=0.008 n=5+5)
```

Epic: None


99154: ui: stop polling in stmt fingerprint details page, change default sort on stmts r=maryliag a=xinhaoz

See individual commits.

https://www.loom.com/share/17569db4a0c04a968dabbc4421d429bf

99169: kv: unflake TestDelegateSnapshot r=kvoli a=andrewbaptist

Fixes: #96841
Fixes: #96525

Previously this test would assume that all snapshots came from the sending of snapshots through the AdminChangeReplicasRequest which end up as type OTHER. However occassionally we get a spurious raft snapshot which makes this test flaky. This change ignores any raft snapshots that are sent.

Epic: none
Release note: None

99172: upgrades: hardcode descriptors in system_rbr_indexes r=JeffSwenson a=JeffSwenson

Previously, if a change was made to the system.sql_instances, system.lease, or system.sqlliveness bootstrap schema, it would change the behavior of the upgrade attached to the V23_1_SystemRbrReadNew version gate.

Now, the content of the descriptors is hard coded in the upgrade so that the behavior is not accidentally changed in the future.

Fixes: #99074

Release note: None

99180: builtins: add builtin functions which cast to OID to the distSQL block list r=michae2,cucaroach a=msirek

Distributed SQL which executes functions or casts to OID rely on `planner` receiver functions to execute internal SQL to get information about the OID from system tables. If these casts occur on a remote processor, the `planner` is not accessible and a dummy planner is used, which does not implement these receiver functions. To prevent internal errors, these casts or problem functions are added to a distSQL block list by `distSQLExprCheckVisitor`. A cast to an OID can also be done via a builtin function of the same name as the target type, e.g. `regproc`. These builtins do not currently have `DistsqlBlocklist` set, allowing distributed execution.

The solution is to mark `DistsqlBlocklist` as true for any builtin function which casts to an OID type.

Fixes #98373

Release note: None

99239: appstats: fix percentile greater than max latency r=maryliag a=maryliag

Part Of #99070
When an execution happens, its latency is added to a stream and then ordered so percentiles can be queried.
When getting the percentile values, we don't have the timestamp of when each value was added, meaning when we query the stream we could be getting values from a previous aggregation timestamp, if the current windows has very few executions (the stream has a limit, so we only have the most recent execution, but if the statement is not run frequently this stream can have old data).

The way this information is stored will need to be changed, but for now a patchy solution was added so we don't have the case where we show percentiles greater than the actual max.

Release note (bug fix): Add a check so percentiles are never greater than the max latency value.

99263: roachtest: copyfrom fix cluster package install r=aliher1911 a=aliher1911

When installing packages, look on cluster remoteness as proxy for arch instead of roachtest runtime which is runs different arch.

Epic: none

Release note: None

99278: sql: fix update helper optional from clause r=rytaft a=lyang24

fixes #98662

sanity testing:
output
<img width="265" alt="Screen Shot 2023-03-22 at 1 01 10 PM" src="https://user-images.githubusercontent.com/20375035/227027849-34f34bb4-d52b-4de4-8a5b-456ee8b27f1b.png">
sample sql
<img width="874" alt="Screen Shot 2023-03-22 at 1 10 15 PM" src="https://user-images.githubusercontent.com/20375035/227027874-3f6515f4-fdbe-4c64-9d6f-03eb9b5c67f3.png">


Release note (sql change): fix helper message on update sql to correctly position the optional from cause.

Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: Ryan Kuo <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Andrew Baptist <[email protected]>
Co-authored-by: Jeff <[email protected]>
Co-authored-by: Mark Sirek <[email protected]>
Co-authored-by: maryliag <[email protected]>
Co-authored-by: Oleg Afanasyev <[email protected]>
Co-authored-by: Eric.Yang <[email protected]>
@craig craig bot closed this as completed in b292729 Mar 22, 2023
jeffswenson added a commit to jeffswenson/cockroach that referenced this issue Apr 6, 2023
Previously, if a change was made to the system.sql_instances,
system.lease, or system.sqlliveness bootstrap schema, it would change
the behavior of the upgrade attached to the V23_1_SystemRbrReadNew
version gate.

Now, the content of the descriptors is hard coded in the upgrade so that
the behavior is not accidentally changed in the future.

Fixes: cockroachdb#99074

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multiregion Related to multi-region C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-shared-systems Shared Systems Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant