-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: remove unnecessary index joins in partial index validation queries #70116
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
E-quick-win
Likely to be a quick win for someone experienced.
T-sql-queries
SQL Queries Team
Comments
mgartner
added
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
E-quick-win
Likely to be a quick win for someone experienced.
labels
Sep 13, 2021
craig bot
pushed a commit
that referenced
this issue
Sep 14, 2021
67737: opt: zigzag scan hints r=cucaroach a=cucaroach Fixes #35814 Enable table@{FORCE_ZIGZAG} and table@{FORCE_ZIGZAG=index} hints to lower the cost of zigzag plans to that they will be preferred over other plans. Note that this is a breaking change, if customer has an index called FORCE_ZIGZAG and is hinting with table@{FORCE_ZIGZAG} that will no longer do what it did before. Not sure what the ramifications of that are. Release note (sql change): Extend index scan hint to allow zigzag joins to be preferred. 69887: kv,migration: rm code handling legacy raft truncated state r=irfansharif a=irfansharif Fixes #66544. Fixes #66834. We fully migrated away from the replicated truncated state and the old range applied state keys (RaftAppliedIndexLegacyKey, LeaseAppliedIndexLegacyKey, and RangeStatsLegacyKey) in #58088. We're now guaranteed to never see the legacy truncated+applied state representations, making this code (finally) safe to delete. Release justification: cluster version cleanup Release note: None --- This turned out to be pretty hairy and invasive. I'll wait for a few CI runs and inspect the diffs here a bit more closely. Given the surface area here, I'm curious if we feel this is worth getting in during stability. Some questions to consider: - The various legacy keys that were removed were removed in their entirety. Our CLI to scan over store keys will no longer recognize these keys. Is that ok? - I don't think we need any checks to make sure we're not re-using some of these keys in the future -- but I'd like other opinions. - Looks like we re-purposed the LocalRaftTruncatedStateLegacySuffix for the unreplicated truncated key as well, so just renamed it more appropriately. - v21.1 nodes have to interop with v20.2 nodes where the legacy representation was possible, and it identified such cases through booleans in proto messages (SnapshotRequest.Header.UseUnreplicatedTruncatedState for example). This means that we still need to explicitly set them in v21.2 code. Crucially we'll stop reading these fields in v21.2 code, letting us remove these fields entirely in v22.1. - The treatment above does not extend to ReplicaState.UsingAppliedStateKey; that field was only being used to trigger an idempotent migration to start using the newer applied state key representation, not as a way to express that the replica should expect the legacy representation. Since [#58088](#58088) has already migrated every range in the system, we no longer need to trigger anything, and this field can be removed. 70059: roachtest: make --workload optional r=mgartner a=mgartner This commit makes the `--workload` flag optional because some roachtests do not require a workload binary. Release note: None 70120: opt: match ScalarGroupBy in EliminateIndexJoinOrProjectInsideGroupBy r=mgartner a=mgartner This commit extends EliminateIndexJoinOrProjectInsideGroupBy so that ScalarGroupBy expressions are also matched. This allows the rule to eliminate unnecessary index joins in more cases. The primary motivation for this change was to make partial index validation queries more efficient. These queries always have ScalarGroupBy expressions because they are in the form: SELECT count(1) FROM table@partial_index WHERE predicate Prior to this change, an index join was planned for these queries which would operate on every row in the partial index. This could be extremely expensive for large partial indexes. Fixes #70116 Release note (performance improvement): A limitation has been fixed that made creating partial indexes inefficient. 70162: ui: fix default behaviours of columns on stmt page on cc console r=maryliag a=maryliag When a CC Console user open the Statement page for the first time (no cache was created for column selector), this commits make sure that the default columns will be displayed. Fixes: #70160 Release justification: Category 4 Release note (bug fix): Default columns being displayed on Statements page on CC console when the user never made any selection Co-authored-by: Tommy Reilly <[email protected]> Co-authored-by: irfan sharif <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Marylia Gutierrez <[email protected]>
blathers-crl bot
pushed a commit
that referenced
this issue
Sep 14, 2021
This commit extends EliminateIndexJoinOrProjectInsideGroupBy so that ScalarGroupBy expressions are also matched. This allows the rule to eliminate unnecessary index joins in more cases. The primary motivation for this change was to make partial index validation queries more efficient. These queries always have ScalarGroupBy expressions because they are in the form: SELECT count(1) FROM table@partial_index WHERE predicate Prior to this change, an index join was planned for these queries which would operate on every row in the partial index. This could be extremely expensive for large partial indexes. Fixes #70116 Release note (performance improvement): A limitation has been fixed that made creating partial indexes inefficient.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
E-quick-win
Likely to be a quick win for someone experienced.
T-sql-queries
SQL Queries Team
A specific case of #54588 which is particularly impactful is when unnecessary index joins occur for the partial index validation query performed when adding a new partial index. This type of query results in an index join operating on every row in the partial index.
For example:
We should add an exploration rule to eliminate these type of index joins so that adding a partial index to a large table is not significantly more expensive than adding a non-partial index.
The workaround for now is to create a partial index that stores all the columns that are referenced in its predicate.
The text was updated successfully, but these errors were encountered: