-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: add support for reverse scans #27110
Conversation
pkg/sql/opt/exec/execbuilder/testdata/distsql_numtables, line 146 at r1 (raw file):
I wasn't entirely sure about this change. It seemed to make sense, but I'm confused by as to why there is no annotation that the scan is a reverse scan. Additionally, I'm confused as to why there's only a scan on one node. Comments from Reviewable |
[nit] fix typo in commit message (support logical support) Looks good to me, but I'd wait for Radu to comment on that strange DistSQL plan... Reviewed 12 of 12 files at r1, 1 of 1 files at r2. pkg/sql/opt/memo/memo_format.go, line 280 at r1 (raw file):
To avoid cluttering the test files, I'd only include this if pkg/sql/opt/memo/private_defs.go, line 73 at r1 (raw file):
[nit] extra line pkg/sql/opt/memo/private_defs.go, line 114 at r1 (raw file):
You could do something like But I think your approach might actually be better since it's very clearly correct. Up to you.... pkg/sql/opt/memo/private_defs.go, line 115 at r1 (raw file):
[nit] extra line Comments from Reviewable |
f05331f
to
384f21b
Compare
TFTR! Fixed the typo, sounds good on waiting to discuss the DistSQL plan. Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/memo/memo_format.go, line 280 at r1 (raw file): Previously, rytaft wrote…
Done. pkg/sql/opt/memo/private_defs.go, line 73 at r1 (raw file): Previously, rytaft wrote…
Done. pkg/sql/opt/memo/private_defs.go, line 114 at r1 (raw file): Previously, rytaft wrote…
Yea I see what you're saying. However, at the risk of being too verbose, the current if statement is easier for me to understand. pkg/sql/opt/memo/private_defs.go, line 115 at r1 (raw file): Previously, rytaft wrote…
Done. Comments from Reviewable |
We probably want to cost this as a bit more expensive than a normal scan, I think? I remember hearing anecdotally that they're ~3-5x slower than forward scans in RocksDB. That's definitely a concern that can be pushed down the road though (maybe consult Peter about the relative rate, I remember him talking about it a while back). Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/xform/custom_funcs.go, line 86 at r3 (raw file):
up to you, but since pkg/sql/opt/xform/custom_funcs.go, line 109 at r3 (raw file):
same here pkg/sql/opt/xform/testdata/rules/scan, line 36 at r3 (raw file):
If there was an Comments from Reviewable |
The performance diff depends on the number of historic versions. Assuming 2-3x slower seems reasonable (below, old is forward scan and new is reverse scan).
Review status: complete! 0 of 0 LGTMs obtained Comments from Reviewable |
This looks great! Nice work! Yeah, I would add a 2x factor for reverse scans (grep for Now that we have revscans, the next thing you can do (in another PR) is to make a rule analogous to Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/exec/execbuilder/testdata/distsql_numtables, line 146 at r1 (raw file): Previously, madhavsuresh wrote…
This looks goo. I checked and there is no annotation for reverse vs non-reverse. It runs on one node because the limit gets pushed into the TableReader and it likely will only need to get results from that one node. Note that the node is correct, it is node 5 which stores the last ranges. Note that TableReaders can do remote KVs if some rows are not on that node (it was an explicit design decision to not require strict placement for correctness). BTW you can also check out the "old planner" versions of these tests: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/logictest/testdata/planner_test/distsql_numtables#L141 pkg/sql/opt/xform/custom_funcs.go, line 127 at r3 (raw file):
[nit] maybe call Comments from Reviewable |
Reviewed 3 of 3 files at r3. pkg/sql/opt/memo/private_defs.go, line 114 at r1 (raw file): Previously, madhavsuresh wrote…
👍 Comments from Reviewable |
TFTR! Added cost for reverse scans. After changing the cost, reverse scans don't seem favorable at all. This makes sense when the sort is in memory, but goes against my intuition for disk sorts. Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/xform/custom_funcs.go, line 86 at r3 (raw file): Previously, justinj (Justin Jaffray) wrote…
Done. pkg/sql/opt/xform/custom_funcs.go, line 109 at r3 (raw file): Previously, justinj (Justin Jaffray) wrote…
Done. pkg/sql/opt/xform/custom_funcs.go, line 127 at r3 (raw file): Previously, RaduBerinde wrote…
Done. pkg/sql/opt/xform/testdata/rules/scan, line 36 at r3 (raw file): Previously, justinj (Justin Jaffray) wrote…
You're right, thanks for catching this. Should update it to, there's no index with "f asc, k desc" Comments from Reviewable |
Yea, we haven't added costing for disk sorts yet. Maybe when you add the rule with MAX + limit 1, reverse scans will be favorable. Comments from Reviewable |
Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file):
Radu should weigh in here, but I don't think the multiplier should be applied to the IO cost, but instead to the CPU cost (i.e. Comments from Reviewable |
This looks good to me. It looks that it is favored in the important cases - the ones where it allows LIMIT to be pushed into the scan. Perhaps for large tables (with stats) revscan will be favored over forward scan + sort (because of the Review status: complete! 0 of 0 LGTMs obtained Comments from Reviewable |
Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file): Previously, andy-kimball (Andy Kimball) wrote…
In practice it is 2-3x more expensive overall, so a 2x factor for everything seems fine to me. I think the IO pattern is less efficient in this case, or we wouldn't see that big of a difference. Note that a rev scan is not merely a forward scan + a reverse in-memory iteration. You can do a rev scan on a huge table to get just the last few rows and this doesn't read the entire table. Comments from Reviewable |
Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file): Previously, RaduBerinde wrote…
I based this off the table that Peter provided saying that there's ~2x performance penalty overall. Similar to what Radu said, the performance hit seemed, on average, a total factor of 2x. There is probably some expression which is a more accurate representation of the reverse scan cost - one that takes into account the CPU cost and the IO pattern independently. That seems challenging to construct given the limited information we have. Comments from Reviewable |
Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file): Previously, madhavsuresh wrote…
In the benchmark numbers I posted, the perf diff between a forward and reverse scan is entirely CPU. The amount of IO performed is the same. Comments from Reviewable |
Review status: complete! 0 of 0 LGTMs obtained pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file): Previously, petermattis (Peter Mattis) wrote…
Got it, thanks for clarifying! In that case, it makes sense to push in the multiplier to the rowScanCost as @andy-kimball mentioned. The updated plans use the reverse scan in more cases. Comments from Reviewable |
17223f7
to
1890280
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @rytaft, @RaduBerinde
Reviewable status: complete! 0 of 0 LGTMs obtained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 20 of 20 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/sql/opt/xform/custom_funcs.go, line 126 at r5 (raw file):
} memoDef := c.e.mem.InternIndexJoinDef(&def)
[super nit] I think by convention we've been calling this private
instead of memoDef
The optimizer does not generate plans which include reverse scans. This PR adds logical support for reverse scans. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/sql/opt/xform/custom_funcs.go, line 126 at r5 (raw file):
Previously, rytaft wrote…
[super nit] I think by convention we've been calling this
private
instead ofmemoDef
Done.
👎 Rejected by code reviews |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
👎 Rejected by code reviews |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dismissed @justinj from 3 discussions.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
the comments were resolved, that this is unresolved seems to be a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
27110: opt: add support for reverse scans r=madhavsuresh a=madhavsuresh The optimizer does not generate plans which include reverse scans. This PR adds support logical support for reverse scans. Release note: None 27145: ui: depend on babel-polyfill r=couchand a=benesch babel-polyfill is the standard means of importing the necessary set of polyfills for the configured babel preset. We were previously forced to manually importing the polyfills we needed because the combination of Node 6, JSDom, and babel-polyfill was broken. Since we're no longer using JSDom (see 7e89701), we can use babel-polyfill. Release note: None 27204: sql/parser: unreserve VIEW r=knz a=knz Found while looking at #26993. VIEW does not need to be a reserved keyword. In fact, it sounds innocuous enough for users to want to use it as a column name. So make it an unreserved keyword. Release note (sql change): CockroachDB now allows clients to use the word `view` as identifier like in PostgreSQL. Co-authored-by: madhavsuresh <[email protected]> Co-authored-by: Nikhil Benesch <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]>
Build succeeded |
This PR is a follow up to cockroachdb#26905 and cockroachdb#27110. It adds an exploration rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the a ddition of reverse scans to the optimizer in cockroachdb#27110, the MAX(x) -> LIMIT 1 transformation is now feasible. Note that the exploration rule ReplaceMinWithLimit is effectively copied over to ReplaceMaxWithLimit. Ideally OpName would have been used and the rule would be in the form below. However, due to limitations of OpName in explorations rules, this was not feasible. ``` [ReplaceMinMaxWithLimit, Explore] (GroupBy $input:* (Aggregations [$op:(Max|Max $variable:(Variable $col:*))] $cols:*) $def:* & (IsScalarGroupBy $def) ) => (GroupBy (Limit (Select $input (Filters [(IsNot $variable (Null (AnyType)))]) ) (MakeOne) (MakeOrderingChoiceFromColumn (OpName $op) $col) ) (Aggregations [(AnyNotNull $variable)] $cols) $def ) ```
This PR is a follow up to cockroachdb#26905 and cockroachdb#27110. It adds an exploration rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the a ddition of reverse scans to the optimizer in cockroachdb#27110, the MAX(x) -> LIMIT 1 transformation is now feasible. Note that the exploration rule ReplaceMinWithLimit is effectively copied over to ReplaceMaxWithLimit. Ideally OpName would have been used and the rule would be in the form below. However, due to limitations of OpName in explorations rules, this was not feasible. ``` [ReplaceMinMaxWithLimit, Explore] (GroupBy $input:* (Aggregations [$op:(Max|Max $variable:(Variable $col:*))] $cols:*) $def:* & (IsScalarGroupBy $def) ) => (GroupBy (Limit (Select $input (Filters [(IsNot $variable (Null (AnyType)))]) ) (MakeOne) (MakeOrderingChoiceFromColumn (OpName $op) $col) ) (Aggregations [(AnyNotNull $variable)] $cols) $def ) ```
This PR is a follow up to cockroachdb#26905 and cockroachdb#27110. It adds an exploration rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the addition of reverse scans to the optimizer in cockroachdb#27110, the MAX(x) -> LIMIT 1 transformation is now feasible. Note that the exploration rule ReplaceMinWithLimit is effectively copied over to ReplaceMaxWithLimit. Ideally OpName would have been used and the rule would be in the form below. However, due to limitations of OpName in explorations rules, this was not feasible. ``` [ReplaceMinMaxWithLimit, Explore] (GroupBy $input:* (Aggregations [$op:(Max|Max $variable:(Variable $col:*))] $cols:*) $def:* & (IsScalarGroupBy $def) ) => (GroupBy (Limit (Select $input (Filters [(IsNot $variable (Null (AnyType)))]) ) (MakeOne) (MakeOrderingChoiceFromColumn (OpName $op) $col) ) (Aggregations [(AnyNotNull $variable)] $cols) $def ) ```
This PR is a follow up to cockroachdb#26905 and cockroachdb#27110. It adds an exploration rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the addition of reverse scans to the optimizer in cockroachdb#27110, the MAX(x) -> LIMIT 1 transformation is now feasible. Note that the exploration rule ReplaceMinWithLimit is effectively copied over to ReplaceMaxWithLimit. Ideally OpName would have been used and the rule would be in the form below. However, due to limitations of OpName in explorations rules, this was not feasible. ``` [ReplaceMinMaxWithLimit, Explore] (GroupBy $input:* (Aggregations [$op:(Max|Max $variable:(Variable $col:*))] $cols:*) $def:* & (IsScalarGroupBy $def) ) => (GroupBy (Limit (Select $input (Filters [(IsNot $variable (Null (AnyType)))]) ) (MakeOne) (MakeOrderingChoiceFromColumn (OpName $op) $col) ) (Aggregations [(AnyNotNull $variable)] $cols) $def ) ```
This PR is a follow up to cockroachdb#26905 and cockroachdb#27110. It adds an exploration rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the addition of reverse scans to the optimizer in cockroachdb#27110, the MAX(x) -> LIMIT 1 transformation is now feasible. Note that the exploration rule ReplaceMinWithLimit is effectively copied over to ReplaceMaxWithLimit. Ideally OpName would have been used and the rule would be in the form below. However, due to limitations of OpName in explorations rules, this was not feasible. ``` [ReplaceMinMaxWithLimit, Explore] (GroupBy $input:* (Aggregations [$op:(Max|Max $variable:(Variable $col:*))] $cols:*) $def:* & (IsScalarGroupBy $def) ) => (GroupBy (Limit (Select $input (Filters [(IsNot $variable (Null (AnyType)))]) ) (MakeOne) (MakeOrderingChoiceFromColumn (OpName $op) $col) ) (Aggregations [(AnyNotNull $variable)] $cols) $def ) ```
27286: opt: exploration rule for MAX(x) on index r=madhavsuresh a=madhavsuresh This PR is a follow up to #26905 and #27110. It adds an exploration rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the a ddition of reverse scans to the optimizer in #27110, the MAX(x) -> LIMIT 1 transformation is now feasible. Note that the exploration rule ReplaceMinWithLimit is effectively copied over to ReplaceMaxWithLimit. Ideally OpName would have been used and the rule would be in the form below. However, due to limitations of OpName in explorations rules, this was not feasible. ``` [ReplaceMinMaxWithLimit, Explore] (GroupBy $input:* (Aggregations [$op:(Max|Max $variable:(Variable $col:*))] $cols:*) $def:* & (IsScalarGroupBy $def) ) => (GroupBy (Limit (Select $input (Filters [(IsNot $variable (Null (AnyType)))]) ) (MakeOne) (MakeOrderingChoiceFromColumn (OpName $op) $col) ) (Aggregations [(AnyNotNull $variable)] $cols) $def ) ``` Co-authored-by: madhavsuresh <[email protected]>
This PR is a follow up to cockroachdb#26905 and cockroachdb#27110. It adds an exploration rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the addition of reverse scans to the optimizer in cockroachdb#27110, the MAX(x) -> LIMIT 1 transformation is now feasible. Note that the exploration rule ReplaceMinWithLimit is effectively copied over to ReplaceMaxWithLimit. Ideally OpName would have been used and the rule would be in the form below. However, due to limitations of OpName in explorations rules, this was not feasible. ``` [ReplaceMinMaxWithLimit, Explore] (GroupBy $input:* (Aggregations [$op:(Max|Max $variable:(Variable $col:*))] $cols:*) $def:* & (IsScalarGroupBy $def) ) => (GroupBy (Limit (Select $input (Filters [(IsNot $variable (Null (AnyType)))]) ) (MakeOne) (MakeOrderingChoiceFromColumn (OpName $op) $col) ) (Aggregations [(AnyNotNull $variable)] $cols) $def ) ```
The optimizer does not generate plans which include reverse scans.
This PR adds support logical support for reverse scans.
Release note: None