
opt: add support for reverse scans #27110

Merged 1 commit into master on Jul 9, 2018

Conversation

madhavsuresh

The optimizer does not generate plans which include reverse scans.
This PR adds logical support for reverse scans.

Release note: None

@madhavsuresh madhavsuresh requested review from rytaft, RaduBerinde, andy-kimball and a team July 2, 2018 21:24
@madhavsuresh madhavsuresh requested a review from a team as a code owner July 2, 2018 21:24
@madhavsuresh madhavsuresh requested a review from a team July 2, 2018 21:24
@cockroach-teamcity
Member

This change is Reviewable

@madhavsuresh
Author

pkg/sql/opt/exec/execbuilder/testdata/distsql_numtables, line 146 at r1 (raw file):

SELECT "URL" FROM [EXPLAIN (DISTSQL) SELECT y FROM NumToStr WHERE y < 1000 OR y > 9000 ORDER BY y DESC LIMIT 5]
----
https://cockroachdb.github.io/distsqlplan/decode.html?eJyskTFr8zAQhvfvV4SbFRI58aLJa-AjKWm34kG1jiCwfeZ0hpbg_15sFVK3sZNCRp30vM_L6Qw1OdzbCgOYV9CgIIVcQcNUYAjE_Tg-2rl3MGsFvm5a6ce5goIYwZxBvJQIBl7sW4lHtA55tQYFDsX6cohu2FeWP7K6rYSCMCg4tGIWmYa8U0CtXGKD2BOC0Z26X_1MLMirZGzN9BIU_PeVl0U6KUomRZd8YoeM7nd83l1ps6clNavt-PWtGptRDX3_qvWDV31D_bXqzWNXfUV0xNBQHXAkmkpe9_-A7oTx3wK1XOATUzFo4vEwcMPAYZB4q-NhV8ervuB3WM_CyQjWP-FkFk7nzZtZeDsPb_9UO-_-fQYAAP__GoVT2w==

I wasn't entirely sure about this change. It seemed to make sense, but I'm confused as to why there is no annotation that the scan is a reverse scan. Additionally, I'm confused as to why there's only a scan on one node.


Comments from Reviewable

@rytaft
Collaborator

rytaft commented Jul 2, 2018

[nit] fix typo in commit message (support logical support)

Looks good to me, but I'd wait for Radu to comment on that strange DistSQL plan...


Reviewed 12 of 12 files at r1, 1 of 1 files at r2.
Review status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt/memo/memo_format.go, line 280 at r1 (raw file):

			fmt.Fprintf(f.buf, " %s@%s", tab.TabName(), tab.Index(t.Index).IdxName())
		}
		fmt.Fprintf(f.buf, ",rev=%t", t.Reverse)

To avoid cluttering the test files, I'd only include this if t.Reverse is true.


pkg/sql/opt/memo/private_defs.go, line 73 at r1 (raw file):

// ScanOpDef defines the value of the Def private field of the Scan operator.
type ScanOpDef struct {

[nit] extra line


pkg/sql/opt/memo/private_defs.go, line 114 at r1 (raw file):

			ordering.AppendCol(colID, !indexCol.Descending)
		} else {
			ordering.AppendCol(colID, indexCol.Descending)

You could do something like ordering.AppendCol(colID, indexCol.Descending != s.Reverse)

But I think your approach might actually be better since it's very clearly correct. Up to you....
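
The one-liner works because `!=` on booleans is XOR: the emitted direction is the index column's direction, flipped exactly when the scan is reversed. A hypothetical, self-contained sketch showing the two forms agree on every input:

```go
package main

import "fmt"

// appendDirection returns whether the output column is descending:
// the index column's direction, flipped when the scan runs in
// reverse. On booleans, != behaves as XOR.
func appendDirection(indexDescending, reverse bool) bool {
	return indexDescending != reverse
}

// verbose is the if/else form from the diff; both functions agree on
// every input.
func verbose(indexDescending, reverse bool) bool {
	if reverse {
		return !indexDescending
	}
	return indexDescending
}

func main() {
	for _, desc := range []bool{false, true} {
		for _, rev := range []bool{false, true} {
			fmt.Printf("desc=%v rev=%v -> %v\n", desc, rev, appendDirection(desc, rev))
		}
	}
}
```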


pkg/sql/opt/memo/private_defs.go, line 115 at r1 (raw file):

		} else {
			ordering.AppendCol(colID, indexCol.Descending)

[nit] extra line



@madhavsuresh madhavsuresh force-pushed the revscan branch 2 times, most recently from f05331f to 384f21b on July 2, 2018 22:01
@madhavsuresh
Author

TFTR! Fixed the typo, sounds good on waiting to discuss the DistSQL plan.




pkg/sql/opt/memo/memo_format.go, line 280 at r1 (raw file):

Previously, rytaft wrote…

To avoid cluttering the test files, I'd only include this if t.Reverse is true.

Done.


pkg/sql/opt/memo/private_defs.go, line 73 at r1 (raw file):

Previously, rytaft wrote…

[nit] extra line

Done.


pkg/sql/opt/memo/private_defs.go, line 114 at r1 (raw file):

Previously, rytaft wrote…

You could do something like ordering.AppendCol(colID, indexCol.Descending != s.Reverse)

But I think your approach might actually be better since it's very clearly correct. Up to you....

Yea I see what you're saying. However, at the risk of being too verbose, the current if statement is easier for me to understand.


pkg/sql/opt/memo/private_defs.go, line 115 at r1 (raw file):

Previously, rytaft wrote…

[nit] extra line

Done.



@justinj
Contributor

justinj commented Jul 3, 2018

We probably want to cost this as a bit more expensive than a normal scan, I think? I remember hearing anecdotally that they're ~3-5x slower than forward scans in RocksDB. That's definitely a concern that can be pushed down the road though (maybe consult Peter about the relative rate, I remember him talking about it a while back).




pkg/sql/opt/xform/custom_funcs.go, line 86 at r3 (raw file):

		// then construct a new Scan operator using that index.
		if scanOpDef.Cols.SubsetOf(indexCols) {
			newDef := &memo.ScanOpDef{Table: scanOpDef.Table, Index: i, Cols: scanOpDef.Cols, Reverse: false}

up to you, but since false is the zero-value of bool, you can omit the Reverse: false, which still seems reasonably semantic to me, since non-reverse is the "default".
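
A small illustration of the point (the `scanOpDef` here is a stripped-down hypothetical copy, not the real `memo.ScanOpDef`): omitting a bool field from a Go struct literal leaves it at its zero value, `false`.

```go
package main

import "fmt"

// scanOpDef is a hypothetical local copy mirroring the shape of the
// struct in the diff; the real one lives in pkg/sql/opt/memo.
type scanOpDef struct {
	Table   int
	Index   int
	Reverse bool
}

func main() {
	// Omitting Reverse leaves it at bool's zero value, false, so the
	// two literals below are identical.
	explicit := scanOpDef{Table: 53, Index: 2, Reverse: false}
	implicit := scanOpDef{Table: 53, Index: 2}
	fmt.Println(explicit == implicit) // structs with comparable fields compare field-wise
}
```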


pkg/sql/opt/xform/custom_funcs.go, line 109 at r3 (raw file):

				Index:   i,
				Cols:    scanCols,
				Reverse: false,

same here


pkg/sql/opt/xform/testdata/rules/scan, line 36 at r3 (raw file):

# --------------------------------------------------

# Revscan won't be used here because there is no index with f

If there was an f DESC index, wouldn't that be serviced by a forward scan?



@petermattis
Collaborator

We probably want to cost this as a bit more expensive than a normal scan, I think? I remember hearing anecdotally that they're ~3-5x slower than forward scans in RocksDB. That's definitely a concern that can be pushed down the road though (maybe consult Peter about the relative rate, I remember him talking about it a while back).

The performance diff depends on the number of historic versions. Assuming 2-3x slower seems reasonable (below, old is forward scan and new is reverse scan).

name                                                   old time/op    new time/op    delta
MVCCScan_RocksDB/rows=1000/versions=1/valueSize=8-8       203µs ± 1%     341µs ± 2%   +68.06%  (p=0.000 n=8+10)
MVCCScan_RocksDB/rows=1000/versions=2/valueSize=8-8       264µs ± 2%     742µs ± 4%  +181.42%  (p=0.000 n=10+10)
MVCCScan_RocksDB/rows=1000/versions=10/valueSize=8-8      700µs ± 2%    2346µs ± 2%  +235.15%  (p=0.000 n=10+10)
MVCCScan_RocksDB/rows=1000/versions=100/valueSize=8-8    4.54ms ± 1%    8.82ms ± 3%   +94.32%  (p=0.000 n=9+10)


@RaduBerinde
Member

This looks great! Nice work!

Yeah, I would add a 2x factor for reverse scans (grep for computeScanCost). This is a low-hanging improvement over the old planner.

Now that we have revscans, the next thing you can do (in another PR) is to make a rule analogous to ReplaceMinWithLimit but for MAX. That would resolve a few TODOs in tests.




pkg/sql/opt/exec/execbuilder/testdata/distsql_numtables, line 146 at r1 (raw file):

Previously, madhavsuresh wrote…

I wasn't entirely sure about this change. It seemed to make sense, but I'm confused by as to why there is no annotation that the scan is a reverse scan. Additionally, I'm confused as to why there's only a scan on one node.

This looks good. I checked, and there is no annotation for reverse vs non-reverse scans. It runs on one node because the limit gets pushed into the TableReader, which likely only needs to get results from that one node. Note that the node is correct: it is node 5, which stores the last ranges.

Note that TableReaders can do remote KVs if some rows are not on that node (it was an explicit design decision to not require strict placement for correctness).

BTW you can also check out the "old planner" versions of these tests: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/logictest/testdata/planner_test/distsql_numtables#L141


pkg/sql/opt/xform/custom_funcs.go, line 127 at r3 (raw file):

			}

			join := memo.MakeIndexJoinExpr(input, c.e.mem.InternIndexJoinDef(&def))

[nit] maybe call Intern just once since it's the same arg



@rytaft
Collaborator

rytaft commented Jul 3, 2018

Reviewed 3 of 3 files at r3.


pkg/sql/opt/memo/private_defs.go, line 114 at r1 (raw file):

Previously, madhavsuresh wrote…

Yea I see what you're saying. However, at the risk of being too verbose, the current if statement is easier for me to understand.

👍



@madhavsuresh
Author

TFTR! Added cost for reverse scans. After changing the cost, reverse scans don't seem favorable at all. This makes sense when the sort is in memory, but goes against my intuition for disk sorts.




pkg/sql/opt/xform/custom_funcs.go, line 86 at r3 (raw file):

Previously, justinj (Justin Jaffray) wrote…

up to you, but since false is the zero-value of bool, you can omit the Reverse: false, which still seems reasonably semantic to me, since non-reverse is the "default".

Done.


pkg/sql/opt/xform/custom_funcs.go, line 109 at r3 (raw file):

Previously, justinj (Justin Jaffray) wrote…

same here

Done.


pkg/sql/opt/xform/custom_funcs.go, line 127 at r3 (raw file):

Previously, RaduBerinde wrote…

[nit] maybe call Intern just once since it's the same arg

Done.


pkg/sql/opt/xform/testdata/rules/scan, line 36 at r3 (raw file):

Previously, justinj (Justin Jaffray) wrote…

If there was an f DESC index, wouldn't that be serviced by a forward scan?

You're right, thanks for catching this. I should update the comment to say there's no index on "f asc, k desc".



@rytaft
Collaborator

rytaft commented Jul 3, 2018

Yea, we haven't added costing for disk sorts yet. Maybe when you add the rule with MAX + limit 1, reverse scans will be favorable.



@andy-kimball
Contributor



pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file):

	def := candidate.Private(c.mem).(*memo.ScanOpDef)
	rowCount := memo.Cost(logical.Relational.Stats.RowCount)
	reverseMultiplier := memo.Cost(1)

Radu should weigh in here, but I don't think the multiplier should be applied to the IO cost, but instead to the CPU cost (i.e. rowScanCost portion). Iterating in reverse doesn't change how costly it is to fetch the rows from disk/network into memory, but instead on how costly it is to iterate once in memory.



@RaduBerinde
Member

This looks good to me. It looks like it is favored in the important cases: the ones where it allows the LIMIT to be pushed into the scan.

Perhaps for large tables (with stats) revscan will be favored over forward scan + sort (because of the log factor in the sort cost).



@RaduBerinde
Member



pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

Radu should weigh in here, but I don't think the multiplier should be applied to the IO cost, but instead to the CPU cost (i.e. rowScanCost portion). Iterating in reverse doesn't change how costly it is to fetch the rows from disk/network into memory, but instead on how costly it is to iterate once in memory.

In practice it is 2-3x more expensive overall, so a 2x factor for everything seems fine to me. I think the IO pattern is less efficient in this case, or we wouldn't see that big of a difference.

Note that a rev scan is not merely a forward scan + a reverse in-memory iteration. You can do a rev scan on a huge table to get just the last few rows and this doesn't read the entire table.



@madhavsuresh
Author



pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file):

Previously, RaduBerinde wrote…

In practice it is 2-3x more expensive overall, so a 2x factor for everything seems fine to me. I think the IO pattern is less efficient in this case, or we wouldn't see that big of a difference.

Note that a rev scan is not merely a forward scan + a reverse in-memory iteration. You can do a rev scan on a huge table to get just the last few rows and this doesn't read the entire table.

I based this off the table that Peter provided saying that there's ~2x performance penalty overall. Similar to what Radu said, the performance hit seemed, on average, a total factor of 2x. There is probably some expression which is a more accurate representation of the reverse scan cost - one that takes into account the CPU cost and the IO pattern independently. That seems challenging to construct given the limited information we have.



@petermattis
Collaborator



pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file):

Previously, madhavsuresh wrote…

I based this off the table that Peter provided saying that there's ~2x performance penalty overall. Similar to what Radu said, the performance hit seemed, on average, a total factor of 2x. There is probably some expression which is a more accurate representation of the reverse scan cost - one that takes into account the CPU cost and the IO pattern independently. That seems challenging to construct given the limited information we have.

In the benchmark numbers I posted, the perf diff between a forward and reverse scan is entirely CPU. The amount of IO performed is the same.



@madhavsuresh
Author



pkg/sql/opt/xform/coster.go, line 132 at r4 (raw file):

Previously, petermattis (Peter Mattis) wrote…

In the benchmark numbers I posted, the perf diff between a forward and reverse scan is entirely CPU. The amount of IO performed is the same.

Got it, thanks for clarifying! In that case, it makes sense to push the multiplier into the rowScanCost as @andy-kimball mentioned. The updated plans use the reverse scan in more cases.
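
A rough sketch of the resulting cost shape, with made-up constants (`seqIOCostFactor`, `cpuCostFactor`, and the 2x penalty here are illustrative assumptions; the real logic lives in computeScanCost in pkg/sql/opt/xform/coster.go): the reverse penalty multiplies only the per-row CPU term, leaving the IO term untouched.

```go
package main

import "fmt"

// Illustrative constants only: the real values and structure live in
// computeScanCost in pkg/sql/opt/xform/coster.go.
const (
	seqIOCostFactor = 1.0  // assumed per-row IO cost
	cpuCostFactor   = 0.25 // assumed per-row CPU cost
)

// scanCost applies the reverse-scan penalty only to the CPU portion:
// fetching the rows costs the same either way, but iterating them
// backward is more CPU work.
func scanCost(rowCount float64, reverse bool) float64 {
	cpu := rowCount * cpuCostFactor
	if reverse {
		cpu *= 2 // ~2x penalty, in line with the benchmarks above
	}
	return rowCount*seqIOCostFactor + cpu
}

func main() {
	fmt.Println(scanCost(1000, false)) // 1250
	fmt.Println(scanCost(1000, true))  // 1500
}
```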



@madhavsuresh madhavsuresh force-pushed the revscan branch 2 times, most recently from 17223f7 to 1890280 on July 3, 2018 18:54

@madhavsuresh madhavsuresh left a comment

ping @rytaft, @RaduBerinde


rytaft
rytaft previously requested changes Jul 9, 2018

@rytaft rytaft left a comment

:lgtm:

Reviewed 20 of 20 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/xform/custom_funcs.go, line 126 at r5 (raw file):

			}

			memoDef := c.e.mem.InternIndexJoinDef(&def)

[super nit] I think by convention we've been calling this private instead of memoDef

The optimizer does not generate plans which include reverse scans.
This PR adds logical support for reverse scans.

Release note: None

@madhavsuresh madhavsuresh left a comment

bors r+



pkg/sql/opt/xform/custom_funcs.go, line 126 at r5 (raw file):

Previously, rytaft wrote…

[super nit] I think by convention we've been calling this private instead of memoDef

Done.

@craig
Contributor

craig bot commented Jul 9, 2018

👎 Rejected by code reviews

@madhavsuresh madhavsuresh left a comment

bors r+


@craig
Contributor

craig bot commented Jul 9, 2018

👎 Rejected by code reviews

@madhavsuresh madhavsuresh left a comment

Dismissed @justinj from 3 discussions.

@madhavsuresh madhavsuresh dismissed rytaft’s stale review July 9, 2018 15:36

The comments were resolved; that this review still shows as unresolved seems to be a bug.

@madhavsuresh madhavsuresh left a comment

bors r+


craig bot pushed a commit that referenced this pull request Jul 9, 2018
27110: opt: add support for reverse scans r=madhavsuresh a=madhavsuresh

The optimizer does not generate plans which include reverse scans.
This PR adds logical support for reverse scans.

Release note: None

27145: ui: depend on babel-polyfill r=couchand a=benesch

babel-polyfill is the standard means of importing the necessary set of
polyfills for the configured babel preset. We were previously forced to
manually importing the polyfills we needed because the combination of
Node 6, JSDom, and babel-polyfill was broken. Since we're no longer
using JSDom (see 7e89701), we can use babel-polyfill.

Release note: None

27204: sql/parser: unreserve VIEW r=knz a=knz

Found while looking at #26993.

VIEW does not need to be a reserved keyword. In fact, it sounds
innocuous enough for users to want to use it as a column name. So make
it an unreserved keyword.

Release note (sql change): CockroachDB now allows clients to use the
word `view` as identifier like in PostgreSQL.

Co-authored-by: madhavsuresh <[email protected]>
Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
@craig
Contributor

craig bot commented Jul 9, 2018

Build succeeded

@craig craig bot merged commit 15a0507 into cockroachdb:master Jul 9, 2018
madhavsuresh pushed a commit to madhavsuresh/cockroach that referenced this pull request Jul 9, 2018
This PR is a follow up to cockroachdb#26905 and cockroachdb#27110. It adds an exploration
rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the
addition of reverse scans to the optimizer in cockroachdb#27110, the
MAX(x) -> LIMIT 1 transformation is now feasible.

Note that the exploration rule ReplaceMinWithLimit is effectively
copied over to ReplaceMaxWithLimit. Ideally OpName would
have been used and the rule would be in the form below. However,
due to limitations of OpName in exploration rules, this was
not feasible.

```
[ReplaceMinMaxWithLimit, Explore]
(GroupBy
    $input:*
    (Aggregations [$op:(Min|Max $variable:(Variable $col:*))] $cols:*)
    $def:* & (IsScalarGroupBy $def)
)
=>
(GroupBy
    (Limit
        (Select
            $input
            (Filters [(IsNot $variable (Null (AnyType)))])
        )
        (MakeOne)
        (MakeOrderingChoiceFromColumn (OpName $op) $col)
    )
    (Aggregations [(AnyNotNull $variable)] $cols)
    $def
)
```
madhavsuresh pushed a commit to madhavsuresh/cockroach that referenced this pull request Jul 9, 2018
madhavsuresh pushed a commit to madhavsuresh/cockroach that referenced this pull request Jul 9, 2018
madhavsuresh pushed a commit to madhavsuresh/cockroach that referenced this pull request Jul 9, 2018
madhavsuresh pushed a commit to madhavsuresh/cockroach that referenced this pull request Jul 10, 2018
craig bot pushed a commit that referenced this pull request Jul 10, 2018
27286: opt: exploration rule for MAX(x) on index r=madhavsuresh a=madhavsuresh

This PR is a follow up to #26905 and #27110. It adds an exploration
rule to replace MAX(x) with LIMIT 1 on a sorted index. Given the
addition of reverse scans to the optimizer in #27110, the
MAX(x) -> LIMIT 1 transformation is now feasible.

Note that the exploration rule ReplaceMinWithLimit is effectively
copied over to ReplaceMaxWithLimit. Ideally OpName would
have been used and the rule would be in the form below. However,
due to limitations of OpName in exploration rules, this was
not feasible.

```
[ReplaceMinMaxWithLimit, Explore]
(GroupBy
    $input:*
    (Aggregations [$op:(Min|Max $variable:(Variable $col:*))] $cols:*)
    $def:* & (IsScalarGroupBy $def)
)
=>
(GroupBy
    (Limit
        (Select
            $input
            (Filters [(IsNot $variable (Null (AnyType)))])
        )
        (MakeOne)
        (MakeOrderingChoiceFromColumn (OpName $op) $col)
    )
    (Aggregations [(AnyNotNull $variable)] $cols)
    $def
)
```

Co-authored-by: madhavsuresh <[email protected]>
andy-kimball pushed a commit to andy-kimball/cockroach that referenced this pull request Jul 10, 2018