-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add support for optional frame exclusion #28262
Conversation
Please ignore the first commit here - it is in a separate PR. |
052dbdc
to
018d9ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately today I only have time for a casual review. At that level the overall structure is ok but I really can't comment on correctness. I will be able to do a more thorough review in a few day (unless you find another reviewer).
There is a pretty-print error though, I'm pretty sure.
Reviewed 9 of 9 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):
) } if wf.Exclusion != nil {
Are bounds mutually exclusive with an exclusion clause? (Casual inspection suggest they're not)
If not the code as you wrote it will print the wrong output (you're overriding d
).
If they are mutually exclusive please use else if
Also, make a test to illustrate this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is of least priority (I haven't thought of a use case for this frame exclusion), so no rush with the review. It depends on #28244, I definitely want to merge that one first.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):
Previously, knz (kena) wrote…
Are bounds mutually exclusive with an exclusion clause? (Casual inspection suggest they're not)
If not the code as you wrote it will print the wrong output (you're overriding
d
).
If they are mutually exclusive please useelse if
Also, make a test to illustrate this.
You're right - they are not mutually exclusive. But I'm not sure what the correct way to "append" information about frame exclusion and how to test it - I welcome any hints.
018d9ff
to
7a9046e
Compare
1701532
to
3da0a45
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@knz @nvanbenschoten @jordanlewis It is now ready for review, please take a look when you get a chance.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):
Previously, yuzefovich wrote…
You're right - they are not mutually exclusive. But I'm not sure what the correct way to "append" information about frame exclusion and how to test it - I welcome any hints.
@knz what is the correct way to include information about frame exclusion? And how would I test it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 15 of 15 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):
Previously, yuzefovich wrote…
@knz what is the correct way to include information about frame exclusion? And how would I test it?
simply: d = pretty.ConcatSpace(d, p.Doc(wf.Exclusion))
together with my other suggested changes
pkg/sql/sem/tree/pretty.go, line 713 at r3 (raw file):
} func (node *WindowFrameExclusion) doc(p *PrettyCfg) pretty.Doc {
Remove this function.
pkg/sql/sem/tree/select.go, line 771 at r3 (raw file):
switch *node { case ExcludeCurrentRow: ctx.WriteString(" EXCLUDE CURRENT ROW")
Factor the "EXCLUDE"
prefix to the top of the switch
. Do not prepend a space here.
pkg/sql/sem/tree/select.go, line 804 at r3 (raw file):
} if node.Exclusion != nil { ctx.FormatNode(node.Exclusion)
add ctx.WriteByte(' ')
before the call to FormatNode here.
3da0a45
to
935eaa6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):
Previously, knz (kena) wrote…
simply:
d = pretty.ConcatSpace(d, p.Doc(wf.Exclusion))
together with my other suggested changes
Thanks, this makes sense.
How would I test this pretty printing? Do I need to create a SQL file with a query in pkg/sql/sem/tree/testdata/pretty/
? How would I run it?
pkg/sql/sem/tree/pretty.go, line 713 at r3 (raw file):
Previously, knz (kena) wrote…
Remove this function.
Done.
pkg/sql/sem/tree/select.go, line 771 at r3 (raw file):
Previously, knz (kena) wrote…
Factor the
"EXCLUDE"
prefix to the top of theswitch
. Do not prepend a space here.
Done.
pkg/sql/sem/tree/select.go, line 804 at r3 (raw file):
Previously, knz (kena) wrote…
add
ctx.WriteByte(' ')
before the call to FormatNode here.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I currently have a bug with some non-aggregate window functions (like row_number
and rank
) - at the moment, they simply ignore frame exclusion. I will fix this shortly and will expand the test coverage.
Reviewable status: complete! 0 of 0 LGTMs obtained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):
Previously, yuzefovich wrote…
Thanks, this makes sense.
How would I test this pretty printing? Do I need to create a SQL file with a query in
pkg/sql/sem/tree/testdata/pretty/
? How would I run it?
Make an example in that directory yes, then run make test PKG=./pkg/sql/sem/tree TESTFLAGS=-rewrite-pretty
then add the new files (inspect them manually to see if they are to your liking)
935eaa6
to
52e482d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I take that back. Only first_value
, last_value
and nth_value
probably should take frame exclusion into consideration when computing window functions (confirmed this with PG 11 beta 2), and those are handled correctly. I added tests for those three (tests are not very diverse though because it's hard to make deterministic ones while considering boundaries of peer groups).
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):
Previously, knz (kena) wrote…
Make an example in that directory yes, then run
make test PKG=./pkg/sql/sem/tree TESTFLAGS=-rewrite-pretty
then add the new files (inspect them manually to see if they are to your liking)
Got it, thanks.
58f7014
to
e882c96
Compare
dbd9d37
to
cd1382a
Compare
@knz @jordanlewis @nvanbenschoten please take a look at this guy :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 12 of 23 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @knz, @nvanbenschoten, and @yuzefovich)
pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):
Previously, yuzefovich wrote…
Got it, thanks.
I think pretty.Stack
will look better than ConcatSpace
. Can you try it?
Now that work on window functions in the optimizer is done (at least, for the moment), let's merge this. PTAL. |
bbaaa51
to
da5c7b7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding some tests to optbuilder/testdata/window
and execbuilder/testdata/window
? Just to make sure the information is getting threaded through OK. Besides that opt changes
Reviewed 28 of 29 files at r5, 2 of 2 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @knz, @nvanbenschoten, and @yuzefovich)
pkg/sql/logictest/testdata/logic_test/window, line 3006 at r5 (raw file):
query RTR SELECT price, array_agg(price) OVER w, sum(price) OVER w FROM products WINDOW w AS (ORDER BY price RANGE UNBOUNDED PRECEDING EXCLUDE CURRENT ROW) ORDER BY price
nit: consider sqlfmting these queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, thanks for the suggestion.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @knz, and @nvanbenschoten)
pkg/sql/logictest/testdata/logic_test/window, line 3006 at r5 (raw file):
Previously, justinj (Justin Jaffray) wrote…
nit: consider sqlfmting these queries
Done.
da5c7b7
to
99d72d8
Compare
Friendly ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have enough knowledge of frame exclusion modes to properly assess the correctness of the feature, but the code is overall consistent with out best practices and the testing is adequate. I left a few minor comments but this is code-wise good to go.
I would suggest two things:
-
extend the commit message with a link to the pg docs where this feature is explained (if pg supports it now) otherwise docs from some other sql databases that do, so the doc team and our future selves have some reference to look at.
-
provide some commentary in the commit message about the algorithmic costs, so that we can build a mental model of what to expect in memory/cpu usage.
Reviewed 12 of 29 files at r5, 1 of 2 files at r6, 10 of 10 files at r7.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @nvanbenschoten, and @yuzefovich)
pkg/sql/distsqlpb/processors.go, line 89 at r7 (raw file):
*spec = WindowerSpec_Frame_EXCLUDE_TIES default: panic("unexpected WindowerFrameExclusion")
panic(errors.NewAssertionErrorf(...))
but really this function should return error
pkg/sql/distsqlpb/processors.go, line 207 at r7 (raw file):
// offset expressions if present in the frame. func (spec *WindowerSpec_Frame) InitFromAST(f *tree.WindowFrame, evalCtx *tree.EvalContext) error { spec.Mode.initFromAST(f.Mode)
and you should test the error return here
pkg/sql/distsqlpb/processors.go, line 253 at r7 (raw file):
return tree.ExcludeTies default: panic("unexpected WindowerSpec_Frame_Exclusion")
AssertionFailedf
pkg/sql/sem/builtins/window_frame_builtins.go, line 111 at r7 (raw file):
if !wfr.DefaultFrameExclusion() { // We cannot use sliding window approach because we have frame exclusion
nit: "we cannot use a sliding window approach because we have a frame exclusion clause"
pkg/sql/sem/builtins/window_frame_builtins.go, line 113 at r7 (raw file):
// We cannot use sliding window approach because we have frame exclusion // clause - some rows will be in and out of the frame which breaks the // necessary assumption, so we fallback to naive quadratic approach.
"to a naive"
pkg/sql/sem/tree/select.go, line 884 at r7 (raw file):
ctx.WriteString("TIES") default: panic(fmt.Sprintf("unhandled case: %d", node))
errors.AssertionFailedf
pkg/sql/sem/tree/window_funcs.go, line 526 at r7 (raw file):
// rows of the partition are inside of the window for each of the rows. func (wfr *WindowFrameRun) FullPartitionIsInWindow() bool { if wfr.Frame == nil || !wfr.NoFilter() || !wfr.DefaultFrameExclusion() {
Is it maybe worth putting this 3-component predicate into a method of WindowFrameRun
and use it everywhere. The risk to make a mistake throughout the code is currently too high IMHO.
pkg/sql/sem/tree/window_funcs.go, line 583 at r7 (raw file):
// the window frame of the current row. func (wfr *WindowFrameRun) isRowExcluded(idx int) bool { if wfr.Frame == nil || wfr.Frame.Exclusion == NoExclusion {
if wfr.DefaultFrameExclusion() {
pkg/sql/sem/tree/window_funcs.go, line 599 at r7 (raw file):
return curRowFirstPeerIdx <= idx && idx < curRowFirstPeerIdx+curRowPeerGroupRowCount && idx != wfr.RowIdx default: panic("unexpected WindowFrameExclusion")
AssertionFailedf
ba8b911
to
6280a56
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @justinj, @knz, and @nvanbenschoten)
pkg/sql/distsqlpb/processors.go, line 89 at r7 (raw file):
Previously, knz (kena) wrote…
panic(errors.NewAssertionErrorf(...))
but really this function should return
error
Done.
pkg/sql/distsqlpb/processors.go, line 207 at r7 (raw file):
Previously, knz (kena) wrote…
and you should test the error return here
Done.
pkg/sql/distsqlpb/processors.go, line 253 at r7 (raw file):
Previously, knz (kena) wrote…
AssertionFailedf
Done.
pkg/sql/sem/builtins/window_frame_builtins.go, line 111 at r7 (raw file):
Previously, knz (kena) wrote…
nit: "we cannot use a sliding window approach because we have a frame exclusion clause"
Done.
pkg/sql/sem/builtins/window_frame_builtins.go, line 113 at r7 (raw file):
Previously, knz (kena) wrote…
"to a naive"
Done.
pkg/sql/sem/tree/select.go, line 884 at r7 (raw file):
Previously, knz (kena) wrote…
errors.AssertionFailedf
Done.
pkg/sql/sem/tree/window_funcs.go, line 526 at r7 (raw file):
Previously, knz (kena) wrote…
Is it maybe worth putting this 3-component predicate into a method of
WindowFrameRun
and use it everywhere. The risk to make a mistake throughout the code is currently too high IMHO.
I think this particular predicate is used only here, but your suggestion prompted me to think more about whether checking if the filter is present should be a part of it (as well as in other places), and I'm not sure anymore that the current version is correct. It's a very tricky stuff.
pkg/sql/sem/tree/window_funcs.go, line 583 at r7 (raw file):
Previously, knz (kena) wrote…
if wfr.DefaultFrameExclusion() {
Done.
pkg/sql/sem/tree/window_funcs.go, line 599 at r7 (raw file):
Previously, knz (kena) wrote…
AssertionFailedf
Done.
6280a56
to
2d28fcf
Compare
Adding a do-not-merge label as a reminder for myself to think more about the filter clause. |
227f100
to
9f37330
Compare
Adds support for optional frame exclusion clause in the specification of window frames. This feature is described in https://www.postgresql.org/docs/12/sql-expressions.html. Whenever a non-default frame exclusion is present, it forces the execution to fall back to a naive quadratic approach. This means that for every row in a window frame we will compute the window function from scratch, without making use of any previous computations. Additionally, the presence of a non-default frame exclusion breaks necessary assumptions for utilizing a sliding window approach for min, max, sum, and avg functions, so those also fall back to a naive quadratic approach. Release note (sql change): CockroachDB now supports optional frame exclusion clause in the specification of window frames.
9f37330
to
b347575
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check for whether a filter is present is not needed in many cases where I added it in this PR, so I removed those checks.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @justinj, @knz, and @nvanbenschoten)
pkg/sql/sem/tree/window_funcs.go, line 526 at r7 (raw file):
Previously, yuzefovich wrote…
I think this particular predicate is used only here, but your suggestion prompted me to think more about whether checking if the filter is present should be a part of it (as well as in other places), and I'm not sure anymore that the current version is correct. It's a very tricky stuff.
Okay, so checking the filter here is not necessary.
Unless someone objects, I'll merge this on Monday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM. I'll trust you and other reviewers on the feature's correctness.
Reviewed 1 of 29 files at r5, 14 of 14 files at r8.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis)
Thanks everyone for the review! bors r+ |
28262: sql: add support for optional frame exclusion r=yuzefovich a=yuzefovich Adds support for optional frame exclusion clause in the specification of window frames. This feature is described in https://www.postgresql.org/docs/12/sql-expressions.html. Whenever a non-default frame exclusion is present, it forces the execution to fall back to a naive quadratic approach. This means that for every row in a window frame we will compute the window function from scratch, without making use of any previous computations. Additionally, the presence of a non-default frame exclusion breaks necessary assumptions for utilizing a sliding window approach for min, max, sum, and avg functions, so those also fall back to a naive quadratic approach. Closes: #27100. Release note (sql change): CockroachDB now supports optional frame exclusion clause in the specification of window frames. Co-authored-by: Yahor Yuzefovich <[email protected]>
Build succeeded |
Adds support for optional frame exclusion clause in the specification
of window frames. This feature is described in
https://www.postgresql.org/docs/12/sql-expressions.html.
Whenever a non-default frame exclusion is present, it forces the
execution to fall back to a naive quadratic approach. This means
that for every row in a window frame we will compute the window
function from scratch, without making use of any previous computations.
Additionally, the presence of a non-default frame exclusion breaks
necessary assumptions for utilizing a sliding window approach for
min, max, sum, and avg functions, so those also fall back to a naive
quadratic approach.
Closes: #27100.
Release note (sql change): CockroachDB now supports optional frame
exclusion clause in the specification of window frames.