Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add support for optional frame exclusion #28262

Merged
merged 1 commit into from
Aug 19, 2019

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Aug 3, 2018

Adds support for optional frame exclusion clause in the specification
of window frames. This feature is described in
https://www.postgresql.org/docs/12/sql-expressions.html.

Whenever a non-default frame exclusion is present, it forces the
execution to fall back to a naive quadratic approach. This means
that for every row in a window frame we will compute the window
function from scratch, without making use of any previous computations.
Additionally, the presence of a non-default frame exclusion breaks
necessary assumptions for utilizing a sliding window approach for
min, max, sum, and avg functions, so those also fall back to a naive
quadratic approach.

Closes: #27100.

Release note (sql change): CockroachDB now supports optional frame
exclusion clause in the specification of window frames.

@yuzefovich yuzefovich requested review from knz, nvanbenschoten and a team August 3, 2018 21:09
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich
Copy link
Member Author

Please ignore the first commit here - it is in a separate PR.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately today I only have time for a casual review. At that level the overall structure is ok but I really can't comment on correctness. I will be able to do a more thorough review in a few day (unless you find another reviewer).

There is a pretty-print error though, I'm pretty sure.

Reviewed 9 of 9 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):

		)
	}
	if wf.Exclusion != nil {

Are bounds mutually exclusive with an exclusion clause? (Casual inspection suggest they're not)

If not the code as you wrote it will print the wrong output (you're overriding d).
If they are mutually exclusive please use else if

Also, make a test to illustrate this.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is of least priority (I haven't thought of a use case for this frame exclusion), so no rush with the review. It depends on #28244, I definitely want to merge that one first.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):

Previously, knz (kena) wrote…

Are bounds mutually exclusive with an exclusion clause? (Casual inspection suggest they're not)

If not the code as you wrote it will print the wrong output (you're overriding d).
If they are mutually exclusive please use else if

Also, make a test to illustrate this.

You're right - they are not mutually exclusive. But I'm not sure what the correct way to "append" information about frame exclusion and how to test it - I welcome any hints.

@yuzefovich yuzefovich requested a review from a team August 15, 2018 18:43
@yuzefovich yuzefovich force-pushed the wf-exclusion branch 2 times, most recently from 1701532 to 3da0a45 Compare August 16, 2018 13:33
@yuzefovich yuzefovich requested a review from jordanlewis August 16, 2018 13:34
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knz @nvanbenschoten @jordanlewis It is now ready for review, please take a look when you get a chance.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):

Previously, yuzefovich wrote…

You're right - they are not mutually exclusive. But I'm not sure what the correct way to "append" information about frame exclusion and how to test it - I welcome any hints.

@knz what is the correct way to include information about frame exclusion? And how would I test it?

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 15 of 15 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):

Previously, yuzefovich wrote…

@knz what is the correct way to include information about frame exclusion? And how would I test it?

simply: d = pretty.ConcatSpace(d, p.Doc(wf.Exclusion))

together with my other suggested changes


pkg/sql/sem/tree/pretty.go, line 713 at r3 (raw file):

}

func (node *WindowFrameExclusion) doc(p *PrettyCfg) pretty.Doc {

Remove this function.


pkg/sql/sem/tree/select.go, line 771 at r3 (raw file):

	switch *node {
	case ExcludeCurrentRow:
		ctx.WriteString(" EXCLUDE CURRENT ROW")

Factor the "EXCLUDE" prefix to the top of the switch. Do not prepend a space here.


pkg/sql/sem/tree/select.go, line 804 at r3 (raw file):

	}
	if node.Exclusion != nil {
		ctx.FormatNode(node.Exclusion)

add ctx.WriteByte(' ') before the call to FormatNode here.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):

Previously, knz (kena) wrote…

simply: d = pretty.ConcatSpace(d, p.Doc(wf.Exclusion))

together with my other suggested changes

Thanks, this makes sense.

How would I test this pretty printing? Do I need to create a SQL file with a query in pkg/sql/sem/tree/testdata/pretty/? How would I run it?


pkg/sql/sem/tree/pretty.go, line 713 at r3 (raw file):

Previously, knz (kena) wrote…

Remove this function.

Done.


pkg/sql/sem/tree/select.go, line 771 at r3 (raw file):

Previously, knz (kena) wrote…

Factor the "EXCLUDE" prefix to the top of the switch. Do not prepend a space here.

Done.


pkg/sql/sem/tree/select.go, line 804 at r3 (raw file):

Previously, knz (kena) wrote…

add ctx.WriteByte(' ') before the call to FormatNode here.

Done.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I currently have a bug with some non-aggregate window functions (like row_number and rank) - at the moment, they simply ignore frame exclusion. I will fix this shortly and will expand the test coverage.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):

Previously, yuzefovich wrote…

Thanks, this makes sense.

How would I test this pretty printing? Do I need to create a SQL file with a query in pkg/sql/sem/tree/testdata/pretty/? How would I run it?

Make an example in that directory yes, then run make test PKG=./pkg/sql/sem/tree TESTFLAGS=-rewrite-pretty

then add the new files (inspect them manually to see if they are to your liking)

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I take that back. Only first_value, last_value and nth_value probably should take frame exclusion into consideration when computing window functions (confirmed this with PG 11 beta 2), and those are handled correctly. I added tests for those three (tests are not very diverse though because it's hard to make deterministic ones while considering boundaries of peer groups).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):

Previously, knz (kena) wrote…

Make an example in that directory yes, then run make test PKG=./pkg/sql/sem/tree TESTFLAGS=-rewrite-pretty

then add the new files (inspect them manually to see if they are to your liking)

Got it, thanks.

@yuzefovich yuzefovich force-pushed the wf-exclusion branch 2 times, most recently from 58f7014 to e882c96 Compare August 16, 2018 18:27
@yuzefovich yuzefovich force-pushed the wf-exclusion branch 2 times, most recently from dbd9d37 to cd1382a Compare April 12, 2019 16:25
@yuzefovich
Copy link
Member Author

@knz @jordanlewis @nvanbenschoten please take a look at this guy :)

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 12 of 23 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @knz, @nvanbenschoten, and @yuzefovich)


pkg/sql/sem/tree/pretty.go, line 687 at r2 (raw file):

Previously, yuzefovich wrote…

Got it, thanks.

I think pretty.Stack will look better than ConcatSpace. Can you try it?

@yuzefovich
Copy link
Member Author

Now that work on window functions in the optimizer is done (at least, for the moment), let's merge this. PTAL.

@yuzefovich yuzefovich force-pushed the wf-exclusion branch 2 times, most recently from bbaaa51 to da5c7b7 Compare August 5, 2019 14:44
Copy link
Contributor

@justinj justinj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding some tests to optbuilder/testdata/window and execbuilder/testdata/window? Just to make sure the information is getting threaded through OK. Besides that opt changes :lgtm:

Reviewed 28 of 29 files at r5, 2 of 2 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @knz, @nvanbenschoten, and @yuzefovich)


pkg/sql/logictest/testdata/logic_test/window, line 3006 at r5 (raw file):


query RTR
SELECT price, array_agg(price) OVER w, sum(price) OVER w FROM products WINDOW w AS (ORDER BY price RANGE UNBOUNDED PRECEDING EXCLUDE CURRENT ROW) ORDER BY price

nit: consider sqlfmting these queries

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thanks for the suggestion.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @knz, and @nvanbenschoten)


pkg/sql/logictest/testdata/logic_test/window, line 3006 at r5 (raw file):

Previously, justinj (Justin Jaffray) wrote…

nit: consider sqlfmting these queries

Done.

@yuzefovich
Copy link
Member Author

Friendly ping.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have enough knowledge of frame exclusion modes to properly assess the correctness of the feature, but the code is overall consistent with out best practices and the testing is adequate. I left a few minor comments but this is code-wise good to go.

I would suggest two things:

  1. extend the commit message with a link to the pg docs where this feature is explained (if pg supports it now) otherwise docs from some other sql databases that do, so the doc team and our future selves have some reference to look at.

  2. provide some commentary in the commit message about the algorithmic costs, so that we can build a mental model of what to expect in memory/cpu usage.

Reviewed 12 of 29 files at r5, 1 of 2 files at r6, 10 of 10 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @nvanbenschoten, and @yuzefovich)


pkg/sql/distsqlpb/processors.go, line 89 at r7 (raw file):

		*spec = WindowerSpec_Frame_EXCLUDE_TIES
	default:
		panic("unexpected WindowerFrameExclusion")

panic(errors.NewAssertionErrorf(...))

but really this function should return error


pkg/sql/distsqlpb/processors.go, line 207 at r7 (raw file):

// offset expressions if present in the frame.
func (spec *WindowerSpec_Frame) InitFromAST(f *tree.WindowFrame, evalCtx *tree.EvalContext) error {
	spec.Mode.initFromAST(f.Mode)

and you should test the error return here


pkg/sql/distsqlpb/processors.go, line 253 at r7 (raw file):

		return tree.ExcludeTies
	default:
		panic("unexpected WindowerSpec_Frame_Exclusion")

AssertionFailedf


pkg/sql/sem/builtins/window_frame_builtins.go, line 111 at r7 (raw file):

	if !wfr.DefaultFrameExclusion() {
		// We cannot use sliding window approach because we have frame exclusion

nit: "we cannot use a sliding window approach because we have a frame exclusion clause"


pkg/sql/sem/builtins/window_frame_builtins.go, line 113 at r7 (raw file):

		// We cannot use sliding window approach because we have frame exclusion
		// clause - some rows will be in and out of the frame which breaks the
		// necessary assumption, so we fallback to naive quadratic approach.

"to a naive"


pkg/sql/sem/tree/select.go, line 884 at r7 (raw file):

		ctx.WriteString("TIES")
	default:
		panic(fmt.Sprintf("unhandled case: %d", node))

errors.AssertionFailedf


pkg/sql/sem/tree/window_funcs.go, line 526 at r7 (raw file):

// rows of the partition are inside of the window for each of the rows.
func (wfr *WindowFrameRun) FullPartitionIsInWindow() bool {
	if wfr.Frame == nil || !wfr.NoFilter() || !wfr.DefaultFrameExclusion() {

Is it maybe worth putting this 3-component predicate into a method of WindowFrameRun and use it everywhere. The risk to make a mistake throughout the code is currently too high IMHO.


pkg/sql/sem/tree/window_funcs.go, line 583 at r7 (raw file):

// the window frame of the current row.
func (wfr *WindowFrameRun) isRowExcluded(idx int) bool {
	if wfr.Frame == nil || wfr.Frame.Exclusion == NoExclusion {

if wfr.DefaultFrameExclusion() {


pkg/sql/sem/tree/window_funcs.go, line 599 at r7 (raw file):

		return curRowFirstPeerIdx <= idx && idx < curRowFirstPeerIdx+curRowPeerGroupRowCount && idx != wfr.RowIdx
	default:
		panic("unexpected WindowFrameExclusion")

AssertionFailedf

@yuzefovich yuzefovich force-pushed the wf-exclusion branch 3 times, most recently from ba8b911 to 6280a56 Compare August 16, 2019 14:56
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @justinj, @knz, and @nvanbenschoten)


pkg/sql/distsqlpb/processors.go, line 89 at r7 (raw file):

Previously, knz (kena) wrote…

panic(errors.NewAssertionErrorf(...))

but really this function should return error

Done.


pkg/sql/distsqlpb/processors.go, line 207 at r7 (raw file):

Previously, knz (kena) wrote…

and you should test the error return here

Done.


pkg/sql/distsqlpb/processors.go, line 253 at r7 (raw file):

Previously, knz (kena) wrote…

AssertionFailedf

Done.


pkg/sql/sem/builtins/window_frame_builtins.go, line 111 at r7 (raw file):

Previously, knz (kena) wrote…

nit: "we cannot use a sliding window approach because we have a frame exclusion clause"

Done.


pkg/sql/sem/builtins/window_frame_builtins.go, line 113 at r7 (raw file):

Previously, knz (kena) wrote…

"to a naive"

Done.


pkg/sql/sem/tree/select.go, line 884 at r7 (raw file):

Previously, knz (kena) wrote…

errors.AssertionFailedf

Done.


pkg/sql/sem/tree/window_funcs.go, line 526 at r7 (raw file):

Previously, knz (kena) wrote…

Is it maybe worth putting this 3-component predicate into a method of WindowFrameRun and use it everywhere. The risk to make a mistake throughout the code is currently too high IMHO.

I think this particular predicate is used only here, but your suggestion prompted me to think more about whether checking if the filter is present should be a part of it (as well as in other places), and I'm not sure anymore that the current version is correct. It's a very tricky stuff.


pkg/sql/sem/tree/window_funcs.go, line 583 at r7 (raw file):

Previously, knz (kena) wrote…

if wfr.DefaultFrameExclusion() {

Done.


pkg/sql/sem/tree/window_funcs.go, line 599 at r7 (raw file):

Previously, knz (kena) wrote…

AssertionFailedf

Done.

@yuzefovich yuzefovich changed the title sql: Add support for optional frame exclusion sql: add support for optional frame exclusion Aug 16, 2019
@yuzefovich yuzefovich added the do-not-merge bors won't merge a PR with this label. label Aug 16, 2019
@yuzefovich
Copy link
Member Author

Adding a do-not-merge label as a reminder for myself to think more about the filter clause.

@yuzefovich yuzefovich force-pushed the wf-exclusion branch 2 times, most recently from 227f100 to 9f37330 Compare August 17, 2019 22:08
@yuzefovich yuzefovich removed the do-not-merge bors won't merge a PR with this label. label Aug 17, 2019
Adds support for optional frame exclusion clause in the specification
of window frames. This feature is described in
https://www.postgresql.org/docs/12/sql-expressions.html.

Whenever a non-default frame exclusion is present, it forces the
execution to fall back to a naive quadratic approach. This means
that for every row in a window frame we will compute the window
function from scratch, without making use of any previous computations.
Additionally, the presence of a non-default frame exclusion breaks
necessary assumptions for utilizing a sliding window approach for
min, max, sum, and avg functions, so those also fall back to a naive
quadratic approach.

Release note (sql change): CockroachDB now supports optional frame
exclusion clause in the specification of window frames.
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for whether a filter is present is not needed in many cases where I added it in this PR, so I removed those checks.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @justinj, @knz, and @nvanbenschoten)


pkg/sql/sem/tree/window_funcs.go, line 526 at r7 (raw file):

Previously, yuzefovich wrote…

I think this particular predicate is used only here, but your suggestion prompted me to think more about whether checking if the filter is present should be a part of it (as well as in other places), and I'm not sure anymore that the current version is correct. It's a very tricky stuff.

Okay, so checking the filter here is not necessary.

@yuzefovich
Copy link
Member Author

Unless someone objects, I'll merge this on Monday.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM. I'll trust you and other reviewers on the feature's correctness.

Reviewed 1 of 29 files at r5, 14 of 14 files at r8.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis)

@yuzefovich
Copy link
Member Author

Thanks everyone for the review!

bors r+

craig bot pushed a commit that referenced this pull request Aug 19, 2019
28262: sql: add support for optional frame exclusion r=yuzefovich a=yuzefovich

Adds support for optional frame exclusion clause in the specification
of window frames. This feature is described in
https://www.postgresql.org/docs/12/sql-expressions.html.

Whenever a non-default frame exclusion is present, it forces the
execution to fall back to a naive quadratic approach. This means
that for every row in a window frame we will compute the window
function from scratch, without making use of any previous computations.
Additionally, the presence of a non-default frame exclusion breaks
necessary assumptions for utilizing a sliding window approach for
min, max, sum, and avg functions, so those also fall back to a naive
quadratic approach.

Closes: #27100.

Release note (sql change): CockroachDB now supports optional frame
exclusion clause in the specification of window frames.

Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 19, 2019

Build succeeded

@craig craig bot merged commit b347575 into cockroachdb:master Aug 19, 2019
@yuzefovich yuzefovich deleted the wf-exclusion branch August 30, 2019 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: add full support for window frames
7 participants