Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: add support for aggregate FILTER #34077

Merged
merged 1 commit into from
Jan 22, 2019
Merged

Conversation

justinj
Copy link
Contributor

@justinj justinj commented Jan 17, 2019

This commit introduces support for FILTERing individual aggregates in a
group by.

One unintuitive change is that if we have a COUNT(*) with a FILTER, we
need to synthesize an input for the aggregation to be over, because
otherwise we have no input to hang the AggFilter off of.

Thus, we convert
COUNT(*) FILTER (WHERE foo)
to
COUNT(true) FILTER (WHERE foo).

This isn't a problem for DISTINCT because COUNT(DISTINCT *) is not valid.

Release note (sql change): FILTER expressions are now supported by the
cost-based optimizer.

@justinj justinj requested review from rytaft, RaduBerinde, andy-kimball and a team January 17, 2019 03:19
@justinj justinj requested a review from a team as a code owner January 17, 2019 03:19
@justinj justinj requested a review from a team January 17, 2019 03:19
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt/exec/execbuilder/testdata/aggregate, line 602 at r1 (raw file):

----
group           ·            ·
 │              aggregate 0  count_rows() FILTER (WHERE k > 5)

How do we get count_rows here if we're building it as count(true)? I was expecting some special code in execbuilder that converts it back but can't see any.

Copy link
Contributor Author

@justinj justinj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt/exec/execbuilder/testdata/aggregate, line 602 at r1 (raw file):

Previously, RaduBerinde wrote…

How do we get count_rows here if we're building it as count(true)? I was expecting some special code in execbuilder that converts it back but can't see any.

Ah, I forgot to regenerate these tests, fixed. Do you think that conversion in execbuilder is important?

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/opt/exec/execbuilder/testdata/aggregate, line 602 at r1 (raw file):

Previously, justinj (Justin Jaffray) wrote…

Ah, I forgot to regenerate these tests, fixed. Do you think that conversion in execbuilder is important?

Nah, this is not a very common case. Plus the main benefit would be to not render the true column and that would be hard to do in the execbuilder.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 9 of 10 files at r1, 2 of 2 files at r2.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained


pkg/sql/opt/exec/execbuilder/testdata/aggregate, line 587 at r2 (raw file):

·               filter       v > 10          ·             ·

# TODO(radu): FILTER not yet supported.

[nit] remove this TODO


pkg/sql/opt/optbuilder/scope.go, line 956 at r2 (raw file):

		}

		s.builder.semaCtx.Properties.Restore(oldProps)

should we be concerned about not restoring props in case of the error above?


pkg/sql/opt/optbuilder/scope.go, line 999 at r2 (raw file):

			if f.Filter != nil {
				// If we have a COUNT(*) with a FILTER, we need to synthesize an input for the aggregation
				// to be over, because otherwise we have no input to hang the AggFilter off of.

[nit] comments should be <= 80 columns


pkg/sql/opt/optbuilder/testdata/aggregate, line 2643 at r2 (raw file):

      └── sum [type=decimal]
           └── agg-filter [type=decimal]
                ├── variable: d [type=decimal]

It's a bit hard to distinguish the input from the filter... maybe it's worth adding labels for input/filter?

Copy link
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained


pkg/sql/opt/exec/execbuilder/relational_builder.go, line 642 at r2 (raw file):

			child := item.Agg.Child(0)

			if aggFilter, ok := child.(*memo.AggFilterExpr); ok {

Since it appears we're assuming there's just one AggFilterExpr, and it sits above any AggDistinctExpr, can you add some checking to the Memo.checkExpr function? We already have some checks there for the distinct case.

Copy link
Contributor Author

@justinj justinj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @andy-kimball and @rytaft)


pkg/sql/opt/exec/execbuilder/relational_builder.go, line 642 at r2 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

Since it appears we're assuming there's just one AggFilterExpr, and it sits above any AggDistinctExpr, can you add some checking to the Memo.checkExpr function? We already have some checks there for the distinct case.

Done.


pkg/sql/opt/exec/execbuilder/testdata/aggregate, line 587 at r2 (raw file):

Previously, rytaft wrote…

[nit] remove this TODO

Done.


pkg/sql/opt/optbuilder/scope.go, line 956 at r2 (raw file):

Previously, rytaft wrote…

should we be concerned about not restoring props in case of the error above?

I wouldn't think so, since we bail if that happens, and throw everything away, right?


pkg/sql/opt/optbuilder/scope.go, line 999 at r2 (raw file):

Previously, rytaft wrote…

[nit] comments should be <= 80 columns

Done.


pkg/sql/opt/optbuilder/testdata/aggregate, line 2643 at r2 (raw file):

Previously, rytaft wrote…

It's a bit hard to distinguish the input from the filter... maybe it's worth adding labels for input/filter?

Good idea—done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball)


pkg/sql/opt/optbuilder/scope.go, line 956 at r2 (raw file):

Previously, justinj (Justin Jaffray) wrote…

I wouldn't think so, since we bail if that happens, and throw everything away, right?

Yea, I think you're right

@RaduBerinde
Copy link
Member


pkg/sql/opt/optbuilder/scope.go, line 956 at r2 (raw file):

Previously, rytaft wrote…

Yea, I think you're right

The semaCtx is passed externally, and it's probably being reused (https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/plan_opt.go#L232). I think we should restore.

This commit introduces support for FILTERing individual aggregates in a
group by.

One unintuitive change is that if we have a COUNT(*) with a FILTER, we
need to synthesize an input for the aggregation to be over, because
otherwise we have no input to hang the AggFilter off of.

Thus, we convert
  COUNT(*) FILTER (WHERE foo)
to
  COUNT(true) FILTER (WHERE foo).

This isn't a problem for DISTINCT because COUNT(DISTINCT *) is not valid.

Release note (sql change): FILTER expressions are now supported by the
cost-based optimizer.
@justinj justinj requested a review from a team January 22, 2019 15:51
@justinj
Copy link
Contributor Author

justinj commented Jan 22, 2019

Ok, fixed the restoring. TFTRs!

bors r+

craig bot pushed a commit that referenced this pull request Jan 22, 2019
34077: opt: add support for aggregate FILTER r=justinj a=justinj

This commit introduces support for FILTERing individual aggregates in a
group by.

One unintuitive change is that if we have a COUNT(*) with a FILTER, we
need to synthesize an input for the aggregation to be over, because
otherwise we have no input to hang the AggFilter off of.

Thus, we convert
  COUNT(*) FILTER (WHERE foo)
to
  COUNT(true) FILTER (WHERE foo).

This isn't a problem for DISTINCT because COUNT(DISTINCT *) is not valid.

Release note (sql change): FILTER expressions are now supported by the
cost-based optimizer.

Co-authored-by: Justin Jaffray <[email protected]>
@craig
Copy link
Contributor

craig bot commented Jan 22, 2019

Build succeeded

@craig craig bot merged commit ec3a451 into cockroachdb:master Jan 22, 2019
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 7, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page to add:

  - Add FILTER per cockroachdb/cockroach#34077

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 7, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add FILTER per cockroachdb/cockroach#34077

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Remove `experimental_optimizer_updates` cluster setting (can't find
    a commit for this, but I don't see it in `SHOW ALL` output on my
    local build of yesterday's `master`, version number is
    `v2.2.0-alpha.20181217-1096-gd104dcee69-dirty`.

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 8, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Add FILTER clause on aggregate functions per
    cockroachdb/cockroach#34077

  - Remove `experimental_optimizer_updates` cluster setting (can't find
    a commit for this, but I don't see it in `SHOW ALL` output on my
    local build of yesterday's `master`, version number is
    `v2.2.0-alpha.20181217-1096-gd104dcee69-dirty`.

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 11, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Add `SELECT`, `VALUES`, and `UNION` statements that do not include
    window functions

  - Add FILTER clause on aggregate functions per
    cockroachdb/cockroach#34077

  - Remove `experimental_optimizer_updates` cluster setting

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 11, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Add `SELECT`, `VALUES`, and `UNION` statements that do not include
    window functions

  - Add FILTER clause on aggregate functions per
    cockroachdb/cockroach#34077

  - Remove `experimental_optimizer_updates` cluster setting

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants