Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: generate spans for IN filters #26708

Merged
merged 1 commit into from
Jun 14, 2018
Merged

Conversation

justinj
Copy link
Contributor

@justinj justinj commented Jun 13, 2018

This code is largely based off of makeSpansForTupleIn, though I wasn't
able to reuse much of it since the context is slightly different.

Also fix up a CREATE STATISTICS statement and modify a TPCC query.

Release note: None

@justinj justinj requested a review from a team as a code owner June 13, 2018 21:16
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@justinj
Copy link
Contributor Author

justinj commented Jun 13, 2018

@RaduBerinde I can rework this if you'd prefer we did non-tight single-col spans, I had the impression from the conversation that we settled on this approach but if you think single-cols would work better then I'm happy to change it.

@RaduBerinde
Copy link
Member

The single composite span is ok for now. We can always go back and add single-col spans if we need that.


Review status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt/memo/constraint_builder.go, line 212 at r1 (raw file):

			elem := val.Child(colIdxsInLHS[j])
			if !elem.IsConstValue() {
				return unconstrained, false

We may still be able to generate a constraint on other columns (e.g. (a, b) IN ((1, x)) => a=1. We should check this in the code where we decide which cols participate.


pkg/sql/opt/memo/constraint_builder.go, line 225 at r1 (raw file):

		// constraints.
		if hasNull {
			continue

I think it's no longer tight in this case.
(x, y) IN ((1, 2), (NULL, 4)) is not the same as (x, y) IN ((1,2)), the latter is NULL on (3,4)

At least not the way we define tight above. Maybe it's enough to say it's tight if it's true IFF the constraints are true. Maybe make it not tight here but add a TODO to revisit the definition of "tight".


Comments from Reviewable

@justinj
Copy link
Contributor Author

justinj commented Jun 13, 2018

Review status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt/memo/constraint_builder.go, line 212 at r1 (raw file):

Previously, RaduBerinde wrote…

We may still be able to generate a constraint on other columns (e.g. (a, b) IN ((1, x)) => a=1. We should check this in the code where we decide which cols participate.

Done.


pkg/sql/opt/memo/constraint_builder.go, line 225 at r1 (raw file):

Previously, RaduBerinde wrote…

I think it's no longer tight in this case.
(x, y) IN ((1, 2), (NULL, 4)) is not the same as (x, y) IN ((1,2)), the latter is NULL on (3,4)

At least not the way we define tight above. Maybe it's enough to say it's tight if it's true IFF the constraints are true. Maybe make it not tight here but add a TODO to revisit the definition of "tight".

Ah, good call, I hadn't considered that distinction. Done.


Comments from Reviewable

@RaduBerinde
Copy link
Member

:lgtm:


Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/memo/constraint_builder.go, line 217 at r2 (raw file):

			// If one of the LHS entries is not a bare column then our constraints
			// are no longer tight.
			tight = false

[nit] it's annoying to have to set this in two places, maybe just do tight = (len(constrainedCols) == lhs.ChildCount()) after the loop.


pkg/sql/opt/memo/constraint_builder.go, line 247 at r2 (raw file):

			//   (x, y) IN ((1, 2), (NULL, 4))
			// is not the same as
			//   (x, y) IN ((1,2)),

[nit] 1, 2 so it matches the one above


pkg/sql/opt/memo/constraint_builder.go, line 255 at r2 (raw file):

		key := constraint.MakeCompositeKey(vals...)
		sp.Init(key, constraint.IncludeBoundary, key, constraint.IncludeBoundary)
		c = c.Union(cb.evalCtx, constraint.SingleSpanConstraint(&keyCtx, &sp))

You could use Spans (like the single column IN does) and use Spans.SortAndMerge to avoid this being quadratic. Can just add a TODO though.


Comments from Reviewable

@justinj
Copy link
Contributor Author

justinj commented Jun 14, 2018

Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/memo/constraint_builder.go, line 217 at r2 (raw file):

Previously, RaduBerinde wrote…

[nit] it's annoying to have to set this in two places, maybe just do tight = (len(constrainedCols) == lhs.ChildCount()) after the loop.

Done.


pkg/sql/opt/memo/constraint_builder.go, line 247 at r2 (raw file):

Previously, RaduBerinde wrote…

[nit] 1, 2 so it matches the one above

Done.


pkg/sql/opt/memo/constraint_builder.go, line 255 at r2 (raw file):

Previously, RaduBerinde wrote…

You could use Spans (like the single column IN does) and use Spans.SortAndMerge to avoid this being quadratic. Can just add a TODO though.

Pretty straightforward change! Done.


Comments from Reviewable

@RaduBerinde
Copy link
Member

:lgtm:


Reviewed 3 of 5 files at r1, 1 of 2 files at r2, 1 of 1 files at r3.
Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


Comments from Reviewable

@justinj
Copy link
Contributor Author

justinj commented Jun 14, 2018

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Jun 14, 2018

Build failed

@andy-kimball
Copy link
Contributor

:lgtm:


Review status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/exec/execbuilder/testdata/select_index, line 470 at r3 (raw file):

·          spans     /1/4-/1/5  ·       ·

# TODO(radu): in this case, we're not preferring abcd, unlike 2.0

NIT: Remove comment now that this is fixed.


pkg/sql/opt/memo/constraint_builder.go, line 193 at r3 (raw file):

	constrainedCols := make([]opt.OrderingColumn, 0, lhs.ChildCount())
	colIdxsInLHS := make([]int, 0, lhs.ChildCount())
	for i, n := 0, lhs.ChildCount(); i < n; i++ {

What's the purpose of setting n rather than saying i < lhs.ChildCount()? My understanding is that Go will only evaluate lhs.ChildCount() once at the beginning of the loop, so I think just makes the code harder to understand without improving perf.


pkg/sql/opt/memo/testdata/logprops/constraints, line 760 at r3 (raw file):

                └── tuple [type=tuple{int, int}]
                     ├── const: 5 [type=int]
                     └── const: 100 [type=int]

What about test where RHS contains something that's not a Tuple?

Also, to make sure testing is complete, one thing I often do is deliberately introduce subtle bugs into different parts of the code I'm testing, and make sure that one of my tests fails.


pkg/sql/opt/xform/testdata/external/tpcc, line 604 at r3 (raw file):

 ├── prune: (1,3,8,14-17)
 └── project
      ├── columns: stock.s_i_id:1(int!null) stock.s_quantity:3(int) stock.s_dist_05:8(string) stock.s_ytd:14(int) stock.s_order_cnt:15(int) stock.s_remote_cnt:16(int) stock.s_data:17(string)

Nice.


Comments from Reviewable

This code is largely based off of makeSpansForTupleIn, though I wasn't
able to reuse much of it since the context is slightly different.

Also fix up a CREATE STATISTICS statement and modify a TPCC query.

Release note: None
@justinj
Copy link
Contributor Author

justinj commented Jun 14, 2018

Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale)


pkg/sql/opt/memo/constraint_builder.go, line 193 at r3 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

What's the purpose of setting n rather than saying i < lhs.ChildCount()? My understanding is that Go will only evaluate lhs.ChildCount() once at the beginning of the loop, so I think just makes the code harder to understand without improving perf.

We talked on slack - looking at the assembly it does look like it will re-evaluate ChildCount each time.


pkg/sql/opt/memo/testdata/logprops/constraints, line 760 at r3 (raw file):
I'm unclear on whether or not any such situations will get through typechecking, I actually tried to find one but couldn't, although postgres allows this (but we don't):

> SELECT (3, 50) IN ((1, 2), (select (3, 50)), (5, 100));

so I'll add a check for safety just in case this changes.


Comments from Reviewable

@justinj
Copy link
Contributor Author

justinj commented Jun 14, 2018

bors r+

craig bot pushed a commit that referenced this pull request Jun 14, 2018
26708: opt: generate spans for IN filters r=justinj a=justinj

This code is largely based off of makeSpansForTupleIn, though I wasn't
able to reuse much of it since the context is slightly different.

Also fix up a CREATE STATISTICS statement and modify a TPCC query.

Release note: None

Co-authored-by: Justin Jaffray <[email protected]>
@craig
Copy link
Contributor

craig bot commented Jun 14, 2018

Build succeeded

@craig craig bot merged commit 55ac430 into cockroachdb:master Jun 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants