Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec: support comparisons between all numeric types #39778

Merged
merged 2 commits into from
Aug 26, 2019

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Aug 20, 2019

The vectorized engine now supports comparisons between floats, ints, and
decimals.

touches #39189

Release note: None

@rafiss rafiss requested review from solongordon and a team August 20, 2019 19:41
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rafiss rafiss changed the title exec: support type comparisons between all numeric types exec: support comparisons between all numeric types Aug 20, 2019
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work! I think we should address the TODO about reusing the same tmpDec to avoid an allocation on each comparison before this merges, but if you push back, I'll be ok with that as well.

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @solongordon)


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 271 at r1 (raw file):

// registerTypeCustomizer registers a particular type customizer to a type, for
// usage by templates.
func registerTypeCustomizer(pair coltypePair, customizer typeCustomizer) {

[nit]: the comment needs an adjustment.


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 321 at r1 (raw file):

// decimalFloatCustomizer supports mixed type expressions with a float left-hand
// side and a decimal right-hand side.
type floatDecimalCustomizer struct{}

[nit]: s/decimalFloat/floatDecimal/ (in the comment).


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 325 at r1 (raw file):

// decimalIntCustomizer supports mixed type expressions with an int left-hand
// side and a decimal right-hand side.
type intDecimalCustomizer struct{}

ditto (in the comment).


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 602 at r1 (raw file):

		buf := strings.Builder{}
		// todo(rafi): is there a way to avoid an allocating on each comparison?
		t := template.Must(template.New("").Parse(`

Maybe we could use a global variable here that would live in the template that is using the comparison functions? This doesn't sound good though.

Another thing is that maybe we would be ok with defining a global scratch struct that is shared among all templates and is populated into every generated file?


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 710 at r1 (raw file):

	// Use an intCustomizer of appropriate width when widths are the same.
	registerTypeCustomizer(coltypePair{coltypes.Int8, coltypes.Int8}, intCustomizer{width: 8})

[question]: a comparison between ints with different widths is unsupported right now, right?

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @solongordon and @yuzefovich)


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 271 at r1 (raw file):

Previously, yuzefovich wrote…

[nit]: the comment needs an adjustment.

done


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 321 at r1 (raw file):

Previously, yuzefovich wrote…

[nit]: s/decimalFloat/floatDecimal/ (in the comment).

thanks, nice catch


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 325 at r1 (raw file):

Previously, yuzefovich wrote…

ditto (in the comment).

done


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 602 at r1 (raw file):

Previously, yuzefovich wrote…

Maybe we could use a global variable here that would live in the template that is using the comparison functions? This doesn't sound good though.

Another thing is that maybe we would be ok with defining a global scratch struct that is shared among all templates and is populated into every generated file?

i was wondering about the global variable approach. i'm not sure if there are any concurrency concerns. i'll think of how we could use a templated scratch struct. even one allocation per batch would be way better than how it is in this revision.


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 710 at r1 (raw file):

Previously, yuzefovich wrote…

[question]: a comparison between ints with different widths is unsupported right now, right?

it is supported -- the loop right above this one handles that. for int comparisons, everything is just casted to int64. i included the same-width customizers for the binary operator overloads and doing overflow checks appropriately.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. Thanks for doing this. I'd like to see some basic regression tests as well (maybe that's just enabling more logic test configs?).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss, @solongordon, and @yuzefovich)


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 602 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

i was wondering about the global variable approach. i'm not sure if there are any concurrency concerns. i'll think of how we could use a templated scratch struct. even one allocation per batch would be way better than how it is in this revision.

There would definitely be concurrency concerns with that idea. Ideally, we could figure out a way to do as you say and make an allocation per batch. If that was too slow (which I doubt) we could pool the temp decimals into a sync.Pool and acquire and release one once per batch. However, I don't see an easy way to make this work - it's challenging to make assumptions about the outer scope of templates in our hacky language. I like the idea about a global scratch struct, I think.

I wouldn't object to seeing the first version of this go in with the poor allocation behavior, though. I think for the rest of this release we'd be wise to focus on breadth rather than performance - as I saw with the hash joiner, even adding a pointless allocation on every single call to the hash function on both probe and build side left the hash joiner in a state that was 10x faster than pre-vectorized.


pkg/sql/exec/execgen/cmd/execgen/selection_ops_gen.go, line 31 at r1 (raw file):

	"github.com/cockroachdb/cockroach/pkg/col/coldata"
	"github.com/cockroachdb/cockroach/pkg/col/coltypes"
	"github.com/cockroachdb/cockroach/pkg/sql/exec/execerror"

nit: this seems somewhat out of place in this commit? probably doesn't matter though or else the compiler would have complained.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rafiss, @solongordon, and @yuzefovich)


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 602 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

There would definitely be concurrency concerns with that idea. Ideally, we could figure out a way to do as you say and make an allocation per batch. If that was too slow (which I doubt) we could pool the temp decimals into a sync.Pool and acquire and release one once per batch. However, I don't see an easy way to make this work - it's challenging to make assumptions about the outer scope of templates in our hacky language. I like the idea about a global scratch struct, I think.

I wouldn't object to seeing the first version of this go in with the poor allocation behavior, though. I think for the rest of this release we'd be wise to focus on breadth rather than performance - as I saw with the hash joiner, even adding a pointless allocation on every single call to the hash function on both probe and build side left the hash joiner in a state that was 10x faster than pre-vectorized.

Yeah, this makes sense.


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 710 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

it is supported -- the loop right above this one handles that. for int comparisons, everything is just casted to int64. i included the same-width customizers for the binary operator overloads and doing overflow checks appropriately.

Oh yeah, thanks, I missed that.


pkg/sql/exec/execgen/cmd/execgen/selection_ops_gen.go, line 31 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

nit: this seems somewhat out of place in this commit? probably doesn't matter though or else the compiler would have complained.

I think this is needed because we're calling execerror.NonVectorizedPanic in the template for some of the customizers.

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, i meant to note in the PR comments that i was still working on testing it all.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rafiss, @solongordon, and @yuzefovich)


pkg/sql/exec/execgen/cmd/execgen/selection_ops_gen.go, line 31 at r1 (raw file):

Previously, yuzefovich wrote…

I think this is needed because we're calling execerror.NonVectorizedPanic in the template for some of the customizers.

yeah, the import is required for the reason Yahor said.

@rafiss rafiss force-pushed the more-mixed-type-comparisons branch 2 times, most recently from e89847b to ec1a3e6 Compare August 23, 2019 22:04
@rafiss rafiss requested a review from a team August 23, 2019 22:11
@rafiss
Copy link
Collaborator Author

rafiss commented Aug 23, 2019

This is ready for another review.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

I think we could probably manage to accomplish more and better testing with some kind of randomized strategy akin to what @mjibson was mentioning in Slack today, but that can wait until later I think.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @rafiss, @solongordon, and @yuzefovich)

The vectorized engine now supports comparisons between floats, ints, and
decimals.

Release note: None
@rafiss rafiss force-pushed the more-mixed-type-comparisons branch from ec1a3e6 to 6da62ea Compare August 26, 2019 15:16
Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @rafiss, @solongordon, and @yuzefovich)


pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 602 at r1 (raw file):

Previously, yuzefovich wrote…

Yeah, this makes sense.

is assuming things about the outer scope of templates much different than what we currently do, where templates assume that the correct packages are imported by the outer scope.

it doesn't seem so to me, so maybe the approach i'll take later on is to just assume that the global "decimal pool" struct/package is in scope in this templated code.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2, 3 of 3 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @rafiss, @solongordon, and @yuzefovich)


pkg/sql/exec/overloads_test.go, line 155 at r3 (raw file):

}

func TestFloatComparison(t *testing.T) {

[nit]: it seems like TestIntComparison is missing (or at least a comment is about why the test is missing).


pkg/sql/exec/execgen/cmd/execgen/overloads_test_utils_gen.go, line 23 at r3 (raw file):

import (
	"bytes"
  "math"

[super nit]: seems like math package is misaligned.

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @rafiss, @solongordon, and @yuzefovich)


pkg/sql/exec/overloads_test.go, line 155 at r3 (raw file):

Previously, yuzefovich wrote…

[nit]: it seems like TestIntComparison is missing (or at least a comment is about why the test is missing).

initially i had intentionally omitted it since int comparison is tested by the other tests. but now that i think about it, we have different code for int left-hand-side versus right, so i should have that test as well.


pkg/sql/exec/execgen/cmd/execgen/overloads_test_utils_gen.go, line 23 at r3 (raw file):

Previously, yuzefovich wrote…

[super nit]: seems like math package is misaligned.

done, it was a tab vs space issue

@rafiss rafiss force-pushed the more-mixed-type-comparisons branch from 6da62ea to 83c4891 Compare August 26, 2019 16:14
@rafiss
Copy link
Collaborator Author

rafiss commented Aug 26, 2019

thanks for all the reviews!

bors r+

craig bot pushed a commit that referenced this pull request Aug 26, 2019
39778: exec: support comparisons between all numeric types r=rafiss a=rafiss

The vectorized engine now supports comparisons between floats, ints, and
decimals.

touches #39189

Release note: None

40197: pgwire: improve multiple portal error msgs r=jordanlewis a=jordanlewis

And relink to new issue (#40195).

Release note: None

40203: builtins: fix bug in aclexplode; add tests r=jordanlewis a=jordanlewis

Closes #39794.

Release note: None

Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Jordan Lewis <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 26, 2019

Build succeeded

@craig craig bot merged commit 83c4891 into cockroachdb:master Aug 26, 2019
@rafiss rafiss deleted the more-mixed-type-comparisons branch August 29, 2019 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants