Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: add rule to replace ST_Distance with ST_DWithin #53066

Merged
merged 2 commits into from
Aug 20, 2020

Conversation

DrewKimball
Copy link
Collaborator

ST_DWithin is equivalent to the expression ST_Distance <= x.
Similar reformulations can be made for different comparison operators
(<, >, >=).

This PR consists of two commits. The first commit adds an internal builtin
function ST_DWithinExclusive, which is equivalent to ST_Distance < x
but is able to exit early.

The second commit adds two rules that replace expressions of the form
ST_Distance <= x with either ST_DWithin or ST_DWithinExclusive
depending on the comparison operator used. This replacement is desirable
because the ST_DWithin function can exit early, while ST_Distance cannot
do the same. ST_DWithin can also be used to generate an inverted index scan.

Fixes #52028

Release note: None

@DrewKimball DrewKimball requested a review from a team as a code owner August 19, 2020 18:44
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rytaft rytaft requested review from otan and a team August 19, 2020 20:37
Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, optimiser bits look ok to me but I think @rytaft can take a closer look at it - one stylistic comment.
Also, if you felt up for it, we should probably do the same thing for st_maxdistance and st_dfullywithin.

Reviewed 7 of 7 files at r1, 3 of 3 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/geo/geogfn/distance.go, line 204 at r1 (raw file):

	boundingBoxIntersects bool,
	stopAfter float64,
	exclusive bool,

these bool flags are a little unwildy. what do you think of adding some sort of bool

type DWithinExclusivity bool

const (
   DWithinExcluse DWithinExclusivity = true
   DWithinInclude DWithinExclusivity = false
)

and passing that type around instead?


pkg/geo/geogfn/dwithin_test.go, line 66 at r1 (raw file):

								dwithin, err := DWithin(a, b, val, subTC.useSphereOrSpheroid, true /* exclusive */)
								require.NoError(t, err)
								require.Equal(t, dwithin, exclusiveExpected)

nit: for testify Equal, it's require.Equal(t, expected, dwithin)

@otan
Copy link
Contributor

otan commented Aug 19, 2020


pkg/geo/geogfn/distance.go, line 204 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

these bool flags are a little unwildy. what do you think of adding some sort of bool

type DWithinExclusivity bool

const (
   DWithinExcluse DWithinExclusivity = true
   DWithinInclude DWithinExclusivity = false
)

and passing that type around instead?

*DWithinExclusive / DWithinInclusive

@otan
Copy link
Contributor

otan commented Aug 19, 2020


pkg/geo/geogfn/distance.go, line 307 at r1 (raw file):

		if !u.exclusive && u.minD <= u.stopAfter {
			return true
		} else if u.exclusive && u.minD < u.stopAfter {

nit: don't need the else if you already return above.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for doing this.

In order for st_dwithinexclusive to be index accelerated, it needs to be added to RelationshipMap in geo/geoindex/geoindex.go (you can map it to the RelationshipType DWithin). When the builtins get initialized, we automatically create a second version of each of these functions prepended with an underscore (these are explicitly not index accelerated), so I think you should just remove the underscore from the function name where you've defined it.

It would be good if you could also add a couple of tests in xform/testdata/rules/select and xform/testdata/rules/join where we test the other index-accelerated functions to confirm that all the variants of the rule you created can be index accelerated.

Reviewed 7 of 7 files at r1, 3 of 3 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)

@rytaft
Copy link
Collaborator

rytaft commented Aug 19, 2020


pkg/sql/sem/builtins/geo_builtins.go, line 4458 at r2 (raw file):

			},
			Info: infoBuilder{
				info: "Returns true if any of geometry_a is within distance units of geometry_b.",

Might want to update the info for each of these functions depending on the value of exclusive

@DrewKimball
Copy link
Collaborator Author

Is there anything I need to do additionally if I remove the underscore? If I do, users will be able to use st_dwithinexclusive. Though, I guess _st_dwithinexclusive could be directly used as well, anyway.

@rytaft
Copy link
Collaborator

rytaft commented Aug 19, 2020

Yea, I don't think it's a problem if users use it (especially if you update the info to document the difference between st_dwithin and st_dwithinexclusive)

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I have to do anything beyond adding st_dwithinexclusive to the mapping? Are there any situations in which an inverted index scan would be generated without leaving the original filter on the select?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan and @rytaft)


pkg/geo/geogfn/distance.go, line 204 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

*DWithinExclusive / DWithinInclusive

Yeah, I like that. I've put it in geo/geo.go; let me know if you want it somewhere else. I'm going to use Fn instead of DWithin since it isn't used only in the context of DWithin (so, it will look like geo.FnExclusivity).


pkg/geo/geogfn/distance.go, line 307 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

nit: don't need the else if you already return above.

Done.


pkg/geo/geogfn/dwithin_test.go, line 66 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

nit: for testify Equal, it's require.Equal(t, expected, dwithin)

Done.


pkg/sql/sem/builtins/geo_builtins.go, line 4458 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Might want to update the info for each of these functions depending on the value of exclusive

Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I have to do anything beyond adding st_dwithinexclusive to the mapping?

Nope, that's it!

Are there any situations in which an inverted index scan would be generated without leaving the original filter on the select?

Nope, all geospatial index operations produce false positives, so we always need the original filter.

Reviewed 13 of 13 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @otan)


pkg/geo/geogfn/dwithin_test.go, line 59 at r3 (raw file):

								require.True(t, dwithin)
							})
							t.Run(fmt.Sprintf("FnExclusive:%f", val), func(t *testing.T) {

did you mean to change this one?


pkg/geo/geogfn/dwithin_test.go, line 89 at r3 (raw file):

								require.True(t, dwithin)
							})
							t.Run(fmt.Sprintf("FnExclusive:%f", val), func(t *testing.T) {

ditto


pkg/geo/geogfn/dwithin_test.go, line 115 at r3 (raw file):

								require.False(t, dwithin)
							})
							t.Run(fmt.Sprintf("FnExclusive:%f", val), func(t *testing.T) {

ditto


pkg/sql/sem/builtins/geo_builtins.go, line 4440 at r3 (raw file):

	var info string
	if exclusive {
		info = "Returns true if any of %s is within distance units of geometry_b, exclusive."

Need a %s for geometry_b too, since that should be geography_b for a couple of them. Also one is meters and one is units. Might be better to just leave the original info fields as-is, but tack on an "exclusivity" string to the end....

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan and @rytaft)


pkg/geo/geogfn/dwithin_test.go, line 59 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

did you mean to change this one?

No, I did not. Thanks for catching that. Done.


pkg/geo/geogfn/dwithin_test.go, line 89 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

ditto

Done.


pkg/geo/geogfn/dwithin_test.go, line 115 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

ditto

Done.


pkg/sql/sem/builtins/geo_builtins.go, line 4440 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Need a %s for geometry_b too, since that should be geography_b for a couple of them. Also one is meters and one is units. Might be better to just leave the original info fields as-is, but tack on an "exclusivity" string to the end....

Huh, missed that second part. Good idea. Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @otan)

@DrewKimball
Copy link
Collaborator Author

TFTRs!

@DrewKimball
Copy link
Collaborator Author

bors r+

This commit adds an internal geo_builtin function called
`ST_DWithinExclusive`. It is equivalent to `ST_Distance(a,b) < x`,
but is able to exit early.

Release note: None
`ST_DWithin` is equivalent to the expression `ST_Distance <= x`.
Similar reformulations can be made for different comparison operators
(<, >, >=). This commit adds two rules that replace expressions of the
latter form with either `ST_DWithin` or `ST_DWithinExclusive`. This
replacement is desirable because the `ST_DWithin` function can exit early.
`ST_DWithin` can also be used to generate an inverted index scan.

Fixes cockroachdb#52028

Release note: None
@craig
Copy link
Contributor

craig bot commented Aug 20, 2020

Canceled.

@DrewKimball
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 20, 2020

Build succeeded:

@craig craig bot merged commit 904b7cb into cockroachdb:master Aug 20, 2020
@DrewKimball DrewKimball deleted the st_dwithin branch August 20, 2020 23:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: add normalization rule to convert ST_Distance(...) < x to ST_DWithin(...)
4 participants