-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Add ST_WktToSQL function #35416
Conversation
Adds support for ST_WktToSQL function which accepts a string and parses it as WKT representation of a geoshape. Relates to elastic#29872
Pinging @elastic/es-search-aggs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments, but keep in mind I'm unfamiliar with the geo area...
Moreover, it would be nice to add a unit tests for these two cases:
https://github.com/elastic/elasticsearch/pull/35416/files#diff-758d3813c9c39f1bfd869ba1dcdfe045R39
https://github.com/elastic/elasticsearch/pull/35416/files#diff-758d3813c9c39f1bfd869ba1dcdfe045R44
For completeness I would add a test to QueryTranslatorTests
to check the translsation to painless script.
|
||
selectAsWKT | ||
// tag::wkttosql | ||
SELECT ST_AsWKT(ST_WKTToSQL('POINT (10 20)')) location; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you have to wrap it with ST_AsWKT
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this test we compare the result with csvjdbc result. The csvjdbc driver cannot deal with geoshapes. So, I have to convert the column into some type that csvjdbc understands.
|
||
city:s | location:s | ||
Amsterdam |point (4.850311987102032 52.347556999884546) | ||
// end::aswkt | ||
; | ||
|
||
selectAsWKT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix test name (it's same as above)
@@ -403,4 +405,8 @@ public static String aswktPoint(Object v) { | |||
public static String aswktShape(Object v) { | |||
return GeoProcessor.GeoOperation.ASWKT_SHAPE.apply(v).toString(); | |||
} | |||
|
|||
public static GeoShape wktToSql(String v) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: you could rename the param to wkt
public Object process(Object input) { | ||
return StWkttosqlProcessor.apply(input); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: extra empty line
@@ -102,3 +102,32 @@ SELECT COUNT(city) count, CAST(SUBSTRING(ST_ASWKT(location), 8, 1) = '-' AS STRI | |||
9 |false | |||
6 |true | |||
; | |||
|
|||
selectFakeCoordinatesFromCityAndRegion | |||
SELECT region, city, ST_ASWKT(ST_WKTTOSQL(CONCAT(CONCAT(CONCAT(CONCAT('POINT (', CAST(LENGTH(city) AS STRING)), ' '), CAST(LENGTH(region) AS STRING)), ')'))) loc FROM geo ORDER BY region, city; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why you need this complexity, but I'm also unfamiliar with area.
Isn't it better to add real coordinate date to the table (index) ?
Also again you wrap it with a call to ST_ASWKT
, why is that? Shouldn't GeoShape implement a toString()
method so that it returns a human readable string as requited?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matriv thanks for the review! I think I found a way to address your comments. Could you take another look? |
retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/** | ||
* Constructs geometric objects from their WTK representations | ||
*/ | ||
public class StWkttosql extends UnaryScalarFunction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth having a base Geo function that handles the validation and the return type so that subtypes only execute the actual action. If the pattern emerges, just passing the processor might be good enough (similar to the BinaryNumericFunction
).
Just mentioning this for the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I did that for ST_AsWKT
because I saw several functions that take Geometry as the only parameter. However, in this case ST_WKTToSQL
is somewhat unique, so I have decided to wait until pattern emerges. I might need to refactor it, but I would like to add a few more functions first.
|
||
#### Classes | ||
|
||
class org.elasticsearch.xpack.sql.expression.function.scalar.geo.GeoShape { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pass the native ES geo type (ShapeBuilder
) directly or is there anything else SQL specific in there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ShapeBuilder is using GeoJSON for toXContent() and pretty useless debugging output in toString(), which complicates the serialization story sinice we need to use WKT in both. At the moment this is the whole purpose of this wrapper. Basically I am trying to avoid adding a whole bunch of
if (obj instanceof ShapeBuilder) {
... use different serialization for ShapeBuilder ...
} else {
... normal serialization for pretty much everything else ...
}
I might also need to fold geo_point into this (not sure about this yet).
|
||
selectWKTToSQL | ||
// tag::wkttosql | ||
SELECT CAST(ST_WKTToSQL('POINT (10 20)') AS STRING) location; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might be able to get away from this. For intervals, I've basically changed the way we do assertion where the actual type is checked and if it's of a custom type, a custom serialization takes place to check the strings.
In other words there would be no casting, CSV would read the result as a String but when the check is done, the geo type is detected, the result casted as the string and the two strings asserted.
No need to hold the PR for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I will definitely update this test when it lands in master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
retest this please |
Bumping the version label to |
Adds support for ST_WktToSQL function which accepts a string and parses
it as WKT representation of a geoshape.
Relates to #29872