Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Geo Context Parsing More Strict #32412

Closed

Conversation

imotov
Copy link
Contributor

@imotov imotov commented Jul 26, 2018

Currently, if geo context is represented by something other than
geo_point or an object with lat and lon fields, the parsing of it
as a geo context can result in ignoring the context altogether,
returning confusing errors such as number_format_exception or trying
to parse the number specifying as long-encoded hash code. It would also
fail if the geo_point was stored.

This commit makes the parsing more strict and will fail the indexing
if the context is not specified correctly. It also adds support for
stored geo_points and removes support for long geo_hashes in numeric
and string types since it doesn't seem to be intentional.

In case of objects the new change will accept objects that contain
lat and lon fields:

{
  "loc": {
    "lat": 12,
    "lon": 34
  }
}

It will reject objects that contain any other fields without lat and
lon:

{
  "loc": {
    "foo": 12,
    "bar": 34
  }
}

And it will ignore empty objects if other contexts for this suggestion are
provided:

{
  "loc": {

  }
}

Closes #32202

imotov added 2 commits July 26, 2018 14:09
Currently, if geo context is represented by something other than
geo_point or an object with lat and lon fields, the parsing of it
as a geo context can result in ignoring the context altogether,
returning confusing errors such as number_format_exception or trying
to parse the number specifying as long-encoded hash code. It would also
fail if the geo_point was stored.

This commit makes the parsing more strict and will fail the indexing
if the context is not specified correctly. It also adds support for
stored geo_points.

Closes elastic#32202
@imotov imotov added >bug >breaking :Analytics/Geo Indexing, search aggregations of geo points and shapes :Search Relevance/Suggesters "Did you mean" and suggestions as you type v7.0.0 labels Jul 26, 2018
@imotov imotov requested a review from nknize July 26, 2018 18:48
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@imotov imotov requested a review from jimczi July 31, 2018 15:50
Copy link
Contributor

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Geo Context Suggester hasn't had much work on it in a while. Thanks for doing this @imotov

} else {
// Does this object exist? If it does and we didn't find what we need - we need to warn user
for (IndexableField field : document.getFields()) {
if (field.name().startsWith(fieldName + ".")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the "dots in field names" discussion has been a long running one. Do we not yet have a more general/graceful way of throwing these exceptions? This is more of a question out of my own curiosity and not intended to hold up the PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we should do that, can we restrict the geo context to geo fields and validate this assumption from the mapping (context.mapperService) directly ? If the field has the correct type in the mapping we don't have to check why it's missing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can definitely remove that and clean this up even more, but this will be much more breaking change, since today you can have this sudo geo object and it works, if we will remove this in 7, then working indices with this, created in 6 are going to break. I am not sure we can do that. We will probably have to disable it for indices in 7 but I don't think we can fully remove it until 8.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd treat this case like the geohash string version, this is not documented nor tested so we should remove it. The documentation is clear, you can use a path that references a geo_point field or set the point directly in the context.

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay @imotov . I left some comments.

@@ -200,18 +201,29 @@ protected XContentBuilder toInnerXContent(XContentBuilder builder, Params params
geohashes.add(stringEncode(spare.getLon(), spare.getLat(), precision));
}
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code above seems useless in 7 (even in 6.x ?) since we have a LatLonPoint and a LatLonDocValuesField ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I understood the code above is if you want to represent lat and lon as an object, but don't want to index it as an actual geopoint you could do that.

} else {
// Does this object exist? If it does and we didn't find what we need - we need to warn user
for (IndexableField field : document.getFields()) {
if (field.name().startsWith(fieldName + ".")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we should do that, can we restrict the geo context to geo fields and validate this assumption from the mapping (context.mapperService) directly ? If the field has the correct type in the mapping we don't have to check why it's missing.

@imotov
Copy link
Contributor Author

imotov commented Aug 3, 2018

@jimczi I pushed some changes. Not really sure if I understood your idea correctly. But it might be easier to collaborate on the code. Could you take another look?

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed with @imotov, this is a breaking change so we should apply the strict parsing only on new indices and log a deprecation warning for indices created in 6x.

@imotov
Copy link
Contributor Author

imotov commented Aug 9, 2018

Discussed this a bit more with @nik9000. It looks like there might be a way to move this check into mapping phase instead of indexing phase. I will give it a try.

@imotov
Copy link
Contributor Author

imotov commented Aug 10, 2018

I think I figured it out, but since it's complete rewrite with a completely new approach and no shared history I am going to open another PR. Closing this one for now.

@imotov imotov closed this Aug 10, 2018
imotov added a commit that referenced this pull request Aug 17, 2018
Currently, if geo context is represented by something other than
geo_point or an object with lat and lon fields, the parsing of it
as a geo context can result in ignoring the context altogether,
returning confusing errors such as number_format_exception or trying
to parse the number specifying as long-encoded hash code. It would also
fail if the geo_point was stored.

This commit makes the mapping parsing more strict and will fail during
mapping update or index creation if the geo context doesn't point to
a geo_point field.

Supersedes #32412

Closes #32202
@jpountz jpountz removed the :Analytics/Geo Indexing, search aggregations of geo points and shapes label Jan 29, 2019
@imotov imotov deleted the issue-32202-incorrect-geo-context-path branch May 1, 2020 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking >bug :Search Relevance/Suggesters "Did you mean" and suggestions as you type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants