Override doc_value parameter in Spatial XPack module #53286

Closed

Contributor

@nknize nknize commented Mar 9, 2020

This PR enables parsing the doc_values parameter in GeoShapeFieldMapper and ShapeFieldMapper when the Spatial XPack module is loaded. Licensing is Basic.

The PR introduces the following behavior:

  • If the Spatial XPack module is available, the doc_values parameter defaults to true for GeoShapeFieldMapper and ShapeFieldMapper.
  • If the Spatial XPack module is available, setting the doc_values parameter throws an exception for new geo_shape fields defined using the legacy PrefixTree indexing approach (same behavior as before).
  • If the Spatial XPack module is not available (e.g., OSS), the doc_values parameter defaults to false for GeoShapeFieldMapper and ShapeFieldMapper (same behavior as before).
  • If the Spatial XPack module is not available (e.g., OSS), explicitly setting the doc_values parameter throws a MapperParsingException (same behavior as before).
  • Backwards compatibility is supported.
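For readers skimming the list above, the cases can be sketched as a single decision function. This is only an illustration, not code from the PR: `DocValuesDefaults`, `resolve`, and the nested `MapperParsingException` stand-in are hypothetical names, and the sketch assumes that an explicit value is honored when the spatial module is loaded and that legacy PrefixTree fields reject any explicit value and otherwise resolve to false.

```java
// Illustrative sketch only: these names are hypothetical, not Elasticsearch classes.
final class DocValuesDefaults {

    /** Stand-in for the mapping-time exception named in the description. */
    static final class MapperParsingException extends RuntimeException {
        MapperParsingException(String message) {
            super(message);
        }
    }

    /**
     * Resolves the effective doc_values setting for a shape field at mapping time.
     *
     * @param spatialAvailable  true when the Spatial XPack module is loaded
     * @param legacyPrefixTree  true when the field uses the legacy PrefixTree indexing approach
     * @param explicitDocValues value set in the mapping, or null when the parameter is absent
     */
    static boolean resolve(boolean spatialAvailable, boolean legacyPrefixTree, Boolean explicitDocValues) {
        if (!spatialAvailable) {
            // OSS: explicitly setting the parameter is rejected; the default stays false.
            if (explicitDocValues != null) {
                throw new MapperParsingException("doc_values is not supported without the spatial module");
            }
            return false;
        }
        if (legacyPrefixTree) {
            // Assumption: any explicit value is rejected for PrefixTree fields; absent means false.
            if (explicitDocValues != null) {
                throw new MapperParsingException("doc_values is not supported for PrefixTree indexing");
            }
            return false;
        }
        // Spatial module loaded, non-legacy field: default to true.
        // Assumption: an explicit value is honored as given.
        return explicitDocValues == null || explicitDocValues;
    }

    public static void main(String[] args) {
        System.out.println(resolve(true, false, null));  // spatial module loaded, parameter absent
        System.out.println(resolve(false, false, null)); // OSS, parameter absent
    }
}
```

The point of the sketch is that every rejection happens while the mapping is parsed, so an unsupported combination fails before any document is indexed.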

relates #37206

This commit enables parsing the doc_values parameter
in GeoShapeFieldMapper and ShapeFieldMapper when
the xpack spatial module is loaded. Licensing is Basic+.

This commit introduces the following behavior:

* doc_values defaults to true for GeoShapeFieldMapper and ShapeFieldMapper when the Spatial XPack module is available
* doc_values are not supported for geo_shape PrefixTree indexing (same behavior as before)
* doc_values defaults to false for GeoShapeFieldMapper and ShapeFieldMapper when the Spatial XPack module is not available (e.g., OSS)
* when doc_values are set in OSS, a MapperParsingException is thrown (same behavior as before)
@nknize nknize added >feature :Analytics/Geo Indexing, search aggregations of geo points and shapes v8.0.0 v7.7.0 labels Mar 9, 2020
@nknize nknize requested review from rjernst, jpountz and talevy March 9, 2020 14:53
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Geo)

@@ -69,6 +69,7 @@
        public static final Explicit<Boolean> IGNORE_Z_VALUE = new Explicit<>(true, false);
    }

    protected static List<ParserHandler> PARSER_EXTENSIONS = new ArrayList<>();
Contributor

I do not think the parser-extension part of this PR is needed. I think that the lack of doc_values parsing support in the field mapper is a bug in our field-mapper infrastructure, not a feature that needs to be extended by a plugin.

I would like to see the equivalent of this PR, #47519, merged into master so that this PR can focus on the Indexer extension points. What do you think?

Contributor Author

That would work except the builder is not overridden in xpack. It's the responsibility of the parser-extension (parser.config) to throw "doc values not supported" exceptions at mapping time as opposed to throwing them at index time. So when geo_shape doc values are moved to xpack we will want OSS behavior to remain the same (throwing doc values not supported exceptions) and xpack to implement them.

Contributor

I see, so these handlers must be passed in, but can we re-use the data-handler for this instead of creating a generic parsing abstraction? Would it be better to limit scope to just changing how doc-values are parsed?

Contributor Author

Yeah I originally had it that way but changed it because the data handler logic is specific to GeoShapeFieldMapper only (since index logic for both ShapeFieldMapper and LegacyGeoShapeFieldMapper is untouched). But parsing logic needs to be handled by both GeoShapeFieldMapper and ShapeFieldMapper so it was elevated to the AbstractGeometryFieldMapper in order to share the implementation (and not duplicate in ShapeFieldMapper).

This is just the start of the "tangled mess" I was referring to in today's meeting. It gets hairier when projections (Geo3DPoint, XYPoint, LatLonPoint) are brought into the fray.

Contributor

Would it work if there were dummy data-handlers/indexers unique for ShapeFieldMapper and LegacyGeoShapeFieldMapper?

Regardless of how the extension point is implemented, I feel it is important that all of these field mappers know how to parse doc_values and declare their lack of support for it.

Do you see any reason why I should not add support for parsing doc_values to server's implementation?

Contributor Author

I think the implementation is mostly based on preference? I don't see a reason to not add support for parsing doc_values in server's implementation (and throwing a MapperParsingException in OSS deployments). I ultimately didn't go this route because I liked the existing behavior where unimplemented parameters are handled further upstream.

Contributor

Like @talevy I have a preference for not making parsing pluggable. We might need to reconsider in the future with projections, but I'd like to explore the path that requires the least amount of new abstractions for now?

Contributor

I opened #53351 to support parsing doc_values. There are two places I left TODO comments where I'd like to be sure this data-handler branch can properly hook into things.

Let me know what you think!

Contributor Author

@nknize nknize Mar 11, 2020

👍 I cherry-picked #53351 and wired up the data handler. Next, I will make the change so that the default data handler is not registered unless no plugins are installed.

if (licenseState != null && licenseState.isSpatialAllowed()) {
    // HACK: override the default data handler factory, this is trappy because other plugins could override
    GeoShapeFieldMapper.DATA_HANDLER_FACTORIES.replace(
        GeoShapeFieldMapper.Defaults.DATA_HANDLER.value(), () -> new SpatialDataHandler());
Contributor

What about this alternative approach:

  • have a default data handler in server/ but don't register it; it is only used as a fallback if no plugin registers a data handler
  • fail the node if there is more than one registered data handler
  • default doc values to true if, and only if, there is exactly one registered handler
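A minimal sketch of the three rules above, to make the proposal concrete. It is not the PR's implementation: `DataHandlerRegistry`, `DataHandler`, `resolve`, and `docValuesDefault` are hypothetical names chosen for illustration, and raising `IllegalStateException` from `resolve` stands in for whatever mechanism would actually fail the node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Illustrative sketch only: DataHandlerRegistry and DataHandler are hypothetical names.
final class DataHandlerRegistry {

    /** Minimal stand-in for a data handler; only a name is needed for the sketch. */
    interface DataHandler {
        String name();
    }

    // The server-side default is held separately and never added to the registered list.
    private final Supplier<DataHandler> fallback;
    private final List<Supplier<DataHandler>> registered = new ArrayList<>();

    DataHandlerRegistry(Supplier<DataHandler> fallback) {
        this.fallback = fallback;
    }

    /** Called by a plugin that wants to provide the data handler. */
    void register(Supplier<DataHandler> factory) {
        registered.add(factory);
    }

    /** Picks the handler to use, failing when more than one plugin registered one. */
    DataHandler resolve() {
        if (registered.size() > 1) {
            throw new IllegalStateException(
                "expected at most one registered data handler but found " + registered.size());
        }
        return registered.isEmpty() ? fallback.get() : registered.get(0).get();
    }

    /** doc_values defaults to true only when exactly one handler is registered. */
    boolean docValuesDefault() {
        return registered.size() == 1;
    }

    public static void main(String[] args) {
        DataHandlerRegistry registry = new DataHandlerRegistry(() -> () -> "default");
        System.out.println(registry.resolve().name() + " " + registry.docValuesDefault());
        registry.register(() -> () -> "spatial");
        System.out.println(registry.resolve().name() + " " + registry.docValuesDefault());
    }
}
```

Keeping the fallback outside the registered list means a plugin never has to replace an existing entry, which is the overriding that the HACK comment warns about.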

Contributor Author

👍 I like that approach better since it doesn't involve squashing the data handler.

talevy and others added 3 commits March 11, 2020 00:52
This PR adds support for the `doc_values` field mapping parameter.

`false` is currently the only supported value to explicitly
set, and is also the default.
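
As a rough illustration of the behavior this commit message describes, a mapping-time check might look like the sketch below. `ServerDocValuesParser` and `parseDocValues` are hypothetical names, the mapping is modeled as a plain `Map`, and `IllegalArgumentException` is used as a stand-in for whatever exception the real mapper raises.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: ServerDocValuesParser is a hypothetical name.
final class ServerDocValuesParser {

    /**
     * Reads doc_values from a field mapping. Only false may be set explicitly,
     * and false is also the value used when the parameter is absent.
     */
    static boolean parseDocValues(String fieldName, Map<String, Object> mappingNode) {
        Object raw = mappingNode.get("doc_values");
        if (raw == null) {
            return false; // parameter absent: fall back to the default
        }
        if (!(raw instanceof Boolean)) {
            throw new IllegalArgumentException(
                "field [" + fieldName + "]: doc_values must be a boolean but was [" + raw + "]");
        }
        if ((Boolean) raw) {
            // Stand-in for the mapping-time rejection of doc_values: true.
            throw new IllegalArgumentException(
                "field [" + fieldName + "] does not support doc_values");
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Object> mapping = new HashMap<>();
        mapping.put("type", "geo_shape");
        mapping.put("doc_values", false);
        System.out.println(parseDocValues("location", mapping));
    }
}
```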
@bpintea bpintea added v7.8.0 and removed v7.7.0 labels Mar 25, 2020
@rjernst rjernst added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 4, 2020
@nknize nknize closed this May 12, 2020
Labels
:Analytics/Geo Indexing, search aggregations of geo points and shapes >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v7.9.0 v8.0.0-alpha1
8 participants