Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle runtime subfields when shadowing dynamic mappings #75595

Merged
merged 5 commits into from
Jul 22, 2021

Conversation

romseygeek
Copy link
Contributor

In #75454 we changed our dynamic shadowing logic to check that an unmapped
field was truly shadowed by a runtime field before returning no-op mappers. However,
this does not handle the case where the runtime field can have multiple subfields, as
will be true for the upcoming composite field type. We instead need to check that
the field in question would not be shadowed by any field type returned by any
runtime field.

This commit abstracts this logic into a new isShadowed() method on
DocumentParserContext, which uses a set of runtime field type names built from
the mapping lookup at construction time. It also simplifies the no-op mapper
slightly by making it a singleton object, as we don't need to preserve field names
here.

@romseygeek romseygeek added >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types v8.0.0 v7.15.0 labels Jul 21, 2021
@romseygeek romseygeek requested a review from cbuescher July 21, 2021 15:18
@romseygeek romseygeek self-assigned this Jul 21, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jul 21, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@romseygeek
Copy link
Contributor Author

@elasticmachine update branch

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I left two small comments/questions but please address at your own discretion

@@ -308,7 +308,7 @@ public void testDualingQueries() throws IOException {
String source = "{\"foo\": " + values + "}";
XContentParser parser = createParser(JsonXContent.jsonXContent, source);
SourceToParse sourceToParse = new SourceToParse("test", "test", new BytesArray(source), XContentType.JSON);
DocumentParserContext ctx = new TestDocumentParserContext(null, null, null, null, sourceToParse) {
DocumentParserContext ctx = new TestDocumentParserContext(MappingLookup.EMPTY, null, null, null, sourceToParse) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another 0-argument ctor in TestDocumentParserContext that seems unused, maybe we should delete it or otherwise also use MappingLookup.EMPTY as first argument there to avoid suprises later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - that's actually causing some percolator failures, will switch it to always use MappingLookup.EMPTY

if (fieldType != null) {
RuntimeField runtimeField = context.root().getRuntimeField(fieldPath);
if (runtimeField != null) {
assert fieldType.hasDocValues() == false && fieldType.isAggregatable() && fieldType.isSearchable();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if its useful to keep this assertion, no strong opinion, just curious if you deleted it for some specific reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was an extra check that this was definitely a runtime field. I don't think it's necessary any longer because we now know where the shadowing logic is pulling fields from.

@romseygeek romseygeek added the auto-backport Automatically create backport pull requests when merged label Jul 22, 2021
@romseygeek romseygeek merged commit ffcaffc into elastic:master Jul 22, 2021
@romseygeek romseygeek deleted the mapper/parser-shadowing branch July 22, 2021 12:15
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
7.x Commit could not be cherrypicked due to conflicts

To backport manually run:
backport --pr 75595

romseygeek added a commit that referenced this pull request Jul 22, 2021
In #75454 we changed our dynamic shadowing logic to check that an unmapped
field was truly shadowed by a runtime field before returning no-op mappers. However,
this does not handle the case where the runtime field can have multiple subfields, as
will be true for the upcoming composite field type. We instead need to check that
the field in question would not be shadowed by any field type returned by any
runtime field.

This commit abstracts this logic into a new isShadowed() method on
DocumentParserContext, which uses a set of runtime field type names built from
the mapping lookup at construction time. It also simplifies the no-op mapper
slightly by making it a singleton object, as we don't need to preserve field names
here.
ywangd pushed a commit to ywangd/elasticsearch that referenced this pull request Jul 30, 2021
In elastic#75454 we changed our dynamic shadowing logic to check that an unmapped
field was truly shadowed by a runtime field before returning no-op mappers. However,
this does not handle the case where the runtime field can have multiple subfields, as
will be true for the upcoming composite field type. We instead need to check that
the field in question would not be shadowed by any field type returned by any
runtime field.

This commit abstracts this logic into a new isShadowed() method on
DocumentParserContext, which uses a set of runtime field type names built from
the mapping lookup at construction time. It also simplifies the no-op mapper
slightly by making it a singleton object, as we don't need to preserve field names
here.
romseygeek added a commit that referenced this pull request Aug 3, 2021
#75595 added better checks for fields shadowed by runtime fields, so that we
don't index data that would never be searched. However, the shadow lookup
was being rebuilt for every document, which has caused a noticeable regression
in indexing times. This commit reworks things so that this lookup is built once
per mapping update and lives on MappingLookup.
javanna added a commit that referenced this pull request Mar 29, 2022
With #75454 we have fixed a buggy behaviour that was causing us to treat multi-fields as if they were runtime fields when retrieving mappers for the fields encountered while parsing incoming documents. As a follow-up #75595 introduced MappingLookup#isShadowed as a primitive to check if a field is a runtime field which covers also for subfields of the runtime composite field which the previous check did not.

With recent changes around expanding dots in field names, the situation where we previously ended up confusing a multi-field for a runtime field no longer manifests.This allows us to potentially remove isShadowed and go back to the original logic + assertion: if we could not resolve the field to a mapper by name, and we can resolve it to a field type by name, such field must be a runtime field. This is maybe not as elegant as isShadowed but I found its naming a bit confusing and I was hoping we can avoid maintaining a specialized method around this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v7.15.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants