Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply workaround for synthetic source of object arrays inside nested objects #115275

Merged
merged 5 commits into from
Oct 23, 2024

Conversation

lkts
Copy link
Contributor

@lkts lkts commented Oct 21, 2024

This PR applies workaround for #115261.

@lkts lkts added >non-issue auto-backport Automatically create backport pull requests when merged :StorageEngine/Logs You know, for Logs v8.17.0 labels Oct 21, 2024
@lkts lkts requested review from martijnvg and kkrik-es October 21, 2024 18:48
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

// Synthetic source is generated by a unique synthetic field loader for
// every nested document meaning that there are no conflicts between different objects in an array
// that are possible for plain objects and that need to be handles by a second parsing pass.
// It is however possible that inside this nested object we'll eventually be in the array scope.
cloned.inArrayScope = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we completely disabled second parsing pass logic for nested objects. However since NameValue contains current lucene doc it actually works as expected and ignored values are added to correct document.

@lkts
Copy link
Contributor Author

lkts commented Oct 21, 2024

Okay, there is a case when this does not work.

@kkrik-es
Copy link
Contributor

I was wondering if this commit is related: e7108ae

@lkts
Copy link
Contributor Author

lkts commented Oct 21, 2024

So the problem is (kind of) here

Even though we have a correct NameValue entry for every nested document we will overwrite previous entries and dump everything into the last document because we don't know how to route parsed data to a correct doc.

@kkrik-es
Copy link
Contributor

Okay, there is a case when this does not work.

I mean, define "not work".. It's not perfect, but ordering and duplicates are preserved?

@lkts
Copy link
Contributor Author

lkts commented Oct 21, 2024

If we put "void" in all nested documents except last one it will kind of work :D

@lkts lkts force-pushed the nested_objects_ignored_source_fix branch from 7343e2a to 7854c52 Compare October 21, 2024 21:03
@lkts lkts changed the title Relax handling of object arrays inside nested objects for synthetic source Apply workaround for synthetic source of object arrays inside nested objects Oct 21, 2024
@lkts
Copy link
Contributor Author

lkts commented Oct 21, 2024

Since my original fix did not work, i have changed this PR to apply a workaround of always storing object arrays inside nested objects.

objectWithFallbackSyntheticSource = mode == Mapper.SourceKeepMode.ALL
|| (mode == Mapper.SourceKeepMode.ARRAYS && objectMapper instanceof NestedObjectMapper == false)
// Inside nested objects we always store object arrays as a workaround for #115261.
|| context.inNestedScope();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more nuanced.. the check above bypasses storing array source for nested objects, as they exist for this exact reason.

Can we detect that there are deeply nested arrays and store source only for them? Maybe
|| (context.inNestedScope() && objectMapper instanceof NestedObjectMapper == false))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should work!

@lkts lkts requested a review from a team as a code owner October 22, 2024 21:26
@lkts
Copy link
Contributor Author

lkts commented Oct 22, 2024

@elasticmachine update branch

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lkts lkts merged commit f04bf5c into elastic:main Oct 23, 2024
16 checks passed
@lkts lkts deleted the nested_objects_ignored_source_fix branch October 23, 2024 20:22
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 115275

lkts added a commit to lkts/elasticsearch that referenced this pull request Oct 23, 2024
… objects (elastic#115275)

(cherry picked from commit f04bf5c)

# Conflicts:
#	rest-api-spec/build.gradle
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/indices.create/21_synthetic_source_stored.yml
@lkts
Copy link
Contributor Author

lkts commented Oct 23, 2024

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

lkts added a commit that referenced this pull request Oct 24, 2024
…ested objects (#115275) (#115467)

* Apply workaround for synthetic source of object arrays inside nested objects (#115275)

(cherry picked from commit f04bf5c)

# Conflicts:
#	rest-api-spec/build.gradle
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/indices.create/21_synthetic_source_stored.yml

* Fix merge
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Oct 24, 2024
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 25, 2024
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged backport pending >non-issue :StorageEngine/Logs You know, for Logs Team:StorageEngine v8.17.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants