Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify SourceLookup sharing across fetch subphases. #60179

Merged
merged 15 commits into from
Jul 30, 2020

Conversation

jtibshirani
Copy link
Contributor

@jtibshirani jtibshirani commented Jul 27, 2020

The SourceLookup class provides access to the _source for a particular
document, specified through SourceLookup#setSegmentAndDocument. Previously
the search context contained a single SourceLookup that was shared between
different fetch subphases. It was hard to reason about its state: is
SourceLookup set to the expected document? Is the _source already loaded and
available?

Instead of using a global source lookup, the fetch hit context now provides
access to a lookup that is set to load from the hit document.

This refactor closes #31000, since the same SourceLookup is no longer shared
between the 'fetch _source phase' and script execution.

@jtibshirani jtibshirani force-pushed the source-lookup branch 3 times, most recently from 58a7d72 to aca5233 Compare July 27, 2020 02:04
@jtibshirani jtibshirani added :Search/Search Search-related issues that do not fall into other categories >refactoring v7.10.0 v8.0.0 labels Jul 27, 2020
@jtibshirani jtibshirani marked this pull request as ready for review July 27, 2020 16:20
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jul 27, 2020
@@ -65,10 +65,6 @@ public void hitsExecute(SearchContext context, SearchHit[] hits) throws IOExcept
}
innerHits.docIdsToLoad(docIdsToLoad, 0, docIdsToLoad.length);
innerHits.setId(hit.getId());
innerHits.lookup().source().setSource(context.lookup().source().internalSourceRef());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We set the root document's _source here, but when processing inner hits we didn't use it and instead reloaded the _source. So this is safe to remove for now.

In a follow-up PR, I will fix this to make sure the _source is loaded only once for the root doc + inner hits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, that's something that we spotted sometime ago: #32818 (comment) and maybe #56210 but never implemented.

@mayya-sharipova
Copy link
Contributor

@jtibshirani Thanks for the PR, I like the idea. I had an initial look, and will look more thoroughly tomorrow.

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Jul 28, 2020

I caught up with @nik9000 offline about the refactor and wanted to add more context as to its benefits:

  • The _source loading semantics are clearer. We attempt to load the _source once at the start of the fetch phase, and explicitly add it to the HitContext to make it available for the subphases. Before we manipulated global state (setting the _source on SearchContext), and it was not obvious what the intention was.
  • Any subphase that overrides hitExecute (singular) will have access to the hit context, which contains a reference to the _source for the current document. The subphases that override hitsExecute (plural) do not have access to a hit context and need to reload _source. Previously this difference in behavior was unclear.

However I now see a downside to this refactor. There used to be only one way to share a SourceLookup instance (as part of SearchLookup on QueryShardContext), but now we've added another specific to the fetch phase. This means that a subphase that overwrites hitExecute and also uses a script will not share the loaded _source.

I'm curious as to what others think. I still like the refactor and feel that it moves us in a better direction.

@jimczi
Copy link
Contributor

jimczi commented Jul 28, 2020

However I now see a downside to this refactor. There used to be only one way to share a SourceLookup instance (as part of SearchLookup on QueryShardContext), but now we've added another specific to the fetch phase. This means that a subphase that overwrites hitExecute and also uses a script will not share the loaded _source.

We don't have such example at the moment, right ? That's also something we can tackle in a follow up, currently we use the lookup of the query shard context to compile the scripted field but we could refactor this part to move to the new hit context ? Hits are now sorted by doc ids in fetch sub phases so it's also a good chance to get ride of the hitsExecute completely. That would allow to load the _source only once since today sub phases that use hitsExecute have no way to reuse _source ?

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change. I left one comment regarding a possible hack to ensure that scripts and fetch sub phase can still share the _source but I don't feel strongly about it. We can solve the double load more nicely in a follow up if needed.

SourceLookup sourceLookup = context.lookup().source();
sourceLookup.setSegmentAndDocument(subReaderContext, subDocId);
if (fieldsVisitor.source() != null) {
sourceLookup.setSource(fieldsVisitor.source());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can still assign the context.lookup().source() with the current _source so that _script can share the source with other fetch sub-phase. That would only work for sub-phases that use a script and override hitExecute so the new FetchFieldsPhase would benefit from this hack ? We can then think how we could clenup the context for script in a follow up ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think FetchFieldsPhase would benefit currently, because it doesn't support scripts? I like the plan of considering it in a follow-up, for example as part of converting ScriptFieldsPhase to use hitExecute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I was thinking preemptively of runtime scripted field but +1 to consider this in a follow up.

@@ -65,10 +65,6 @@ public void hitsExecute(SearchContext context, SearchHit[] hits) throws IOExcept
}
innerHits.docIdsToLoad(docIdsToLoad, 0, docIdsToLoad.length);
innerHits.setId(hit.getId());
innerHits.lookup().source().setSource(context.lookup().source().internalSourceRef());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, that's something that we spotted sometime ago: #32818 (comment) and maybe #56210 but never implemented.

@jtibshirani
Copy link
Contributor Author

In addition to restoring the shared SourceLookup on SearchLookup, I ended up removing all changes to the SearchLookup class, as they weren't core to this PR. So now the PR is scoped just to 'fetch' and avoids @nik9000 and @mayya-sharipova's concern around scripts sharing _source in the query phase.

Here is the final summary of changes:

  • Store SourceLookup on HitContext and set the source when creating hits.
  • Fetch subphases access source through HitContext when possible.
  • Stop saving parent _source before loading inner hits (it was unused).
  • Remove SearchContext#lookup method.

Copy link
Contributor

@mayya-sharipova mayya-sharipova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtibshirani Thanks Julie, looks a very good change to me!

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM2

@jtibshirani
Copy link
Contributor Author

Thanks everyone, the review feedback was really helpful.

@jtibshirani jtibshirani merged commit 9d28328 into elastic:master Jul 30, 2020
@jtibshirani jtibshirani deleted the source-lookup branch July 30, 2020 18:57
jtibshirani added a commit that referenced this pull request Dec 9, 2020
This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.

Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.

The regression was introduced in #60179, which started storing `SourceLookup`
on `FragmentsBuilder`.

Fixes #65533.
jtibshirani added a commit to jtibshirani/elasticsearch that referenced this pull request Dec 9, 2020
…5641)

This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.

Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.

The regression was introduced in elastic#60179, which started storing `SourceLookup`
on `FragmentsBuilder`.

Fixes elastic#65533.
jtibshirani added a commit that referenced this pull request Dec 10, 2020
This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.

Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.

The regression was introduced in #60179, which started storing `SourceLookup`
on `FragmentsBuilder`.

Fixes #65533.
jtibshirani added a commit to jtibshirani/elasticsearch that referenced this pull request Dec 10, 2020
…5641)

This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.

Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.

The regression was introduced in elastic#60179, which started storing `SourceLookup`
on `FragmentsBuilder`.

Fixes elastic#65533.
jtibshirani added a commit that referenced this pull request Dec 10, 2020
This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.

Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.

The regression was introduced in #60179, which started storing `SourceLookup`
on `FragmentsBuilder`.

Fixes #65533.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug >refactoring :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

_source not returned in "explain": true with certain queries
6 participants