Improve FieldFetcher retrieval of fields #66160

cbuescher · 2020-12-10T11:47:40Z

Currently, FieldFetcher stores all the FieldContexts that are later used to
retrieve the fields in a List. This has the disadvantage that the same field
path can be retrieved several times (e.g. if multiple patterns match the same
path, or if the same path is requested several times with different formats).
The last value to be retrieved "wins" and gets returned. We might as
well de-duplicate the FieldContexts by using a map internally, keyed by the
field path that is going to be retrieved, to avoid redundant work later.
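As a rough illustration of the idea described above, the following sketch collects contexts into a map keyed by field path, so a path that is resolved more than once keeps only its last context. The `FieldContext` class here is a simplified, hypothetical stand-in with just a path and a format; it is not the actual class in FieldFetcher, and `collectByPath` is a name invented for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FieldContextDedupSketch {

    // Hypothetical stand-in for FieldFetcher's FieldContext (assumption, not the real class).
    static final class FieldContext {
        final String path;
        final String format;

        FieldContext(String path, String format) {
            this.path = path;
            this.format = format;
        }
    }

    // Keys each context by its field path; a later context for the same path
    // replaces the earlier one, which mirrors the "last one wins" behaviour.
    static Map<String, FieldContext> collectByPath(List<FieldContext> resolved) {
        Map<String, FieldContext> byPath = new HashMap<>();
        for (FieldContext context : resolved) {
            byPath.put(context.path, context);
        }
        return byPath;
    }

    public static void main(String[] args) {
        List<FieldContext> resolved = Arrays.asList(
            new FieldContext("date", null),
            new FieldContext("date", "epoch_millis"),
            new FieldContext("name", null));

        Map<String, FieldContext> byPath = collectByPath(resolved);
        System.out.println(byPath.size());              // prints 2
        System.out.println(byPath.get("date").format);  // prints epoch_millis
    }
}
```

With a List, all three entries would be fetched and the first "date" value discarded afterwards; with the map, the duplicate never makes it into the fetch loop.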


elasticmachine · 2020-12-10T11:47:43Z

Pinging @elastic/es-search (Team:Search)

jtibshirani

Thanks, this is a nice simplification.

jtibshirani · 2020-12-10T19:52:22Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/FieldFetcher.java

@@ -48,7 +48,7 @@ public static FieldFetcher create(QueryShardContext context,
                                      SearchLookup searchLookup,
                                      Collection<FieldAndFormat> fieldAndFormats) {

-        List<FieldContext> fieldContexts = new ArrayList<>();
+        Map<String, FieldContext> fieldContexts = new HashMap<>();


I think it'd make sense to use a LinkedHashMap here; then the fields would come back in the order requested. For example, if you passed "fields": [ "name", "title"], then it'd be nice to return

"fields": { "name": [ "christoph"], "title": [ "engineer" ] }

We wouldn't make a formal guarantee about order (since it's a JSON map), but I think it helps readability for debugging, etc.
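The ordering point can be checked with a small sketch. It assumes only the documented behaviour of `java.util.LinkedHashMap` in its default insertion-order mode; the class and method names are made up for illustration and do not come from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RequestOrderSketch {

    // Returns the distinct requested paths in the order they were first requested.
    // LinkedHashMap iterates in insertion order, and re-inserting an existing key
    // does not move it, so duplicates do not disturb the order.
    static List<String> distinctInRequestOrder(List<String> requestedPaths) {
        Map<String, Boolean> seen = new LinkedHashMap<>();
        for (String path : requestedPaths) {
            seen.put(path, Boolean.TRUE);
        }
        return new ArrayList<>(seen.keySet());
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("name", "title", "name");
        System.out.println(distinctInRequestOrder(requested)); // prints [name, title]
    }
}
```

A plain HashMap would give the same set of keys but makes no promise about iteration order, which is why the response could come back as "title" before "name".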

jtibshirani · 2020-12-10T19:52:38Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/FieldFetcher.java

@@ -48,7 +48,7 @@ public static FieldFetcher create(QueryShardContext context,
                                      SearchLookup searchLookup,
                                      Collection<FieldAndFormat> fieldAndFormats) {

-        List<FieldContext> fieldContexts = new ArrayList<>();
+        Map<String, FieldContext> fieldContexts = new HashMap<>();
        List<String> unmappedFetchPattern = new ArrayList<>();
        Set<String> mappedToExclude = new HashSet<>();


Could we remove this set, and instead rely on fieldContexts.containsKey(...)?

Sure, great unintended side effect
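A minimal sketch of that suggestion, under the assumption that the set only recorded which paths already have a mapped FieldContext: once the contexts live in a map keyed by path, the same membership question can be answered with `containsKey`. The method name, the `String` value type, and the sample paths are hypothetical and chosen only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ContainsKeySketch {

    // Filters candidate paths down to those that do not already have a mapped
    // context. The map's key set doubles as the "mapped paths to exclude" set,
    // so no separate Set<String> has to be maintained alongside it.
    static List<String> unmappedCandidates(Map<String, String> fieldContexts,
                                           List<String> candidatePaths) {
        List<String> unmapped = new ArrayList<>();
        for (String path : candidatePaths) {
            if (fieldContexts.containsKey(path) == false) {
                unmapped.add(path);
            }
        }
        return unmapped;
    }

    public static void main(String[] args) {
        Map<String, String> fieldContexts = new LinkedHashMap<>();
        fieldContexts.put("name", "keyword"); // value type is a placeholder for a FieldContext

        List<String> candidates = Arrays.asList("name", "nickname");
        System.out.println(unmappedCandidates(fieldContexts, candidates)); // prints [nickname]
    }
}
```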

cbuescher · 2020-12-14T11:26:37Z

@jtibshirani thanks for the review, I pushed the changes you requested.

cbuescher · 2020-12-14T11:55:23Z

@elasticmachine update branch

cbuescher · 2020-12-14T13:11:01Z

@elasticmachine run elasticsearch-ci/1
@elasticmachine run elasticsearch-ci/eql-correctness


* elastic/master: (33 commits)
  Add searchable snapshot cache folder to NodeEnvironment (elastic#66297)
  [DOCS] Add dynamic runtime fields to docs (elastic#66194)
  Add HDFS searchable snapshot integration (elastic#66185)
  Support canceling cross-clusters search requests (elastic#66206)
  Mute testCacheSurviveRestart (elastic#66289)
  Fix cat tasks api params in spec and handler (elastic#66272)
  Snapshot of a searchable snapshot should be empty (elastic#66162)
  [ML] DFA _explain API should not fail when none field is included (elastic#66281)
  Add action to decommission legacy monitoring cluster alerts (elastic#64373)
  move rollup_index param out of RollupActionConfig (elastic#66139)
  Improve FieldFetcher retrieval of fields (elastic#66160)
  Remove unsed fields in `RestAnalyzeAction` (elastic#66215)
  Simplify searchable snapshot CacheKey (elastic#66263)
  Autoscaling remove feature flags (elastic#65973)
  Improve searchable snapshot mount time (elastic#66198)
  [ML] Report cause when datafeed extraction encounters error (elastic#66167)
  Remove suggest reference in some API specs (elastic#66180)
  Fix warning when installing a plugin for different ESversion (elastic#66146)
  [ML] make `xpack.ml.max_ml_node_size` and `xpack.ml.use_auto_machine_memory_percent` dynamically settable (elastic#66132)
  [DOCS] Add `require_alias` to Bulk API (elastic#66259)
  ...

cbuescher added the >enhancement, :Search Foundations/Mapping, v8.0.0, v7.11.0, and v7.12.0 labels Dec 10, 2020

cbuescher requested a review from jtibshirani December 10, 2020 11:47

elasticmachine added the Team:Search label Dec 10, 2020

jtibshirani reviewed Dec 10, 2020

iter

fb1bd84

Merge branch 'master' into ff-list-to-set

afea33e

jtibshirani approved these changes Dec 14, 2020

cbuescher merged commit 852f6a4 into elastic:master Dec 14, 2020

cbuescher added the backport pending label Dec 14, 2020

cbuescher mentioned this pull request Dec 14, 2020

Improve FieldFetcher retrieval of fields (#66160) #66284

Merged

cbuescher removed the backport pending label Dec 14, 2020

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
