
Group field-caps node requests by index mapping hash #84598

Closed
wants to merge 20 commits into from

Conversation

dnhatn
Member

@dnhatn dnhatn commented Mar 3, 2022

This optimization is for field-caps requests that target many indices with an index pattern. Instead of reaching out to many data nodes to retrieve field caps, we group indices by their mapping hashes and send a single node request per group, carrying only representative indices. This optimization is significant in large clusters.
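The grouping idea can be illustrated with a small sketch. This is a hedged illustration only, not the actual Java implementation in `RequestDispatcher`; `mapping_hash_of` and `node_of` are hypothetical lookups standing in for cluster-state metadata:

```python
from collections import defaultdict

def group_requests(indices, mapping_hash_of, node_of):
    """Group indices by mapping hash; one representative index per group."""
    groups = defaultdict(list)
    for index in indices:
        groups[mapping_hash_of(index)].append(index)

    # One node-level request per distinct mapping hash, carrying only a
    # representative index; its response is then applied to the whole group.
    requests = []
    for mapping_hash, members in groups.items():
        representative = members[0]
        requests.append((node_of(representative), representative, members))
    return requests

# Example: three indices, two sharing the same mapping hash.
hashes = {"logs-000001": "h1", "logs-000002": "h1", "metrics-000001": "h2"}
nodes = {"logs-000001": "node_a", "logs-000002": "node_b", "metrics-000001": "node_a"}
reqs = group_requests(list(hashes), hashes.get, nodes.get)
# Two node requests instead of three: one per distinct mapping hash.
```

With identical mappings across a large index pattern, the request count collapses to the number of distinct mapping hashes rather than the number of indices or nodes.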

@dnhatn dnhatn added the v8.2.0 label Mar 3, 2022
@dnhatn dnhatn force-pushed the field_caps_groups branch from cb0b7ec to 62e964f Compare March 17, 2022 02:07
@dnhatn dnhatn force-pushed the field_caps_groups branch from 712f21e to d0295d6 Compare March 17, 2022 02:59
@dnhatn dnhatn added >enhancement :Search/Search Search-related issues that do not fall into other categories labels Mar 18, 2022
@dnhatn dnhatn marked this pull request as ready for review March 18, 2022 14:14
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Mar 18, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn dnhatn requested review from javanna and jtibshirani March 18, 2022 14:37
@javanna
Member

javanna commented Mar 21, 2022

Do I understand correctly that before this change we would ask every relevant data node for field_caps for all indices, and eventually get back one set of fields per distinct mapping hash, whereas with this change we send a single request per mapping hash and then apply the same response to all indices that share that hash? This minimizes the number of roundtrips when many indices have the same mappings. Previously, many indices with identical mappings would trigger one request per data node, each returning the full set of fields once. That was already much better than repeating the set of fields for every index, which produced much bigger transport responses. With this change, though, we ask only one node and get the full set of fields back once from it, so the number of requests is no longer a function of the number of data nodes (again, provided that mappings are the same for all indices involved).
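To make the request-count argument concrete, here is a small illustrative sketch (hypothetical helper names, not the actual dispatch code):

```python
def node_requests_before(indices, node_of):
    # Before this change: one field-caps node request per data node
    # that hosts any of the targeted indices.
    return len({node_of(i) for i in indices})

def node_requests_after(indices, mapping_hash_of):
    # After this change: one node request per distinct mapping hash,
    # independent of how many data nodes the indices live on.
    return len({mapping_hash_of(i) for i in indices})

# 1000 indices with identical mappings, spread round-robin over 50 nodes.
indices = [f"logs-{n:06d}" for n in range(1000)]
before = node_requests_before(indices, lambda i: f"node_{int(i.split('-')[1]) % 50}")
after = node_requests_after(indices, lambda i: "same-mapping-hash")
# before: 50 requests; after: 1 request.
```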

This change is a great improvement, but it would be even better to be able to measure the improvement through benchmarks (relates to #84504).

@original-brownbear would you mind having a look too given your involvement with the many shards scalability project?

@original-brownbear original-brownbear requested review from original-brownbear and removed request for javanna March 23, 2022 09:24
Contributor

@jtibshirani jtibshirani left a comment


This looks like a great optimization! I left a few small comments but the logic + tests look good to me.

I wonder whether Kibana uses index_filter or not; that will really affect when this optimization helps! Unfortunately, I think it would require a big change to how we execute field caps in order to get it to work with index_filter...

@original-brownbear
Member

Sorry for the delay here, Nhat! I'm looking at this today and will try to at least manually benchmark it a little.

@original-brownbear
Member

@dnhatn I benchmarked this against the many-shards benchmark setup. Interestingly, it does not provide any (measurable) throughput improvement. My best guess is that we are bottlenecked in some form on the REST side of the network layer (this isn't something that can be fixed here, so I wouldn't worry about it).
It does, however, measurably reduce GC pressure (about half as much CPU goes into GC after these changes). While I was not able to measure the effect on the number of bytes and messages the transport layer has to handle (not because the reductions aren't there, just because of the complexity of setting up that experiment), reducing them is always a win in terms of stability.

The before and after of innerMerge (which accounts for effectively all the CPU load on the coordinating node during field_caps) also look nicer now: the JIT is able to compile the capturing lambda key -> new FieldCapabilities.Builder(field, key) better, which saves us some CPU on the coordinator.
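For context, the coordinator-side merge follows a pattern like this hedged sketch (hypothetical names and simplified capability keys; the real code is Java, where the capturing lambda `key -> new FieldCapabilities.Builder(field, key)` plays the role that `setdefault` plays here):

```python
def inner_merge(responses, merged):
    """Fold per-index field-caps responses into one builder map.

    merged maps (field, capability_key) -> builder; setdefault stands in
    for computeIfAbsent with a capturing lambda in the Java code.
    """
    for index_name, fields in responses:
        for field, caps in fields.items():
            key = (caps["type"], caps["searchable"], caps["aggregatable"])
            builder = merged.setdefault((field, key), {"indices": [], **caps})
            builder["indices"].append(index_name)
    return merged

responses = [
    ("logs-000001", {"@timestamp": {"type": "date", "searchable": True, "aggregatable": True}}),
    ("logs-000002", {"@timestamp": {"type": "date", "searchable": True, "aggregatable": True}}),
]
merged = inner_merge(responses, {})
# One merged entry covering both indices, since their capabilities agree.
```

Fewer per-index responses to fold (one per mapping-hash group instead of one per index) means fewer iterations of this hot loop on the coordinator.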

before: [profiler screenshot]

after: [profiler screenshot]

-> LGTM from my end though I agree with Julie's points on documentation :)

Member

@original-brownbear original-brownbear left a comment


see above

@dnhatn
Member Author

dnhatn commented May 2, 2022

@original-brownbear Thank you for running the benchmark. Did you run against all indices or only auditbeat* indices? I see a small improvement (7%) with auditbeat* indices only.

| Metric                   | Task       | Baseline | Contender | Diff     | Unit | Diff %  |
|--------------------------|------------|----------|-----------|----------|------|---------|
| 50th percentile latency  | field-caps | 4576.27  | 4254.51   | -321.753 | ms   | -7.03%  |
| 90th percentile latency  | field-caps | 5482.8   | 4805.08   | -677.729 | ms   | -12.36% |
| 99th percentile latency  | field-caps | 6001.41  | 5248.21   | -753.198 | ms   | -12.55% |
| 100th percentile latency | field-caps | 6241.83  | 5353.3    | -888.527 | ms   | -14.24% |

@dnhatn
Member Author

dnhatn commented May 2, 2022

@jtibshirani @original-brownbear Thank you for your reviews. I think I have addressed your comments.

I wonder whether Kibana uses index_filter or not; that will really affect when this optimization helps! Unfortunately, I think it would require a big change to how we execute field caps in order to get it to work with index_filter...

One option is to add a can_match phase for requests with index_filter to eliminate unmatched copies, as we discussed. After that, we should be able to have a single execution mode in RequestDispatcher. I can do this in a follow-up.

@dnhatn dnhatn requested a review from jtibshirani May 2, 2022 19:22
@dnhatn
Member Author

dnhatn commented Jul 6, 2022

The optimization introduced in this PR doesn't reduce memory usage or latency of field-caps requests. I will close this PR and try to get #86323 in instead. Thanks everyone for reviewing.

@dnhatn dnhatn closed this Jul 6, 2022
@dnhatn dnhatn removed the v8.4.0 label Jul 6, 2022
@dnhatn dnhatn deleted the field_caps_groups branch July 6, 2022 18:23
@dnhatn dnhatn removed the request for review from jtibshirani July 6, 2022 18:23