Avoid doing I/O when fetching min and max for keyword fields #92026

javanna · 2022-11-30T15:10:17Z

Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (#51852).

For fields with points we call getMinPackedValue and getMaxPackedValue, while for keyword fields we call Terms#getMin and Terms#getMax. Elasticsearch uses FilterTerms implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their getMin and getMax calls to the wrapped Terms instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.

elasticsearchmachine · 2022-11-30T15:10:42Z

Hi @javanna, I've created a changelog YAML for you.

elasticsearchmachine · 2022-11-30T15:10:44Z

Pinging @elastic/es-search (Team:Search)

dnhatn

LGTM. Thanks Luca!

romseygeek

LGTM!

Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (elastic#51852). For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.

… and geonames tracks We recently found a regression that affected searches sorted by keyword field (elastic/elasticsearch#92026). Given that we had no benchmarks for sorting by keyword, this commit adds the relevant operations to the http-logs and geonames tracks. Geonames is a good base but it's good to also make the new challenges part of the many-shards benchmarks as differences can be appreciated with a high amount of shards involved in a query. This commit adds also specific challenges to verify the effect of elastic/elasticsearch#51852 when a search is sorted by numeric or timestamp.

javanna · 2022-12-20T10:55:04Z

I ran the geonames benchmarks and the improvement is visible. In order to appreciate it though, you need to have more than a couple of shards on the same data node, as can_match executes sequentially hence the total latency is the sum of the latency of all shards.

Baseline with 10 shards:

|                                                 Min Throughput |              sort_country_code |     1.5         |   ops/s |
|                                                Mean Throughput |              sort_country_code |     1.51        |   ops/s |
|                                              Median Throughput |              sort_country_code |     1.51        |   ops/s |
|                                                 Max Throughput |              sort_country_code |     1.51        |   ops/s |
|                                        50th percentile latency |              sort_country_code |     5.46342     |      ms |
|                                        90th percentile latency |              sort_country_code |     5.8815      |      ms |
|                                        99th percentile latency |              sort_country_code |     6.18189     |      ms |
|                                       100th percentile latency |              sort_country_code |     6.23045     |      ms |
|                                   50th percentile service time |              sort_country_code |     4.00137     |      ms |
|                                   90th percentile service time |              sort_country_code |     4.27985     |      ms |
|                                   99th percentile service time |              sort_country_code |     4.41439     |      ms |
|                                  100th percentile service time |              sort_country_code |     4.42995     |      ms |
|                                                     error rate |              sort_country_code |     0           |       % |
|                                                 Min Throughput | sort_country_code_no_can_match |     1.5         |   ops/s |
|                                                Mean Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                              Median Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                                 Max Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                        50th percentile latency | sort_country_code_no_can_match |     3.38948     |      ms |
|                                        90th percentile latency | sort_country_code_no_can_match |     3.848       |      ms |
|                                        99th percentile latency | sort_country_code_no_can_match |     4.36807     |      ms |
|                                       100th percentile latency | sort_country_code_no_can_match |     4.42656     |      ms |
|                                   50th percentile service time | sort_country_code_no_can_match |     1.95878     |      ms |
|                                   90th percentile service time | sort_country_code_no_can_match |     2.19764     |      ms |
|                                   99th percentile service time | sort_country_code_no_can_match |     2.4506      |      ms |
|                                  100th percentile service time | sort_country_code_no_can_match |     2.4578      |      ms |
|                                                     error rate | sort_country_code_no_can_match |     0           |       % |

10 shards - with the fix:

|                                                 Min Throughput |              sort_country_code |     1.5         |   ops/s |
|                                                Mean Throughput |              sort_country_code |     1.51        |   ops/s |
|                                              Median Throughput |              sort_country_code |     1.51        |   ops/s |
|                                                 Max Throughput |              sort_country_code |     1.51        |   ops/s |
|                                        50th percentile latency |              sort_country_code |     3.48115     |      ms |
|                                        90th percentile latency |              sort_country_code |     3.9276      |      ms |
|                                        99th percentile latency |              sort_country_code |     4.9217      |      ms |
|                                       100th percentile latency |              sort_country_code |     5.48023     |      ms |
|                                   50th percentile service time |              sort_country_code |     2.04073     |      ms |
|                                   90th percentile service time |              sort_country_code |     2.30456     |      ms |
|                                   99th percentile service time |              sort_country_code |     3.02021     |      ms |
|                                  100th percentile service time |              sort_country_code |     3.58813     |      ms |
|                                                     error rate |              sort_country_code |     0           |       % |
|                                                 Min Throughput | sort_country_code_no_can_match |     1.5         |   ops/s |
|                                                Mean Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                              Median Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                                 Max Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                        50th percentile latency | sort_country_code_no_can_match |     3.21323     |      ms |
|                                        90th percentile latency | sort_country_code_no_can_match |     3.63292     |      ms |
|                                        99th percentile latency | sort_country_code_no_can_match |     3.99901     |      ms |
|                                       100th percentile latency | sort_country_code_no_can_match |     4.0942      |      ms |
|                                   50th percentile service time | sort_country_code_no_can_match |     1.72001     |      ms |
|                                   90th percentile service time | sort_country_code_no_can_match |     1.97609     |      ms |
|                                   99th percentile service time | sort_country_code_no_can_match |     2.16222     |      ms |
|                                  100th percentile service time | sort_country_code_no_can_match |     2.1899      |      ms |
|                                                     error rate | sort_country_code_no_can_match |     0           |       % |

Baseline with 100 shards:

|                                                 Min Throughput |              sort_country_code |      1.5         |   ops/s |
|                                                Mean Throughput |              sort_country_code |      1.51        |   ops/s |
|                                              Median Throughput |              sort_country_code |      1.51        |   ops/s |
|                                                 Max Throughput |              sort_country_code |      1.51        |   ops/s |
|                                        50th percentile latency |              sort_country_code |     21.4611      |      ms |
|                                        90th percentile latency |              sort_country_code |     22.1005      |      ms |
|                                        99th percentile latency |              sort_country_code |     22.5758      |      ms |
|                                       100th percentile latency |              sort_country_code |     22.6604      |      ms |
|                                   50th percentile service time |              sort_country_code |     20.0432      |      ms |
|                                   90th percentile service time |              sort_country_code |     20.4968      |      ms |
|                                   99th percentile service time |              sort_country_code |     20.9834      |      ms |
|                                  100th percentile service time |              sort_country_code |     21.0219      |      ms |
|                                                     error rate |              sort_country_code |      0           |       % |
|                                                 Min Throughput | sort_country_code_no_can_match |      1.5         |   ops/s |
|                                                Mean Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                              Median Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                                 Max Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                        50th percentile latency | sort_country_code_no_can_match |     10.1881      |      ms |
|                                        90th percentile latency | sort_country_code_no_can_match |     10.7168      |      ms |
|                                        99th percentile latency | sort_country_code_no_can_match |     17.4883      |      ms |
|                                       100th percentile latency | sort_country_code_no_can_match |     23.6664      |      ms |
|                                   50th percentile service time | sort_country_code_no_can_match |      8.68162     |      ms |
|                                   90th percentile service time | sort_country_code_no_can_match |      9.11954     |      ms |
|                                   99th percentile service time | sort_country_code_no_can_match |     15.6349      |      ms |
|                                  100th percentile service time | sort_country_code_no_can_match |     21.7983      |      ms |
|                                                     error rate | sort_country_code_no_can_match |      0           |       % |

100 shards - with the fix:

|                                                 Min Throughput |              sort_country_code |      1.5         |   ops/s |
|                                                Mean Throughput |              sort_country_code |      1.51        |   ops/s |
|                                              Median Throughput |              sort_country_code |      1.51        |   ops/s |
|                                                 Max Throughput |              sort_country_code |      1.51        |   ops/s |
|                                        50th percentile latency |              sort_country_code |     12.3007      |      ms |
|                                        90th percentile latency |              sort_country_code |     12.8042      |      ms |
|                                        99th percentile latency |              sort_country_code |     13.1673      |      ms |
|                                       100th percentile latency |              sort_country_code |     13.1859      |      ms |
|                                   50th percentile service time |              sort_country_code |     10.6005      |      ms |
|                                   90th percentile service time |              sort_country_code |     11.0101      |      ms |
|                                   99th percentile service time |              sort_country_code |     11.4409      |      ms |
|                                  100th percentile service time |              sort_country_code |     11.5415      |      ms |
|                                                     error rate |              sort_country_code |      0           |       % |
|                                                 Min Throughput | sort_country_code_no_can_match |      1.5         |   ops/s |
|                                                Mean Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                              Median Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                                 Max Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                        50th percentile latency | sort_country_code_no_can_match |     10.2294      |      ms |
|                                        90th percentile latency | sort_country_code_no_can_match |     10.7747      |      ms |
|                                        99th percentile latency | sort_country_code_no_can_match |     11.0563      |      ms |
|                                       100th percentile latency | sort_country_code_no_can_match |     11.1577      |      ms |
|                                   50th percentile service time | sort_country_code_no_can_match |      8.549       |      ms |
|                                   90th percentile service time | sort_country_code_no_can_match |      8.84923     |      ms |
|                                   99th percentile service time | sort_country_code_no_can_match |      9.21479     |      ms |
|                                  100th percentile service time | sort_country_code_no_can_match |      9.3753      |      ms |
|                                                     error rate | sort_country_code_no_can_match |      0           |       % |

My conclusion is that the regression is visible the more shards you have on the same data node. There is a slight impact of the regression in the query phase too, but it's not clearly visible because shards are queried in parallel.

I plan on merging this as soon as the nightly benchmarks include the additional challenges that I am adding to track performance of sorting by keyword and numeric.

…#92026) Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (elastic#51852). For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.

…#92854) Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (#51852). For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.

…#92026) Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (elastic#51852). For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.

Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (#51852). For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.

… and geonames tracks (#357) We recently found a regression that affected searches sorted by a keyword field (elastic/elasticsearch#92026). Given that we had no benchmarks for sorting by keyword, this commit adds the relevant operations to the http-logs and geonames tracks. We will want to also add similar challenges to the many-shards benchmarks, as the regressions we found can be seen with more than a couple of shards. This commit adds also specific challenges to verify the effect of elastic/elasticsearch#51852 when a search is sorted by numeric or timestamp.

javanna added >bug :Search/Search Search-related issues that do not fall into other categories v8.6.1 v8.7.0 labels Nov 30, 2022

javanna requested a review from dnhatn November 30, 2022 15:10

elasticsearchmachine added the Team:Search Meta label for search team label Nov 30, 2022

dnhatn approved these changes Nov 30, 2022

View reviewed changes

romseygeek approved these changes Nov 30, 2022

View reviewed changes

javanna and others added 2 commits December 1, 2022 21:14

Update docs/changelog/92026.yaml

a4faa46

javanna force-pushed the fix/terms_min_max_io branch from d3dad5b to a4faa46 Compare December 1, 2022 20:16

Merge branch 'main' into fix/terms_min_max_io

ad821ee

javanna mentioned this pull request Dec 19, 2022

Add sort by field with and without can match challenge to http-logs and geonames tracks elastic/rally-tracks#357

Merged

javanna added the auto-backport Automatically create backport pull requests when merged label Jan 11, 2023

javanna merged commit 766e426 into elastic:main Jan 11, 2023

javanna deleted the fix/terms_min_max_io branch January 11, 2023 22:34

javanna mentioned this pull request Jan 11, 2023

[8.6] Avoid doing I/O when fetching min and max for keyword fields (#92026) #92854

Merged

javanna added the v7.17.9 label Jan 12, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid doing I/O when fetching min and max for keyword fields #92026

Avoid doing I/O when fetching min and max for keyword fields #92026

javanna commented Nov 30, 2022

elasticsearchmachine commented Nov 30, 2022

elasticsearchmachine commented Nov 30, 2022

dnhatn left a comment

romseygeek left a comment

javanna commented Dec 20, 2022

Avoid doing I/O when fetching min and max for keyword fields #92026

Avoid doing I/O when fetching min and max for keyword fields #92026

Conversation

javanna commented Nov 30, 2022

elasticsearchmachine commented Nov 30, 2022

elasticsearchmachine commented Nov 30, 2022

dnhatn left a comment

Choose a reason for hiding this comment

romseygeek left a comment

Choose a reason for hiding this comment

javanna commented Dec 20, 2022