Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid doing I/O when fetching min and max for keyword fields #92026

Merged
merged 3 commits into from
Jan 11, 2023

Conversation

javanna
Copy link
Member

@javanna javanna commented Nov 30, 2022

Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (#51852).

For fields with points we call getMinPackedValue and getMaxPackedValue, while for keyword fields we call Terms#getMin and Terms#getMax. Elasticsearch uses FilterTerms implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their getMin and getMax calls to the wrapped Terms instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.

@javanna javanna added >bug :Search/Search Search-related issues that do not fall into other categories v8.6.1 v8.7.0 labels Nov 30, 2022
@javanna javanna requested a review from dnhatn November 30, 2022 15:10
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Nov 30, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @javanna, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks Luca!

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

javanna and others added 2 commits December 1, 2022 21:14
Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (elastic#51852).

For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.
@javanna javanna force-pushed the fix/terms_min_max_io branch from d3dad5b to a4faa46 Compare December 1, 2022 20:16
javanna added a commit to javanna/rally-tracks that referenced this pull request Dec 19, 2022
… and geonames tracks

We recently found a regression that affected searches sorted by keyword field (elastic/elasticsearch#92026).

Given that we had no benchmarks for sorting by keyword, this commit adds the relevant operations to the http-logs and geonames tracks.

Geonames is a good base but it's good to also make the new challenges part of the many-shards benchmarks as differences can be appreciated
with a high amount of shards involved in a query. This commit adds also specific challenges to verify the effect of elastic/elasticsearch#51852
when a search is sorted by numeric or timestamp.
@javanna
Copy link
Member Author

javanna commented Dec 20, 2022

I ran the geonames benchmarks and the improvement is visible. In order to appreciate it though, you need to have more than a couple of shards on the same data node, as can_match executes sequentially hence the total latency is the sum of the latency of all shards.

Baseline with 10 shards:

|                                                 Min Throughput |              sort_country_code |     1.5         |   ops/s |
|                                                Mean Throughput |              sort_country_code |     1.51        |   ops/s |
|                                              Median Throughput |              sort_country_code |     1.51        |   ops/s |
|                                                 Max Throughput |              sort_country_code |     1.51        |   ops/s |
|                                        50th percentile latency |              sort_country_code |     5.46342     |      ms |
|                                        90th percentile latency |              sort_country_code |     5.8815      |      ms |
|                                        99th percentile latency |              sort_country_code |     6.18189     |      ms |
|                                       100th percentile latency |              sort_country_code |     6.23045     |      ms |
|                                   50th percentile service time |              sort_country_code |     4.00137     |      ms |
|                                   90th percentile service time |              sort_country_code |     4.27985     |      ms |
|                                   99th percentile service time |              sort_country_code |     4.41439     |      ms |
|                                  100th percentile service time |              sort_country_code |     4.42995     |      ms |
|                                                     error rate |              sort_country_code |     0           |       % |
|                                                 Min Throughput | sort_country_code_no_can_match |     1.5         |   ops/s |
|                                                Mean Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                              Median Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                                 Max Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                        50th percentile latency | sort_country_code_no_can_match |     3.38948     |      ms |
|                                        90th percentile latency | sort_country_code_no_can_match |     3.848       |      ms |
|                                        99th percentile latency | sort_country_code_no_can_match |     4.36807     |      ms |
|                                       100th percentile latency | sort_country_code_no_can_match |     4.42656     |      ms |
|                                   50th percentile service time | sort_country_code_no_can_match |     1.95878     |      ms |
|                                   90th percentile service time | sort_country_code_no_can_match |     2.19764     |      ms |
|                                   99th percentile service time | sort_country_code_no_can_match |     2.4506      |      ms |
|                                  100th percentile service time | sort_country_code_no_can_match |     2.4578      |      ms |
|                                                     error rate | sort_country_code_no_can_match |     0           |       % |

10 shards - with the fix:

|                                                 Min Throughput |              sort_country_code |     1.5         |   ops/s |
|                                                Mean Throughput |              sort_country_code |     1.51        |   ops/s |
|                                              Median Throughput |              sort_country_code |     1.51        |   ops/s |
|                                                 Max Throughput |              sort_country_code |     1.51        |   ops/s |
|                                        50th percentile latency |              sort_country_code |     3.48115     |      ms |
|                                        90th percentile latency |              sort_country_code |     3.9276      |      ms |
|                                        99th percentile latency |              sort_country_code |     4.9217      |      ms |
|                                       100th percentile latency |              sort_country_code |     5.48023     |      ms |
|                                   50th percentile service time |              sort_country_code |     2.04073     |      ms |
|                                   90th percentile service time |              sort_country_code |     2.30456     |      ms |
|                                   99th percentile service time |              sort_country_code |     3.02021     |      ms |
|                                  100th percentile service time |              sort_country_code |     3.58813     |      ms |
|                                                     error rate |              sort_country_code |     0           |       % |
|                                                 Min Throughput | sort_country_code_no_can_match |     1.5         |   ops/s |
|                                                Mean Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                              Median Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                                 Max Throughput | sort_country_code_no_can_match |     1.51        |   ops/s |
|                                        50th percentile latency | sort_country_code_no_can_match |     3.21323     |      ms |
|                                        90th percentile latency | sort_country_code_no_can_match |     3.63292     |      ms |
|                                        99th percentile latency | sort_country_code_no_can_match |     3.99901     |      ms |
|                                       100th percentile latency | sort_country_code_no_can_match |     4.0942      |      ms |
|                                   50th percentile service time | sort_country_code_no_can_match |     1.72001     |      ms |
|                                   90th percentile service time | sort_country_code_no_can_match |     1.97609     |      ms |
|                                   99th percentile service time | sort_country_code_no_can_match |     2.16222     |      ms |
|                                  100th percentile service time | sort_country_code_no_can_match |     2.1899      |      ms |
|                                                     error rate | sort_country_code_no_can_match |     0           |       % |

Baseline with 100 shards:

|                                                 Min Throughput |              sort_country_code |      1.5         |   ops/s |
|                                                Mean Throughput |              sort_country_code |      1.51        |   ops/s |
|                                              Median Throughput |              sort_country_code |      1.51        |   ops/s |
|                                                 Max Throughput |              sort_country_code |      1.51        |   ops/s |
|                                        50th percentile latency |              sort_country_code |     21.4611      |      ms |
|                                        90th percentile latency |              sort_country_code |     22.1005      |      ms |
|                                        99th percentile latency |              sort_country_code |     22.5758      |      ms |
|                                       100th percentile latency |              sort_country_code |     22.6604      |      ms |
|                                   50th percentile service time |              sort_country_code |     20.0432      |      ms |
|                                   90th percentile service time |              sort_country_code |     20.4968      |      ms |
|                                   99th percentile service time |              sort_country_code |     20.9834      |      ms |
|                                  100th percentile service time |              sort_country_code |     21.0219      |      ms |
|                                                     error rate |              sort_country_code |      0           |       % |
|                                                 Min Throughput | sort_country_code_no_can_match |      1.5         |   ops/s |
|                                                Mean Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                              Median Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                                 Max Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                        50th percentile latency | sort_country_code_no_can_match |     10.1881      |      ms |
|                                        90th percentile latency | sort_country_code_no_can_match |     10.7168      |      ms |
|                                        99th percentile latency | sort_country_code_no_can_match |     17.4883      |      ms |
|                                       100th percentile latency | sort_country_code_no_can_match |     23.6664      |      ms |
|                                   50th percentile service time | sort_country_code_no_can_match |      8.68162     |      ms |
|                                   90th percentile service time | sort_country_code_no_can_match |      9.11954     |      ms |
|                                   99th percentile service time | sort_country_code_no_can_match |     15.6349      |      ms |
|                                  100th percentile service time | sort_country_code_no_can_match |     21.7983      |      ms |
|                                                     error rate | sort_country_code_no_can_match |      0           |       % |

100 shards - with the fix:

|                                                 Min Throughput |              sort_country_code |      1.5         |   ops/s |
|                                                Mean Throughput |              sort_country_code |      1.51        |   ops/s |
|                                              Median Throughput |              sort_country_code |      1.51        |   ops/s |
|                                                 Max Throughput |              sort_country_code |      1.51        |   ops/s |
|                                        50th percentile latency |              sort_country_code |     12.3007      |      ms |
|                                        90th percentile latency |              sort_country_code |     12.8042      |      ms |
|                                        99th percentile latency |              sort_country_code |     13.1673      |      ms |
|                                       100th percentile latency |              sort_country_code |     13.1859      |      ms |
|                                   50th percentile service time |              sort_country_code |     10.6005      |      ms |
|                                   90th percentile service time |              sort_country_code |     11.0101      |      ms |
|                                   99th percentile service time |              sort_country_code |     11.4409      |      ms |
|                                  100th percentile service time |              sort_country_code |     11.5415      |      ms |
|                                                     error rate |              sort_country_code |      0           |       % |
|                                                 Min Throughput | sort_country_code_no_can_match |      1.5         |   ops/s |
|                                                Mean Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                              Median Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                                 Max Throughput | sort_country_code_no_can_match |      1.51        |   ops/s |
|                                        50th percentile latency | sort_country_code_no_can_match |     10.2294      |      ms |
|                                        90th percentile latency | sort_country_code_no_can_match |     10.7747      |      ms |
|                                        99th percentile latency | sort_country_code_no_can_match |     11.0563      |      ms |
|                                       100th percentile latency | sort_country_code_no_can_match |     11.1577      |      ms |
|                                   50th percentile service time | sort_country_code_no_can_match |      8.549       |      ms |
|                                   90th percentile service time | sort_country_code_no_can_match |      8.84923     |      ms |
|                                   99th percentile service time | sort_country_code_no_can_match |      9.21479     |      ms |
|                                  100th percentile service time | sort_country_code_no_can_match |      9.3753      |      ms |
|                                                     error rate | sort_country_code_no_can_match |      0           |       % |

My conclusion is that the regression is visible the more shards you have on the same data node. There is a slight impact of the regression in the query phase too, but it's not clearly visible because shards are queried in parallel.

I plan on merging this as soon as the nightly benchmarks include the additional challenges that I am adding to track performance of sorting by keyword and numeric.

@javanna javanna added the auto-backport Automatically create backport pull requests when merged label Jan 11, 2023
@javanna javanna merged commit 766e426 into elastic:main Jan 11, 2023
@javanna javanna deleted the fix/terms_min_max_io branch January 11, 2023 22:34
javanna added a commit to javanna/elasticsearch that referenced this pull request Jan 11, 2023
…#92026)

Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (elastic#51852).

For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.
javanna added a commit that referenced this pull request Jan 12, 2023
…#92854)

Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (#51852).

For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.
danielmitterdorfer pushed a commit to danielmitterdorfer/elasticsearch that referenced this pull request Jan 12, 2023
…#92026)

Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (elastic#51852).

For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.
javanna added a commit to javanna/elasticsearch that referenced this pull request Jan 12, 2023
…#92026)

Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (elastic#51852).

For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.
javanna added a commit that referenced this pull request Jan 12, 2023
Whenever sorting on a date, numeric or keyword field (as primary sort), the can_match phase retrieves min and max for the field and sorts the shards (asc or desc depending on the sort order) so that they are going to be queried following that order. This allows incremental results to be exposed in that same order when using async search, as well as optimizations built on top of such behaviour (#51852).

For fields with points we call `getMinPackedValue` and `getMaxPackedValue`, while for keyword fields we call `Terms#getMin` and `Terms#getMax`. Elasticsearch uses `FilterTerms` implementations to cancel queries as well as to track field usage. Such filter implementations should delegate their `getMin` and `getMax` calls to the wrapped `Terms` instance, which will leverage info from the block tree that caches min and max, otherwise they are always going to be retrieved from the index, which does I/O and slows the can_match phase down.
javanna added a commit to elastic/rally-tracks that referenced this pull request Jan 23, 2023
… and geonames tracks (#357)

We recently found a regression that affected searches sorted by a keyword field (elastic/elasticsearch#92026).

Given that we had no benchmarks for sorting by keyword, this commit adds the relevant operations to the http-logs and geonames tracks. We will want to also add similar challenges to the many-shards benchmarks, as the regressions we found can be seen with more than a couple of shards. This commit adds also specific challenges to verify the effect of elastic/elasticsearch#51852 when a search is sorted by numeric or timestamp.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v7.17.9 v8.6.1 v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants