Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply the date histogram rewrite optimization to range aggregation #13865

Merged
merged 42 commits into from
Jun 19, 2024

Conversation

bowenlan-amzn
Copy link
Member

@bowenlan-amzn bowenlan-amzn commented May 29, 2024

Description

Background

Previously we optimized date histogram aggregation by utilizing the index structure of date field — BKD tree/Points to provide the documents count of date histogram buckets results. We first build date ranges from the date histogram query input like daily interval histogram. Then we "query" the index of date for how many documents are in these date ranges.

Idea

Date is actually saved as numeric data of long data type. This leads to the idea that we can extend the optimization to the RangeAggregator which also perform ranges aggregation on numeric data.

Implementation

The core optimization algorithm remains the same. (Details #13317)
Note the algorithm has a hidden pre-condition: all the ranges are not overlapping (because date interval cannot produce overlapping date ranges)
Another difference is that range aggregation buildRange method won't be built from segment level match all path. The ranges are provided by user directly, not like date histogram needs to check the boundaries of shard or segment or accommodate top level range query. (Details #12073)

Changes
  • Covering all numeric types
  • buildRanges method now return a generic Ranges type instead of long[]
  • The ranges built now will be same as what provided by customer, meaning it's lower inclusive, upper exclusive. The multi range traversal logic is refactored to fit that. The refactor also include some readability improvements.
  • tryFastFilterAggregation common logic is extracted, and AggregationType should implement its own special logic.
Possible Follow Ups
  • Geo distance range query
  • Overlapping ranges
  • We notice scaled float doesn't seem to belong in the module, as it's not experimental anymore. So better to just move it along with NumberType in the server.
  • Currently the rewrite helper class is becoming huge, also the workflow to do the optimization (isRewriteable, buildRanges, tryxxxRewrite...) needs refactor. Tracking task

Benchmarks

|                                                        Metric |                          Task |    Baseline |   Contender |     Diff |   Unit |
|                                       50th percentile latency | articles_monthly_agg_uncached |      7.2067 |     9.99465 |  2.78795 |     ms |
|                                       90th percentile latency | articles_monthly_agg_uncached |      7.9746 |     11.4022 |  3.42756 |     ms |
|                                       99th percentile latency | articles_monthly_agg_uncached |      12.822 |     16.0743 |  3.25229 |     ms |
|                                      100th percentile latency | articles_monthly_agg_uncached |     24.4897 |      36.373 |  11.8833 |     ms |

However, considering this is already ~10ms. We can look into the reason later as a follow up. For the other workloads, big5, nyc, http, they are all even faster.

  • Performance gain on the range aggregation.
    big5
|                                                        Metric |                          Task |    Baseline |   Contender |     Diff |   Unit |
|                                       50th percentile latency | range-agg-1 | 1.23376e+06 |     4.74459 | -1.23376e+06 |     ms |
|                                       90th percentile latency | range-agg-1 | 1.42868e+06 |     5.14212 | -1.42867e+06 |     ms |
|                                       99th percentile latency | range-agg-1 | 1.47254e+06 |     5.32877 | -1.47253e+06 |     ms |
|                                      100th percentile latency | range-agg-1 | 1.47741e+06 |     5.58408 |  -1.4774e+06 |     ms |
|                                  50th percentile service time | range-agg-1 |     5420.99 |     3.49232 |      -5417.5 |     ms |
|                                  90th percentile service time | range-agg-1 |     5433.47 |     3.63754 |     -5429.83 |     ms |
|                                  99th percentile service time | range-agg-1 |     5440.81 |     3.73967 |     -5437.07 |     ms |
|                                 100th percentile service time | range-agg-1 |     5443.22 |     4.11546 |      -5439.1 |     ms |
|                                                    error rate | range-agg-1 |           0 |           0 |            0 |      % |

|                                       50th percentile latency | range-agg-2 | 1.20783e+06 |     6.01016 | -1.20782e+06 |     ms |
|                                       90th percentile latency | range-agg-2 | 1.39873e+06 |     6.42305 | -1.39872e+06 |     ms |
|                                       99th percentile latency | range-agg-2 | 1.44164e+06 |     6.69281 | -1.44163e+06 |     ms |
|                                      100th percentile latency | range-agg-2 | 1.44642e+06 |     6.69403 | -1.44641e+06 |     ms |
|                                  50th percentile service time | range-agg-2 |     5317.48 |     4.86202 |     -5312.62 |     ms |
|                                  90th percentile service time | range-agg-2 |     5333.16 |     4.94407 |     -5328.21 |     ms |
|                                  99th percentile service time | range-agg-2 |     5367.61 |     5.09159 |     -5362.52 |     ms |
|                                 100th percentile service time | range-agg-2 |     5417.87 |     5.11181 |     -5412.76 |     ms |
|                                                    error rate | range-agg-2 |           0 |           0 |            0 |      % |

noaa

|                                       50th percentile latency | range-aggregation |     1484.37 |     10.8027 | -1473.56 |     ms |
|                                       90th percentile latency | range-aggregation |     1489.36 |     11.3409 | -1478.02 |     ms |
|                                      100th percentile latency | range-aggregation |     1493.02 |     14.9022 | -1478.12 |     ms |
|                                  50th percentile service time | range-aggregation |     1481.79 |     6.98101 | -1474.81 |     ms |
|                                  90th percentile service time | range-aggregation |     1486.58 |     7.40678 | -1479.17 |     ms |
|                                 100th percentile service time | range-aggregation |     1490.47 |      10.768 |  -1479.7 |     ms |
|                                                    error rate | range-aggregation |           0 |           0 |        0 |      % |

Related Issues

Resolves #13531

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • API changes companion pull request created.
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added Search:Aggregations v2.15.0 Issues and PRs related to version 2.15.0 labels May 29, 2024
Signed-off-by: bowenlan-amzn <[email protected]>
@bowenlan-amzn bowenlan-amzn added the backport 2.x Backport to 2.x branch label May 29, 2024
Copy link
Contributor

❌ Gradle check result for c5d2175: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for ed79e02: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: bowenlan-amzn <[email protected]>
support all numeric types
minus one on the upper range

Signed-off-by: bowenlan-amzn <[email protected]>
Copy link
Contributor

❌ Gradle check result for 8ec5f58: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 67c281c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for c10c775: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Jun 2, 2024

❌ Gradle check result for 783b14a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Jun 3, 2024

❌ Gradle check result for 783b14a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Jun 3, 2024

✅ Gradle check result for 783b14a: SUCCESS

@bowenlan-amzn bowenlan-amzn marked this pull request as ready for review June 3, 2024 18:38
Copy link
Collaborator

@jainankitk jainankitk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@bowenlan-amzn
Copy link
Member Author

@github-actions commented on Jun 18, 2024, 11:17 PM PDT:

❌ Gradle check result for 07a5293: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Originally posted by @github-actions[bot] in #13865 (comment)

#14292

Copy link
Contributor

✅ Gradle check result for 9764b23: SUCCESS

@mch2 mch2 merged commit 57fb50b into opensearch-project:main Jun 19, 2024
31 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 19, 2024
…13865)

* Refactor the ranges representation

Signed-off-by: bowenlan-amzn <[email protected]>

* Refactor try fast filter

Signed-off-by: bowenlan-amzn <[email protected]>

* Main work finished; left the handling of different numeric data types

Signed-off-by: bowenlan-amzn <[email protected]>

* buildRanges accepts field type

Signed-off-by: bowenlan-amzn <[email protected]>

* first working draft probably

Signed-off-by: bowenlan-amzn <[email protected]>

* add change log

Signed-off-by: bowenlan-amzn <[email protected]>

* accommodate geo distance agg

Signed-off-by: bowenlan-amzn <[email protected]>

* Fix test

support all numeric types
minus one on the upper range

Signed-off-by: bowenlan-amzn <[email protected]>

* [Refactor] range is lower inclusive, right exclusive

Signed-off-by: bowenlan-amzn <[email protected]>

* adding test

Signed-off-by: bowenlan-amzn <[email protected]>

* Adding test and refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* add test

Signed-off-by: bowenlan-amzn <[email protected]>

* add test and update the compare logic in tree traversal

Signed-off-by: bowenlan-amzn <[email protected]>

* fix test, add random test

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor to address comments

Signed-off-by: bowenlan-amzn <[email protected]>

* small potential performance update

Signed-off-by: bowenlan-amzn <[email protected]>

* fix precommit

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* set refresh_interval to -1

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* Fix test

To understand fully about the double and bigdecimal usage in scaled float field will take more time.

Signed-off-by: bowenlan-amzn <[email protected]>

---------

Signed-off-by: bowenlan-amzn <[email protected]>
(cherry picked from commit 57fb50b)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@bowenlan-amzn bowenlan-amzn deleted the 13531-range-agg branch June 19, 2024 22:54
mch2 pushed a commit that referenced this pull request Jun 20, 2024
…13865) (#14463)

* Refactor the ranges representation



* Refactor try fast filter



* Main work finished; left the handling of different numeric data types



* buildRanges accepts field type



* first working draft probably



* add change log



* accommodate geo distance agg



* Fix test

support all numeric types
minus one on the upper range



* [Refactor] range is lower inclusive, right exclusive



* adding test



* Adding test and refactor



* refactor



* add test



* add test and update the compare logic in tree traversal



* fix test, add random test



* refactor to address comments



* small potential performance update



* fix precommit



* refactor



* refactor



* set refresh_interval to -1



* address comment



* address comment



* address comment



* Fix test

To understand fully about the double and bigdecimal usage in scaled float field will take more time.



---------


(cherry picked from commit 57fb50b)

Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@andrross
Copy link
Member

@bowenlan-amzn Maybe a test failure here related to this change: https://build.ci.opensearch.org/job/gradle-check/41411/consoleText

./gradlew ':rest-api-spec:yamlRestTest' --tests "org.opensearch.test.rest.ClientYamlTestSuiteIT" -Dtests.method="test {p0=search.aggregation/40_range/Double range profiler shows filter rewrite info}" -Dtests.seed=2F3C46DA8CEB59E9 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=hu-HU -Dtests.timezone=America/Montreal -Druntime.java=21 

That will reproduce for me, though not every time

harshavamsi pushed a commit to harshavamsi/OpenSearch that referenced this pull request Jul 12, 2024
…pensearch-project#13865)

* Refactor the ranges representation

Signed-off-by: bowenlan-amzn <[email protected]>

* Refactor try fast filter

Signed-off-by: bowenlan-amzn <[email protected]>

* Main work finished; left the handling of different numeric data types

Signed-off-by: bowenlan-amzn <[email protected]>

* buildRanges accepts field type

Signed-off-by: bowenlan-amzn <[email protected]>

* first working draft probably

Signed-off-by: bowenlan-amzn <[email protected]>

* add change log

Signed-off-by: bowenlan-amzn <[email protected]>

* accommodate geo distance agg

Signed-off-by: bowenlan-amzn <[email protected]>

* Fix test

support all numeric types
minus one on the upper range

Signed-off-by: bowenlan-amzn <[email protected]>

* [Refactor] range is lower inclusive, right exclusive

Signed-off-by: bowenlan-amzn <[email protected]>

* adding test

Signed-off-by: bowenlan-amzn <[email protected]>

* Adding test and refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* add test

Signed-off-by: bowenlan-amzn <[email protected]>

* add test and update the compare logic in tree traversal

Signed-off-by: bowenlan-amzn <[email protected]>

* fix test, add random test

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor to address comments

Signed-off-by: bowenlan-amzn <[email protected]>

* small potential performance update

Signed-off-by: bowenlan-amzn <[email protected]>

* fix precommit

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* set refresh_interval to -1

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* Fix test

To understand fully about the double and bigdecimal usage in scaled float field will take more time.

Signed-off-by: bowenlan-amzn <[email protected]>

---------

Signed-off-by: bowenlan-amzn <[email protected]>
kkewwei pushed a commit to kkewwei/OpenSearch that referenced this pull request Jul 24, 2024
…pensearch-project#13865) (opensearch-project#14463)

* Refactor the ranges representation

* Refactor try fast filter

* Main work finished; left the handling of different numeric data types

* buildRanges accepts field type

* first working draft probably

* add change log

* accommodate geo distance agg

* Fix test

support all numeric types
minus one on the upper range

* [Refactor] range is lower inclusive, right exclusive

* adding test

* Adding test and refactor

* refactor

* add test

* add test and update the compare logic in tree traversal

* fix test, add random test

* refactor to address comments

* small potential performance update

* fix precommit

* refactor

* refactor

* set refresh_interval to -1

* address comment

* address comment

* address comment

* Fix test

To understand fully about the double and bigdecimal usage in scaled float field will take more time.

---------

(cherry picked from commit 57fb50b)

Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: kkewwei <[email protected]>
wdongyu pushed a commit to wdongyu/OpenSearch that referenced this pull request Aug 22, 2024
…pensearch-project#13865)

* Refactor the ranges representation

Signed-off-by: bowenlan-amzn <[email protected]>

* Refactor try fast filter

Signed-off-by: bowenlan-amzn <[email protected]>

* Main work finished; left the handling of different numeric data types

Signed-off-by: bowenlan-amzn <[email protected]>

* buildRanges accepts field type

Signed-off-by: bowenlan-amzn <[email protected]>

* first working draft probably

Signed-off-by: bowenlan-amzn <[email protected]>

* add change log

Signed-off-by: bowenlan-amzn <[email protected]>

* accommodate geo distance agg

Signed-off-by: bowenlan-amzn <[email protected]>

* Fix test

support all numeric types
minus one on the upper range

Signed-off-by: bowenlan-amzn <[email protected]>

* [Refactor] range is lower inclusive, right exclusive

Signed-off-by: bowenlan-amzn <[email protected]>

* adding test

Signed-off-by: bowenlan-amzn <[email protected]>

* Adding test and refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* add test

Signed-off-by: bowenlan-amzn <[email protected]>

* add test and update the compare logic in tree traversal

Signed-off-by: bowenlan-amzn <[email protected]>

* fix test, add random test

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor to address comments

Signed-off-by: bowenlan-amzn <[email protected]>

* small potential performance update

Signed-off-by: bowenlan-amzn <[email protected]>

* fix precommit

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* refactor

Signed-off-by: bowenlan-amzn <[email protected]>

* set refresh_interval to -1

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* address comment

Signed-off-by: bowenlan-amzn <[email protected]>

* Fix test

To understand fully about the double and bigdecimal usage in scaled float field will take more time.

Signed-off-by: bowenlan-amzn <[email protected]>

---------

Signed-off-by: bowenlan-amzn <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch Search:Aggregations v2.16.0 Issues and PRs related to version 2.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apply the fast filter optimization to range aggregation
5 participants