[Date Histogram] Apply the optimization to Composite aggregation with source as DateHistogram #11301

Closed
bowenlan-amzn opened this issue Nov 22, 2023 · 0 comments · Fixed by #11505
Labels
Search:Aggregations Search:Performance v2.12.0 Issues and PRs related to version 2.12.0

bowenlan-amzn commented Nov 22, 2023

This issue tracks the investigation into how to apply the fast filter optimization proposed in #9310 to composite aggregation (comp-agg).

The targeted scenario is a composite aggregation with exactly one source, and that source is a date histogram.
Note that composite aggregation supports pagination. The default page size is 10, which can be changed with the size param, and the next page is requested by passing the after key.
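
As a rough illustration of the targeted request shape, here is a minimal sketch that builds the search body for a single date histogram source and pages with the after key. The helper name `composite_request`, the aggregation name `my_buckets`, the source name `date`, and the field name `timestamp` are all hypothetical; only the `composite`, `sources`, `size`, `after`, and `date_histogram` keys come from the query DSL.

```python
def composite_request(field, interval, size=10, after=None):
    """Build a search body with one composite source: a date histogram.

    Hypothetical helper for illustration; names other than the DSL keys
    are assumptions.
    """
    composite = {
        "size": size,  # number of buckets per page (default 10)
        "sources": [
            {"date": {"date_histogram": {"field": field,
                                         "fixed_interval": interval}}}
        ],
    }
    if after is not None:
        # Continue from the last bucket key of the previous page.
        composite["after"] = {"date": after}
    return {"size": 0, "aggs": {"my_buckets": {"composite": composite}}}


# First page, then the next page using the after key returned by the first.
first_page = composite_request("timestamp", "1d")
next_page = composite_request("timestamp", "1d", after=1700611200000)
```

The `after` value here is an arbitrary epoch-millisecond key chosen for the example; in practice it would be copied from the `after_key` of the previous response.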

How does composite aggregation work?

  • Optimization using the leading source (the first source in the comp-agg)
    The leading source is backed by a sorted data structure (Points or TermsEnum), so we can terminate early once enough buckets have been processed.
  • Optimization using index sorting
    If the sources of the comp-agg match the index sort and an afterKey is provided, we can use SearchAfterSortedDocQuery to produce an iterator over the documents from which the aggregation should continue.
  • Normal case
    We collect every source's value for every document and add the resulting key to the composite aggregation queue if it is competitive. Without an existing sorted structure there is little to optimize: we can only collect documents one by one and try to push each key into a priority queue (heap).
  • Deferring collection for sub-aggregations
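
To make the difference between the normal case and the leading-source optimization concrete, here is a simplified sketch for a single date histogram source. It is not the OpenSearch implementation: the function names are hypothetical, documents are reduced to bare integer timestamps, and a sorted Python list stands in for the sorted structure of the leading source.

```python
import heapq
from collections import Counter


def round_down(ts, interval):
    """Map a timestamp to the start of its histogram bucket."""
    return ts - ts % interval


def collect_with_heap(timestamps, interval, size, after=None):
    """Normal case: visit every document, keep the `size` smallest keys."""
    counts = Counter()
    heap = []  # max-heap of competitive bucket keys (stored negated)
    for ts in timestamps:
        key = round_down(ts, interval)
        if after is not None and key <= after:
            continue  # bucket belongs to an earlier page
        if key in counts:
            counts[key] += 1
        elif len(heap) < size:
            heapq.heappush(heap, -key)
            counts[key] = 1
        elif key < -heap[0]:
            # New key beats the current largest key: evict that bucket.
            evicted = -heapq.heapreplace(heap, -key)
            del counts[evicted]
            counts[key] = 1
    return sorted(counts.items())


def collect_sorted(sorted_timestamps, interval, size, after=None):
    """Leading-source case: values arrive sorted, so stop after `size` buckets."""
    counts = {}
    for ts in sorted_timestamps:
        key = round_down(ts, interval)
        if after is not None and key <= after:
            continue
        if key not in counts and len(counts) == size:
            break  # early termination: later keys cannot be competitive
        counts[key] = counts.get(key, 0) + 1
    return list(counts.items())


docs = [5, 12, 31, 7, 25, 18, 33, 2, 41]
page1 = collect_with_heap(docs, interval=10, size=2)
page2 = collect_sorted(sorted(docs), interval=10, size=2, after=page1[-1][0])
print(page1)  # [(0, 3), (10, 2)]
print(page2)  # [(20, 1), (30, 2)]
```

Both functions return the same page for the same input. The heap version has to look at every document, while the sorted version stops as soon as it sees the first key past the requested page.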