Do not generate empty buckets for the date histogram #89070
If the date range covered by the date histogram is large and the 'fixed_interval' parameter is very small, the resulting histogram can contain a very large number of buckets when empty buckets are also generated: roughly (max date - min date) / fixed_interval, which can exceed the limit of 65536. Here we set minDocCount to 1 to avoid generating empty buckets. In the test the maximum value for 'docCount' is 9000, which means that in the worst case we generate 9000 documents, each belonging to a different bucket. We would then have at most 9000 buckets, which is well below the default maximum number of buckets (65536).
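The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration, not the Elasticsearch implementation: the helper names `bucket_count_with_empty` and `bucket_count_non_empty` are hypothetical, the dates and the one-minute interval are made-up inputs, and bucket boundaries are measured from the earliest date for simplicity, which may differ from how the real aggregation rounds its bucket keys.

```python
from datetime import datetime, timedelta

# Default bucket limit quoted in this PR.
MAX_BUCKETS = 65536


def bucket_count_with_empty(min_date, max_date, fixed_interval):
    """Approximate bucket count when empty buckets are generated (minDocCount = 0)."""
    # timedelta // timedelta yields an int; +1 covers the bucket holding max_date.
    return (max_date - min_date) // fixed_interval + 1


def bucket_count_non_empty(doc_dates, min_date, fixed_interval):
    """Bucket count when only buckets with at least one document are kept (minDocCount = 1)."""
    # Each document maps to one bucket index; distinct indices are the non-empty buckets.
    return len({(d - min_date) // fixed_interval for d in doc_dates})


# Hypothetical worst case: 9000 documents, one per day, with a one-minute interval.
start = datetime(2000, 1, 1)
docs = [start + timedelta(days=i) for i in range(9000)]
interval = timedelta(minutes=1)

with_empty = bucket_count_with_empty(min(docs), max(docs), interval)
non_empty = bucket_count_non_empty(docs, min(docs), interval)

print(with_empty > MAX_BUCKETS)  # empty buckets push the count past the limit
print(non_empty <= len(docs))    # non-empty buckets never exceed the document count
```

Under these assumptions the number of non-empty buckets is bounded by the number of documents, whatever the date range and interval, which is why capping 'docCount' at 9000 keeps the test below the limit.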
Pinging @elastic/es-analytics-geo (Team:Analytics)
Hi @salvatore-campagna, I've created a changelog YAML for you.
I tested this by running the test locally "until failure". After more than 1000 executions I don't see any failure. Before the patch I could see the failure fairly quickly, after a few tens of executions (depending on random values).
@elasticsearchmachine update branch
@elasticsearchmachine update branch
LGTM!
I left a comment about labelling this PR. You do not need another review after fixing this.
@elasticsearchmachine test this please
@elasticsearchmachine update branch
@elasticsearchmachine run elasticsearch-ci/part-2
If the date range covered by the date histogram is large and the
'fixed_interval' parameter is very small, we might end up with a
large number of buckets in the resulting histogram when empty
buckets are also generated: roughly
(max date - min date) / fixed_interval, which can exceed 65536.
Here we set minDocCount to 1 to avoid generating empty buckets.
In the test the maximum value for 'docCount' is 9000, which means
that in the worst case we generate 9000 documents, each belonging
to a different bucket. We would then have at most 9000 buckets,
which is well below the maximum number of buckets allowed by
default (65536).
Resolves #88800.