Speed up collecting zero document string terms #110922

Merged (8 commits) on Jul 18, 2024

@iverase (Contributor) commented Jul 16, 2024

I think in most cases we have ordinals, so let's use the ordinals to collect the terms. This can be much faster, especially for low-cardinality fields. In addition, this change optimizes for single-valued BinaryDocValues.

This change is inspired by this Discuss forum thread: https://discuss.elastic.co/t/elasticsearch-slow-query-with-min-doc-count-0-on-field-aggregation
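To illustrate why ordinals help here, below is a minimal sketch in Python. It is not the Elasticsearch implementation: `Segment`, `zero_buckets_per_doc`, and `zero_buckets_per_ordinal` are hypothetical names, and the toy data model (a sorted per-segment term dictionary plus per-document ordinals) is an assumption about how ordinal-backed doc values behave. It compares visiting every document value with walking each segment's term dictionary once when collecting the terms that need a zero-count bucket.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Segment:
    """Toy stand-in for one index segment of a keyword field (hypothetical)."""
    terms: List[str]           # sorted unique terms; list index = ordinal
    doc_ords: List[List[int]]  # for each document, the ordinals of its values


def zero_buckets_per_doc(
    segments: List[Segment], counted: Dict[str, int]
) -> Tuple[List[str], int]:
    """Visit every value of every document and resolve it to a term."""
    seen = set()
    lookups = 0
    for seg in segments:
        for ords in seg.doc_ords:
            for ord_ in ords:
                seen.add(seg.terms[ord_])  # one term lookup per doc value
                lookups += 1
    # Terms that already have a bucket do not need a zero-count one.
    return sorted(seen.difference(counted)), lookups


def zero_buckets_per_ordinal(
    segments: List[Segment], counted: Dict[str, int]
) -> Tuple[List[str], int]:
    """Walk each segment's term dictionary once, ignoring documents."""
    seen = set()
    lookups = 0
    for seg in segments:
        for ord_ in range(len(seg.terms)):
            seen.add(seg.terms[ord_])  # one term lookup per unique term
            lookups += 1
    return sorted(seen.difference(counted)), lookups


if __name__ == "__main__":
    segments = [
        Segment(terms=["a", "b"], doc_ords=[[0], [0], [1], [0]]),
        Segment(terms=["b", "c"], doc_ords=[[1], [1], [0]]),
    ]
    counted = {"a": 3}  # buckets the query already produced
    print(zero_buckets_per_doc(segments, counted))
    print(zero_buckets_per_ordinal(segments, counted))
```

Both functions return the same zero-count terms, but the ordinal walk performs one lookup per unique term in each segment rather than one per document value, which is where the gain for low-cardinality fields would come from. The sketch ignores deleted documents and filtering, which a real implementation has to account for.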

@elasticsearchmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) on Jul 16, 2024

@elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine (Collaborator) commented

Hi @iverase, I've created a changelog YAML for you.

@iverase merged commit 50a7a08 into elastic:main on Jul 18, 2024
15 checks passed
@iverase deleted the zeroDocStringTerms branch on July 18, 2024 14:12
iverase added a commit to iverase/elasticsearch that referenced this pull request Jul 18, 2024
Use segment ordinals when possible to collect zero document buckets
elasticsearchmachine pushed a commit that referenced this pull request Jul 18, 2024
Use segment ordinals when possible to collect zero document buckets
ioanatia pushed a commit to ioanatia/elasticsearch that referenced this pull request Jul 22, 2024
Use segment ordinals when possible to collect zero document buckets
salvatore-campagna pushed a commit to salvatore-campagna/elasticsearch that referenced this pull request Jul 23, 2024
Use segment ordinals when possible to collect zero document buckets
salvatore-campagna pushed a commit to salvatore-campagna/elasticsearch that referenced this pull request Jul 23, 2024
Use segment ordinals when possible to collect zero document buckets
Labels: :Analytics/Aggregations, >enhancement, Team:Analytics, v8.15.1, v8.16.0
3 participants