Shortcut counts on exists queries #37475
Comments
Pinging @elastic/es-search
@jpountz While working on this and adding tests to `QueryPhaseTests`, I discovered the following section in Lucene's `IndexSearcher#count` method: https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L339
Unfortunately this optimization is unsafe for Lucene because Lucene doesn't enforce that you index the same data in an indexed field and a doc-value field that share the same name. Elasticsearch does enforce this, however: if a field is both indexed and doc-valued, then the indexed documents are exactly the same as the doc-valued documents.
Thanks, that's unfortunate. Do we wrap or override Lucene's `IndexSearcher` in ES in a consistent way somewhere, so that we could add this optimization only on our side?
@cbuescher We never use `IndexSearcher#count` to my knowledge (maybe we should make it a forbidden API in a separate change?). Counting only occurs in
`TopDocsCollectorContext` can already shortcut hit counts on `match_all` and `term` queries when there are no deletions. This change adds this ability for `exists` queries if the index doesn't have deletions and fields are indexed. Closes #37475
`TopDocsCollectorContext` is already able to shortcut hit counts on `match_all` and `term` queries when there are no deletions. It would be nice to also shortcut `exists` queries, especially because running a `match_all` query on an index that has nested documents runs a `DocValuesFieldExistsQuery` under the hood on the `_seq_no` field.

This would only work on indices without deletions and fields that are indexed: fields indexed with terms can use `Terms#getDocCount` and fields indexed with points can use `PointValues#getDocCount`.
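
To make the proposed rule concrete, here is a minimal sketch of the decision logic in plain Java. It does not use Lucene or Elasticsearch classes: `SegmentStats` and `shortcutExistsCount` are hypothetical names introduced for illustration, and the two integer fields only stand in for the values that `Terms#getDocCount` and `PointValues#getDocCount` would report per segment. The sketch assumes that a value of `-1` means the structure is unavailable for the field in that segment, and it conservatively gives up in that case.

```java
import java.util.List;
import java.util.OptionalLong;

/** Illustrative sketch only; none of these types are Lucene or Elasticsearch APIs. */
final class ExistsCountShortcut {

    /** Hypothetical per-segment statistics for one field. */
    static final class SegmentStats {
        final boolean hasDeletions;
        final int termsDocCount;   // stand-in for Terms#getDocCount, -1 if no terms
        final int pointsDocCount;  // stand-in for PointValues#getDocCount, -1 if no points

        SegmentStats(boolean hasDeletions, int termsDocCount, int pointsDocCount) {
            this.hasDeletions = hasDeletions;
            this.termsDocCount = termsDocCount;
            this.pointsDocCount = pointsDocCount;
        }
    }

    /**
     * Returns the number of documents that have the field, or an empty result
     * when the shortcut does not apply and the query has to be executed.
     */
    static OptionalLong shortcutExistsCount(List<SegmentStats> segments) {
        long total = 0;
        for (SegmentStats segment : segments) {
            if (segment.hasDeletions) {
                // Index statistics still include deleted documents, so they overcount.
                return OptionalLong.empty();
            }
            if (segment.termsDocCount >= 0) {
                total += segment.termsDocCount;
            } else if (segment.pointsDocCount >= 0) {
                total += segment.pointsDocCount;
            } else {
                // Field is not indexed here (for example doc values only): no shortcut.
                return OptionalLong.empty();
            }
        }
        return OptionalLong.of(total);
    }
}
```

For two segments without deletions, one reporting 3 documents through terms and one reporting 4 through points, the sketch returns 7. As soon as any segment has deletions or lacks both structures, it returns an empty result and the caller would fall back to collecting hits normally.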