[8.8] Add some docs explaining filter performance and behavior for HNSW (#110108) #110139

Merged
merged 1 commit into from
Jun 25, 2024
18 changes: 18 additions & 0 deletions docs/reference/search/search-your-data/knn-search.asciidoc
@@ -284,6 +284,24 @@ post-filtering approach, where the filter is applied **after** the approximate
kNN search completes. Post-filtering has the downside that it sometimes
returns fewer than k results, even when there are enough matching documents.

[discrete]
[[approximate-knn-search-and-filtering]]
==== Approximate kNN search and filtering

Unlike conventional query filtering, where more restrictive filters typically lead to faster queries,
applying filters in an approximate kNN search with an HNSW index can decrease performance.
This is because searching the HNSW graph requires additional exploration to find `num_candidates`
candidates that meet the filter criteria.

To avoid significant performance drawbacks, Lucene implements the following strategies per segment:

* If the filtered document count is less than or equal to `num_candidates`, the search bypasses the HNSW graph and
uses a brute-force search on the filtered documents.

* While exploring the HNSW graph, if the number of nodes explored exceeds the number of documents that satisfy the filter,
the search stops exploring the graph and switches to a brute-force search over the filtered documents.
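
The two strategies above can be sketched as a per-segment decision. The following Python is a simplified illustration, not Lucene's implementation: `search_segment`, `brute_force`, and the `graph_search` callable are hypothetical names. The sketch assumes that `graph_search` stands in for the HNSW traversal and returns `None` once it has visited more nodes than the budget it is given.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

Vector = Sequence[float]
# Hypothetical stand-in for an HNSW traversal:
# (query, allowed_ids, num_candidates, visit_budget) -> ranked ids,
# or None if more than visit_budget nodes were explored.
GraphSearch = Callable[[Vector, Set[int], int, int], Optional[List[int]]]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force(query: Vector, vectors: Dict[int, Vector],
                allowed: Set[int], k: int) -> List[int]:
    """Score every filtered document exactly and keep the k nearest."""
    scored = sorted((squared_distance(query, vectors[doc]), doc) for doc in allowed)
    return [doc for _, doc in scored[:k]]


def search_segment(query: Vector, vectors: Dict[int, Vector], allowed: Set[int],
                   k: int, num_candidates: int,
                   graph_search: GraphSearch) -> Tuple[List[int], str]:
    """Return the top-k ids for one segment and the path that produced them."""
    # Strategy 1: few enough filtered documents, so skip the graph entirely.
    if len(allowed) <= num_candidates:
        return brute_force(query, vectors, allowed, k), "brute_force"

    # Strategy 2: explore the graph, but give up once the number of visited
    # nodes exceeds the number of documents that satisfy the filter.
    ranked = graph_search(query, allowed, num_candidates, len(allowed))
    if ranked is None:
        return brute_force(query, vectors, allowed, k), "brute_force_fallback"
    return ranked[:k], "graph"
```

In this sketch the visit budget equals the filtered document count. Past that point, scoring the filtered documents directly costs no more than continuing the traversal.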


[discrete]
==== Combine approximate kNN with other features
