[Feature] [META] Add Efficient filtering support for Faiss Engine #903
Comments
Hi, is this issue still open for contribution?
@TrungBui59 all the tasks have been completed for this issue. I am waiting for the OpenSearch 2.10 release to happen; once that is done, this issue will be closed. But there are several other issues related to this one that you can contribute to. You can also view issues with the tags good first issue and help wanted.
@navneet1v, does it work in a similar way for the neural query as well? Or is it just for the "kNN" query?
Yes, this works the same for the neural query clause too.
Thank you @navneet1v
@savanbthakkar I see that in your query you have not provided
To understand more, can you provide the details below:
This can help me understand what is happening in the backend.
Is there a limit (or a practical limitation) on the maximum number of IDs that can be extracted for a filter from Lucene and passed to Faiss, on both the Lucene side and the Faiss side?
@igniting there is no such limit on either engine.
@igniting are you facing any issues related to the number of IDs that are getting extracted for filters?
No, I haven't tested it yet, but we have scenarios where the matched filter set can result in millions of IDs.
There is no limit, but the query will be slower with more filtered IDs. We don't have benchmark data relating request latency to the number of filtered IDs.
Thanks for such a clear write-up in the PR description!
What determines which segments get loaded? Is it all segments? Presumably for pre-filtering there could be an optimisation to not load the vector segments that are filtered out, but for efficient filtering, presumably all of them need to be loaded into memory, since by definition we can't know ahead of time which ones we might need (without first having performed a k-NN search)? If there are any docs on this that you can point me to, I'd really appreciate it.
Pre-filtering and efficient filtering are the same. Every segment needs to be loaded into memory for k-NN search, even with post-filtering, because filtering happens at the document level, not the segment level.
Introduction
This issue provides a high-level overview of Efficient Filtering support for k-NN native engines. Due to the limited filtering support in the native engines, we will be enabling efficient filtering in Faiss only.
Background
The OpenSearch k-NN plugin supports 3 different types of engines to perform Approximate Nearest Neighbor search. An engine is just an abstraction provided by the plugin over the downstream library used to do the nearest neighbor search. Currently we have Lucene (Java implementation), Faiss (C++ implementation), and Nmslib (C++ implementation) as the 3 different engines.
Every engine supports various algorithms to do the search. At a high level we support:
For more details you can read this documentation: https://opensearch.org/docs/latest/search-plugins/knn/knn-index/
K-NN Architecture For Native Engines(Indexing and Search)
At a very high level, an OpenSearch index's data is stored in shards, and shards are nothing but Lucene indexes. Each shard contains segments, and these segments are immutable once they are created. For indices that have k-NN fields, the architecture is the same; the k-NN plugin uses this same architecture to support ANN search. During segment creation, apart from creating all the different data structures needed for the other fields (like FSTs, BKD trees, DocValues, etc.), for a k-NN field using the HNSW algorithm the k-NN plugin creates an HNSW graph file per k-NN vector field. These files are written down as segment files.
While performing the ANN search, we load these HNSW files from disk into memory (native memory, not JVM heap) if they are not present already, and then perform the search using the respective libraries.
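The native memory consumed by these loaded graph files can be observed through the k-NN plugin's stats API. The request below is a sketch following the OpenSearch Dev Tools convention; the exact stat names reported can vary by version:

```
GET /_plugins/_knn/stats
```

Among the returned node-level stats, the graph memory usage figures reflect the HNSW files currently loaded into native memory.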
Filtering
As of OpenSearch 2.8, the k-NN plugin supports various types of filters (ref doc). But the type of filtering that we are proposing as part of this document is only present in the Lucene engine; neither of the native engines (Faiss and Nmslib) supports it.
What is filtering and why is it important?
Filtering is a way that lets users restrict their search to a certain part of their data. In terms of vector search, the goal is to return the nearest neighbors for a query (containing a query vector and a filter) among only the data points that satisfy the filter specified with the query. Let's try to understand this with an example specific to vector search:
Let's say we have an index storing a product catalog, where images are represented as vectors, and in the same index we store the rating, the date when the product was uploaded, total reviews, etc. A customer of that application is trying to search for similar products (providing one as a vector) but wants only products that have Rating >= 4.
So, to honor such queries we need filtering along with Vector Search.
What is Efficient Filtering?
In the filtering world, there are basically 2 types of filtering (pre-filtering and post-filtering); let's try to understand them in terms of vector search:
Efficient Filtering: Efficient filtering is an improvement over pre-filtering, where the main idea is to apply the filter while the ANN search is running rather than as a separate phase before or after it.
Note: The ideas discussed in this section around efficient filtering are not the only things we are considering. Detailed explanations specific to the OpenSearch k-NN plugin are provided in the doc.
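The difference between pre-filtering and post-filtering can be made concrete with a toy sketch. The code below is not OpenSearch or Faiss code; it is a minimal exact k-NN over an in-memory list, showing why post-filtering can return fewer than k results (here, zero) while pre-filtering always searches within the filtered set:

```python
import math

def l2(a, b):
    # Euclidean distance between two vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, docs, k):
    # docs: list of (doc_id, vector); exact top-k by distance.
    return sorted(docs, key=lambda d: l2(query, d[1]))[:k]

docs = [
    (1, [0.0, 0.0]), (2, [1.0, 0.0]),
    (3, [0.0, 1.0]), (4, [5.0, 5.0]),
]
allowed = {3, 4}          # doc ids that satisfy the filter
query = [0.0, 0.0]
k = 2

# Pre-filtering: restrict the candidate set first, then search.
pre = knn(query, [d for d in docs if d[0] in allowed], k)

# Post-filtering: search everything first, then drop non-matching
# results; the final result set can end up smaller than k.
post = [d for d in knn(query, docs, k) if d[0] in allowed]
```

Here the 2 nearest neighbors overall are docs 1 and 2, neither of which passes the filter, so post-filtering returns nothing while pre-filtering still returns 2 matching hits.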
Why do we need filtering in native engines?
There are a couple of limitations with Lucene because of which we want to add filtering in the native engines:
Requirement
Implement efficient filtering for the native engines (along with their different algorithms) of the OpenSearch k-NN plugin.
Out Of Scope
Success Criteria
High Level Flow
Below is a simplified flow/logic of how filtered search will work.
Filtered Search Flow: A user creates a k-NN query with a filter and calls the OpenSearch _search API. The query follows the standard query route at the coordinator node and is routed to the different shards (which can be present on the same or different data nodes). On the data node, after validation checks (like whether the filter is supported by this k-NN engine), the query is rewritten into the Lucene-based query interface for every shard. On every shard this query gets executed via an IndexSearcher. The IndexSearcher runs the k-NN query, where each k-NN query builds a filter query (a Lucene constant-score query) and runs it to return the weight of the query. This filter weight is passed to KNNWeight, where:
Once the results are returned, the standard OpenSearch query flow happens: a list of doc IDs of length "size" is sent from each shard to the coordinator, which then compiles the top doc IDs (of length == size parameter) and runs the fetch phase to get more information on these doc IDs.
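The coordinator's merge step above can be sketched in a few lines. This is a toy illustration, not plugin code; it assumes each shard returns its local top "size" hits as (doc_id, score) pairs, with higher scores being better matches:

```python
import heapq

def merge_shard_results(shard_results, size):
    # Flatten the per-shard top hits and keep the global top "size"
    # by score, mirroring the coordinator's reduce phase.
    all_hits = [hit for shard in shard_results for hit in shard]
    return heapq.nlargest(size, all_hits, key=lambda h: h[1])

shard_a = [("a1", 0.9), ("a2", 0.5)]
shard_b = [("b1", 0.8), ("b2", 0.7)]
top = merge_shard_results([shard_a, shard_b], 2)
```

The fetch phase would then retrieve the full documents only for the IDs surviving this merge.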
Example
Create Index
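A sketch of what index creation could look like for the product-catalog example above. The index name, field names, and dimension are illustrative; the mapping shape follows the knn_vector field type from the OpenSearch k-NN docs:

```json
PUT /products
{
  "settings": { "index.knn": true },
  "mappings": {
    "properties": {
      "product_vector": {
        "type": "knn_vector",
        "dimension": 3,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2"
        }
      },
      "rating": { "type": "integer" }
    }
  }
}
```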
Ingest Documents
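An illustrative bulk-ingest request for the sample index (document IDs, vectors, and ratings are made up for the example):

```json
POST /_bulk
{ "index": { "_index": "products", "_id": "1" } }
{ "product_vector": [1.0, 2.0, 3.0], "rating": 5 }
{ "index": { "_index": "products", "_id": "2" } }
{ "product_vector": [4.0, 5.0, 6.0], "rating": 2 }
```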
Query With Filters
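A sketch of a filtered k-NN query for the Rating >= 4 scenario described earlier. The filter clause inside the knn query follows the shape documented for efficient filtering in the OpenSearch k-NN docs; the vector and sizes are illustrative:

```json
POST /products/_search
{
  "size": 2,
  "query": {
    "knn": {
      "product_vector": {
        "vector": [2.0, 3.0, 4.0],
        "k": 2,
        "filter": {
          "range": { "rating": { "gte": 4 } }
        }
      }
    }
  }
}
```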
Query Without Filters
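The same illustrative query with the filter clause removed, returning the nearest neighbors across all documents:

```json
POST /products/_search
{
  "size": 2,
  "query": {
    "knn": {
      "product_vector": {
        "vector": [2.0, 3.0, 4.0],
        "k": 2
      }
    }
  }
}
```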
Task: