KNNIterators should support with and without filters #2155
Signed-off-by: Vijayan Balasubramanian <[email protected]>
    public ByteVectorIdsKNNIterator(
        final BitSet filterIdsBitSet,
Should we make filterIdsBitSet an Optional, or at least annotate it as @Nullable to signal that this parameter can be null?
Ack
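For illustration, here is a minimal sketch of the two ways to signal a nullable filter that the comment above mentions. It is not the code from this PR: `FilterAwareIteratorSketch` and its methods are hypothetical names, and `java.util.BitSet` stands in for the Lucene `BitSet` used by the real iterators.

```java
import java.util.BitSet;
import java.util.Optional;

// Hypothetical stand-in class; the real iterators take Lucene types.
final class FilterAwareIteratorSketch {
    // A null filter means "no filter": every document is a candidate.
    private final BitSet filterIds;

    // Filtered variant. The parameter may be null, which is why the review
    // asks for a @Nullable annotation or an Optional to make that explicit.
    FilterAwareIteratorSketch(final BitSet filterIds) {
        this.filterIds = filterIds;
    }

    // Unfiltered variant: callers never have to pass null themselves.
    FilterAwareIteratorSketch() {
        this(null);
    }

    // Exposes the filter as an Optional so callers cannot forget the null case.
    Optional<BitSet> filter() {
        return Optional.ofNullable(filterIds);
    }

    // True when the document passes the filter, or when there is no filter.
    boolean isCandidate(final int docId) {
        return filterIds == null || filterIds.get(docId);
    }
}
```

A dedicated no-filter constructor keeps `null` out of call sites, while the `Optional` accessor documents the nullable field for readers of the class.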
Update VectorIterator and NestedVectorIterator to iterate even if no filter is provided to the iterator. Currently these iterators are used by exact search to score either the top-k docs or all docs when a filter is provided by users. However, in the future we will allow exact search even when there are no filters. Hence, decouple the filter and make it optional to support both cases. Signed-off-by: Vijayan Balasubramanian <[email protected]>
Signed-off-by: Vijayan Balasubramanian <[email protected]>
Thanks for the refactoring.
src/main/java/org/opensearch/knn/index/query/iterators/VectorIdsKNNIterator.java
7b53e0c to 71dba07
        this.docId = getNextDocId();
    }

    ByteVectorIdsKNNIterator(final byte[] queryVector, final KNNBinaryVectorValues binaryVectorValues, final SpaceType spaceType)
- ByteVectorIdsKNNIterator(final byte[] queryVector, final KNNBinaryVectorValues binaryVectorValues, final SpaceType spaceType)
+ public ByteVectorIdsKNNIterator(final byte[] queryVector, final KNNBinaryVectorValues binaryVectorValues, final SpaceType spaceType)
        super(filterIdsArray, queryVector, binaryVectorValues, spaceType);
        this.parentBitSet = parentBitSet;
    }

    NestedByteVectorIdsKNNIterator(
- NestedByteVectorIdsKNNIterator(
+ public NestedByteVectorIdsKNNIterator(
      private final BitSet parentBitSet;

-     NestedFilteredIdsKNNIterator(
-         final BitSet filterIdsArray,
+     NestedVectorIdsKNNIterator(
- NestedVectorIdsKNNIterator(
+ public NestedVectorIdsKNNIterator(
          this(filterIdsArray, queryVector, knnFloatVectorValues, spaceType, parentBitSet, null, null);
      }

-     public NestedFilteredIdsKNNIterator(
+     NestedVectorIdsKNNIterator(
- NestedVectorIdsKNNIterator(
+ public NestedVectorIdsKNNIterator(
src/main/java/org/opensearch/knn/index/query/iterators/VectorIdsKNNIterator.java
@@ -55,10 +58,13 @@ public int nextDoc() throws IOException {
          if (docId == DocIdSetIterator.NO_MORE_DOCS) {
              return DocIdSetIterator.NO_MORE_DOCS;
          }
-         int doc = binaryVectorValues.advance(docId);
+         if (bitSetIterator != null) {
+             binaryVectorValues.advance(docId);
Can we move this line inside `getNextDocId()`? See line #84 for example.
    protected int getNextDocId() throws IOException {
        if (bitSetIterator != null) {
            return this.bitSetIterator.nextDoc();
Let's remove `this` for consistency. `this` is mostly needed only inside a constructor to distinguish the field from the parameter.
- return this.bitSetIterator.nextDoc();
+ int docId = bitSetIterator.nextDoc();
+ binaryVectorValues.advance(docId);
+ return docId;
@heemin32 The suggestion is missing the NO_DOCUMENTS check. Please see my response here: #2155 (comment)
I am good with having NO_DOCUMENTS check here if needed.
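To make the point of this thread concrete, here is a hedged sketch of a `getNextDocId()` that advances the vector values inside the method and guards the call with an end-of-iteration check. Everything here is a simplified stand-in rather than the merged code: `DocIds`, `ArrayDocIds`, and `KnnIteratorSketch` are hypothetical, the `NO_MORE_DOCS` constant mirrors Lucene's `DocIdSetIterator.NO_MORE_DOCS`, and the checked `IOException` of the Lucene methods is dropped for brevity. The sketch also assumes every filtered doc has a vector.

```java
// Trimmed-down stand-in for Lucene's DocIdSetIterator contract (hypothetical).
interface DocIds {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    int nextDoc();

    int advance(int target);
}

// Iterates over a sorted array of doc ids.
final class ArrayDocIds implements DocIds {
    private final int[] docs;
    private int pos = -1;

    ArrayDocIds(final int... docs) {
        this.docs = docs;
    }

    private int current() {
        if (pos < 0) {
            return -1;
        }
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    @Override
    public int nextDoc() {
        if (pos < docs.length) {
            pos++;
        }
        return current();
    }

    @Override
    public int advance(final int target) {
        // Move forward until the current doc is at or beyond the target.
        int doc = current();
        while (doc < target) {
            doc = nextDoc();
        }
        return doc;
    }
}

final class KnnIteratorSketch {
    private final DocIds bitSetIterator; // null when no filter is provided
    private final DocIds vectorValues;
    private int docId;

    KnnIteratorSketch(final DocIds bitSetIterator, final DocIds vectorValues) {
        this.bitSetIterator = bitSetIterator;
        this.vectorValues = vectorValues;
        this.docId = getNextDocId();
    }

    // Returns the current doc id and moves on to the next candidate.
    int nextDoc() {
        if (docId == DocIds.NO_MORE_DOCS) {
            return DocIds.NO_MORE_DOCS;
        }
        int currentDocId = docId;
        docId = getNextDocId();
        return currentDocId;
    }

    private int getNextDocId() {
        if (bitSetIterator == null) {
            // No filter: walk every doc that has a vector.
            return vectorValues.nextDoc();
        }
        int nextDocId = bitSetIterator.nextDoc();
        // Only position the vector values when the filter still has docs left.
        if (nextDocId != DocIds.NO_MORE_DOCS) {
            vectorValues.advance(nextDocId);
        }
        return nextDocId;
    }
}
```

With a filter over docs 1 and 3 and vectors for docs 0 through 4, the sketch yields 1 and 3 before ending; with a null filter it yields all five docs.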
Signed-off-by: Vijayan Balasubramanian <[email protected]>
71dba07 to a9ab6cd
Signed-off-by: Vijayan Balasubramanian <[email protected]>
a9ab6cd to 71adc4f
Signed-off-by: Vijayan Balasubramanian <[email protected]>
* Rename classes to represent both filter and non-filter use cases
* Iterator should support with and without filters

Update VectorIterator and NestedVectorIterator to iterate even if no filter is provided to the iterator. Currently these iterators are used by exact search to score either the top-k docs or all docs when a filter is provided by users. However, in the future we will allow exact search even when there are no filters. Hence, decouple the filter and make it optional to support both cases.

Signed-off-by: Vijayan Balasubramanian <[email protected]> (cherry picked from commit 6f6dd56)
* Rename classes to represent both filter and non-filter use cases
* Iterator should support with and without filters

Update VectorIterator and NestedVectorIterator to iterate even if no filter is provided to the iterator. Currently these iterators are used by exact search to score either the top-k docs or all docs when a filter is provided by users. However, in the future we will allow exact search even when there are no filters. Hence, decouple the filter and make it optional to support both cases.

Signed-off-by: Vijayan Balasubramanian <[email protected]> (cherry picked from commit 6f6dd56) Co-authored-by: Vijayan Balasubramanian <[email protected]>
Description
Update VectorIterator, ByteIterator, NestedVectorIterator, and NestedByteIterator to iterate even if no filter is provided.
Currently these iterators are used by exact search to score either the top-K docs or all docs when a filter is provided by users.
However, in the future we will allow exact search even when there are no filters. Hence, decouple the filter
and make it optional to support both cases.
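As a rough illustration of what "both cases" means for exact search, the sketch below scores either the filtered docs or all docs against a query and keeps the closest k. It is an assumption-laden toy, not the plugin's implementation: `ExactSearchSketch`, `topK`, and `l2Squared` are hypothetical names, vectors are a plain array indexed by doc id, and `java.util.BitSet` replaces the Lucene filter bit set.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy model of exact search with an optional filter.
final class ExactSearchSketch {

    // Squared L2 distance between two vectors of equal length.
    static float l2Squared(final float[] a, final float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Returns up to k doc ids ordered by distance to the query.
    // A null filter means every doc is scored; otherwise only filtered docs are.
    static List<Integer> topK(final float[][] vectors, final float[] query, final BitSet filter, final int k) {
        List<Integer> candidates = new ArrayList<>();
        for (int docId = 0; docId < vectors.length; docId++) {
            if (filter == null || filter.get(docId)) {
                candidates.add(docId);
            }
        }
        candidates.sort(Comparator.comparingDouble((Integer docId) -> l2Squared(vectors[docId], query)));
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }
}
```

The only difference between the filtered and unfiltered paths is the candidate selection step, which is the decoupling the description asks for.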
Related Issues
Pre-requisite for #1942
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.