-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Encapsulate dimension, vector data type validation/processing inside Library #1957
Encapsulate dimension, vector data type validation/processing inside Library #1957
Conversation
Introduces a new configuration object for KNNMethodContext, KNNMethodConfigContext. The KNNMethodContext contains the user provided information for they want their index to be built like. However, it is missing a few pieces that are defined outside of it. However, these pieces are needed to validate the config and actually build the config. This new object gives the code a chance to pass that along and handle branching inside the engine definition. Signed-off-by: John Mazanec <[email protected]>
Integrates validation of KNNMethodContexts with the optional values of KNNMethodConfigContext. This simplifies the code in the mapper pretty significantly. Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
389dff8
to
5011f16
Compare
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
706bbaa
to
dcf67d9
Compare
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
8ba2c9c
to
b1f2124
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me. Just 1 comment on the BWC at the codec level
@@ -243,18 +242,6 @@ private void createKNNIndexFromScratch(FieldInfo fieldInfo, KNNCodecUtil.Pair pa | |||
); | |||
} | |||
|
|||
// Update index description of Faiss for binary data type | |||
if (KNNEngine.FAISS == knnEngine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what will happen if segment is an old segment. I think we will still need this update for old segments. We might need to find a way to check if the index description is correct for binary index and if it not correct then update it otherwise don't update the index description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point - forgot about this. Fixed
@navneet1v ah I forgot about that - good catch. Let me fix |
7adbb2b
to
4b2871d
Compare
if (getBWCVersion().isPresent() && Version.fromString(maybeRemoveSnapshotSuffix(getBWCVersion().get())).before(Version.V_2_16_0)) { | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another way to do this skipping through brazil.gradle. wondering why we choose this way?
Ref on how to skip via brazil.gradle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it
c49de85
to
a95577e
Compare
Signed-off-by: John Mazanec <[email protected]>
a95577e
to
72f9ed4
Compare
@jmazanec15 seems like a build failure.. can you check once whats happening |
@navneet1v I dont see anything interesting in the logs and am not able to reproduce it locally. Im guessing its unrelated/flaky. Captured the logs: https://gist.github.com/jmazanec15/ec985b4aeeae3325d5159974cfb375ab. Ill retry |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-1957-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 f42e86eaaa8d4c7b6e0da30326a1215312560b0a
# Push it to GitHub
git push --set-upstream origin backport/backport-1957-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x Then, create a pull request where the |
…Library (opensearch-project#1957) Introduces a new configuration object KNNMethodConfigContext. The KNNMethodContext contains the user provided information for want their index to be built like. However, it is missing a few pieces that are defined outside of it. These pieces are needed to validate the config and actually build the config. This change Integrates validation of KNNMethodContexts with KNNMethodConfigContext and better encapsulates KNNLibrary config info in the engine package Signed-off-by: John Mazanec <[email protected]> Signed-off-by: Akash Shankaran <[email protected]>
Description
As part of #1779 , we need to have the ability to take a user config for a float based space type (i.e. l2) and configure the quantization framework to go from float to bit and build a faiss binary index. In the current implementation, the data type validation is outside of the KNNLibrary/engine abstractions. This will make it difficult do the complex configuration around the quantization framework index building and search. The main theme of this PR is to move towards handling Library specific configuration to the library.
This PR does the following:
KNNMethodConfigContext
that is passed toKNNLibrary
for validation as well as processing. TheKNNMethodConfigContext
object has information outside of what is passed by the user for configuring the method. This will allow theKNNLibrary
specific components to do more complex validation/processing logic such as configuring a binary index, or figuring out how to validate/process incoming vectors. This wasnt added as a part ofKNNMethodContext
because it would bloat the class further. This is somewhat of an extension to theVectorSpaceInfo
abstractionKNNMethodConfigContext
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.