Add documentation changes for disk-based k-NN #8246
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.
Signed-off-by: John Mazanec <[email protected]>
42b4d6b to dd988f1
Signed-off-by: John Mazanec <[email protected]>
Right now, 2 modes are supported:
* `in_memory` (default) - the `in_memory` mode represents the current default for vector search in OpenSearch. By default, it will use the `nmslib` engine and not configure any compression_level. This mode should be preferred if low-latency is required for your application.
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 2.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.
nit: A table would be nice and consistent with existing documentation. The headings can be, mode, engines supported (highlight the default here), compression supported (highlight the default here) and then guidance.
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 2.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.
* `on_disk` - the `on_disk` mode is used to provide low-cost vector search while maintaining strong recall. The `on_disk` mode by default uses `32x` compression via binary quantization and a default rescoring oversample factor of 3.0. This mode should be used if the workload requires a lower cost. `on_disk` is only supported for `float` vector types. Because `on_disk` mode requires quantization with re-scoring, the `1x` compression level cannot be used.
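For anyone skimming the thread, here is a rough sketch of how a mapping that opts into `on_disk` might be assembled. The helper name `knn_vector_mapping`, the field name `my_vector`, and the `index.knn` setting in the request body are assumptions for illustration; the only rule enforced is the one stated above, that `on_disk` cannot be combined with the `1x` compression level.

```python
import json


def knn_vector_mapping(dimension, space_type="l2", mode=None, compression_level=None):
    """Build a knn_vector field mapping (hypothetical helper, not part of any client).

    mode and compression_level are only added when set, so that the
    server-side defaults described in the thread apply otherwise.
    """
    # Rule quoted in the docs under review: on_disk needs quantization, so 1x is invalid.
    if mode == "on_disk" and compression_level == "1x":
        raise ValueError("on_disk mode cannot be combined with 1x compression")
    field = {"type": "knn_vector", "dimension": dimension, "space_type": space_type}
    if mode is not None:
        field["mode"] = mode
    if compression_level is not None:
        field["compression_level"] = compression_level
    return field


# Assumed request body for index creation; "my_vector" is an illustrative field name.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {"properties": {"my_vector": knn_vector_mapping(3, mode="on_disk")}},
}

print(json.dumps(index_body, indent=2))
```

Leaving `compression_level` unset here is deliberate: per the text above, `on_disk` then falls back to its default compression and rescoring behavior.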
The `oversample_factor` is a floating point number between 0.0 and 100.0. `oversample_factor*k` will always be greater than or equal to 100 and less than or equal to 10,000.
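One way to read the bound above is as a clamp on the first-pass result count. The sketch below assumes exactly that: the product is rounded up and then clamped into the 100 to 10,000 range. The function name `first_pass_size`, the rounding, and the clamping are assumptions for illustration, not a description of the plugin's code.

```python
import math

MIN_FIRST_PASS = 100      # lower bound quoted in the docs under review
MAX_FIRST_PASS = 10_000   # upper bound quoted in the docs under review


def first_pass_size(k, oversample_factor):
    """Return the assumed number of candidates fetched before rescoring."""
    if not 0.0 <= oversample_factor <= 100.0:
        raise ValueError("oversample_factor must be between 0.0 and 100.0")
    # Assumption: round up, then clamp into [100, 10000].
    raw = math.ceil(oversample_factor * k)
    return max(MIN_FIRST_PASS, min(MAX_FIRST_PASS, raw))


print(first_pass_size(10, 2.0))     # small k is lifted to the lower bound
print(first_pass_size(100, 3.0))    # in range, so the product is used directly
print(first_pass_size(1000, 50.0))  # large products are capped at the upper bound
```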
nit: Worth mentioning the defaults again here, just in case someone is skimming through and jumps directly to this section.
Signed-off-by: Fanit Kolchina <[email protected]>
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 3,
        "space_type": "l2",
        "mode": "in_memory",
        "compression_level": "2x",
Why are we putting `2x` as the default compression here?
I figured I'd show all the parameters and what they look like.
"engine": "lucene",
"parameters": {
  "ef_construction": 128,
  "m": 24
[See if you want to update this] : we can reduce these hyper parameter values to 100, 16.
should I just not specify?
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 3,
        "space_type": "l2",
I think this example should give the best default experience, which is no mode and no compression, just the spaceType, dim, and type attributes. What do you think?
sure - only thing is that I believe defaults will be picked up from index_settings in this case.
`compression_level` is a string-based mapping parameter that selects a quantization encoder that will reduce the memory consumption of the vectors by the given factor. Valid values are:
- `1x` (supported by nmslib, lucene and faiss engines)
should we put this in a table
For example, if a `32x` `compression_level` is passed for a `float32` index of 768-dimensional vectors, the per-vector memory should drop from `4*768` = 3072 bytes to `3072/32` = 96 bytes. Internally, binary quantization (which maps a float to a bit) may be used to achieve this.

If the `compression_level` parameter is set, an `encoder` cannot be specified in the `method` mapping. `compression_level` values greater than `1x` are only supported for `float` vector types.
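The per-vector arithmetic for the 768-dimensional `float32` example can be checked with a few lines. This is a back-of-the-envelope sketch only: the function name `bytes_per_vector` is hypothetical, and it assumes the compression level is a plain divisor on raw vector storage, ignoring any graph or index overhead.

```python
def bytes_per_vector(dimension, compression_level="1x", bytes_per_float=4):
    """Estimate raw storage per vector for a given compression level string."""
    if not compression_level.endswith("x"):
        raise ValueError("compression_level is expected to look like '32x'")
    # Assumption: "32x" simply means the raw float storage is divided by 32.
    factor = int(compression_level[:-1])
    return dimension * bytes_per_float / factor


uncompressed = bytes_per_vector(768)        # 4 * 768 bytes
compressed = bytes_per_vector(768, "32x")   # 3072 / 32 bytes

print(uncompressed, compressed)
```

At `32x`, each 4-byte float is reduced to a single bit, which matches the binary quantization note above: 768 bits is 96 bytes.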
Let's put this in a note.
@@ -47,6 +47,28 @@ PUT test-index
{% include copy-curl.html %}

## Vector workload modes
can we have table of mode, compression and which engine will be used in the docs?
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
@jmazanec15 @kolchfa-aws Please see my comments and changes and let me know if you have any questions. I'd like to reread lines 237 and 354 in api.md and line 86 in knn-index.md before approving. Thanks!
| `4x` | No default rescoring |
| `2x` | No default rescoring |

To explicitly apply rescoring, provide the `rescore` parameter in a query on a quantized index and specify the `oversample_factor`:
To explicitly apply rescoring, provide the `rescore` parameter in a query on a quantized index and specify the `oversample_factor`:
To explicitly apply rescoring, provide the `rescore` parameter in a quantized index query and specify the `oversample_factor`:
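As a rough illustration of the sentence under review, a query body with an explicit `rescore` block might be built like this. The nesting of `rescore` inside the field's `knn` clause reflects my understanding of the k-NN query shape and should be treated as an assumption; `knn_rescore_query` and `my_vector` are hypothetical names.

```python
import json


def knn_rescore_query(field, vector, k, oversample_factor):
    """Build a k-NN query body that requests rescoring (illustrative helper)."""
    return {
        "size": k,
        "query": {
            "knn": {
                field: {
                    "vector": vector,
                    "k": k,
                    # Assumed placement: rescore sits alongside vector and k.
                    "rescore": {"oversample_factor": oversample_factor},
                }
            }
        },
    }


query = knn_rescore_query("my_vector", [1.0, 2.0, 3.0], k=10, oversample_factor=1.5)
print(json.dumps(query, indent=2))
```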
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
LGTM
Thanks @natebower and @kolchfa-aws!
* Add space type as top level
  Signed-off-by: John Mazanec <[email protected]>
* Add new rescore parameter
  Signed-off-by: John Mazanec <[email protected]>
* Add new rescore parameter
  Signed-off-by: John Mazanec <[email protected]>
* add docs for compression and mode
  Signed-off-by: John Mazanec <[email protected]>
* Clean up compression docs
  Signed-off-by: John Mazanec <[email protected]>
* Doc review
  Signed-off-by: Fanit Kolchina <[email protected]>
* Update a few things
  Signed-off-by: John Mazanec <[email protected]>
* Doc review
  Signed-off-by: Fanit Kolchina <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Nathan Bower <[email protected]>
  Signed-off-by: kolchfa-aws <[email protected]>
---------
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Fanit Kolchina <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Noah Staveley <[email protected]>
Description
Part of #8075, this PR adds documentation for the disk-based feature for OpenSearch k-NN. See opensearch-project/k-NN#1779.
First, to support this project, we had to allow `space_type` in the k-NN mapping to be configured in the root-level mapping of the `knn_vector` field. So, space type can be specified in one of 2 ways:
I updated this.
Next, we added functionality to execute a rescore phase of the k-NN search to improve search on quantized indices. To add this:
I updated this.
Lastly, we introduced new parameters to the k-NN vector field mapping called `mode` and `compression_level`. These 2 parameters, when set, configure the default parameter resolution of the field, which enables us to give a strong out-of-the-box experience for different workload skews. `in_memory` is the default mode and maps to our current defaults. `on_disk` is a new mode that adds default quantization and rescoring so that k-NN can run with strong recall performance in low-memory environments.
As we are close to the release, I wanted to get this PR up.
Issues Resolved
closes #8075
Version
2.17 and beyond
Frontend features
N/A
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.