[8.x] Even better(er) binary quantization (#117994) #118295

Merged
merged 5 commits into from
Dec 10, 2024

Conversation

benwtrent
Member

Backport

This will backport the following commits from main to 8.x:

benwtrent and others added 2 commits December 9, 2024 12:33
This measurably improves BBQ by changing the underlying algorithm to an
optimized per-vector scalar quantization.

This is a brand new way to quantize vectors. Instead of using one
global set of upper and lower quantile bands, the bands are optimized and
calculated for each individual vector. Additionally, vectors are centered on
a common centroid.
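
As a rough illustration of the idea (not the code in this PR), the sketch below centers a vector on a shared centroid and then quantizes it using bounds taken from that vector alone. The function names are hypothetical, and the bounds are simply the vector's own min and max; the change described here optimizes the bounds per vector, which this sketch does not attempt to reproduce.

```python
def quantize_per_vector(vector, centroid, bits=1):
    """Center `vector` on `centroid`, then quantize with this vector's own bounds."""
    centered = [v - c for v, c in zip(vector, centroid)]
    # Assumption: plain min/max stands in for the optimized per-vector interval.
    lower, upper = min(centered), max(centered)
    levels = (1 << bits) - 1  # highest code value, e.g. 1 for a single bit
    step = (upper - lower) / levels if upper > lower else 1.0
    codes = [min(levels, max(0, round((x - lower) / step))) for x in centered]
    # The bounds must be stored next to the codes so the vector can be rescaled.
    return codes, lower, upper


def dequantize(codes, lower, upper, centroid, bits=1):
    """Approximate the original vector from its codes and per-vector bounds."""
    levels = (1 << bits) - 1
    step = (upper - lower) / levels if upper > lower else 1.0
    return [code * step + lower + c for code, c in zip(codes, centroid)]


centroid = [1.0, 1.0, 1.0, 1.0]
codes, lower, upper = quantize_per_vector([1.0, 4.5, 1.5, 5.0], centroid, bits=1)
approx = dequantize(codes, lower, upper, centroid, bits=1)
```

Because `bits` is just a parameter, the same sketch covers 2 bits or more without structural changes.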

This allows for an almost 32x reduction in memory, and even better
recall than before at the cost of slightly increasing indexing time.
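
The "almost 32x" figure follows from replacing a 32-bit float per dimension with a single bit, minus a small amount of per-vector metadata. A back-of-the-envelope sketch, where `compression_ratio` is a hypothetical helper and the 16-byte overhead is an assumed placeholder rather than the actual on-disk layout:

```python
def compression_ratio(dims, bits=1, overhead_bytes=0):
    """Ratio of float32 storage to packed quantized storage for one vector."""
    float_bytes = dims * 4                # 32-bit float per dimension
    packed_bytes = (dims * bits + 7) // 8  # codes packed into whole bytes
    return float_bytes / (packed_bytes + overhead_bytes)


ideal = compression_ratio(1024)                            # exactly 32.0
with_overhead = compression_ratio(1024, overhead_bytes=16)  # a bit under 32
two_bit = compression_ratio(1024, bits=2)                   # 16.0
```

The per-vector overhead is why the reduction is "almost" rather than exactly 32x, and it matters less as dimensionality grows.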

Additionally, this new approach is easily generalizable to various other
bit sizes (e.g. 2 bits, etc.). While not taken advantage of yet, we may
update our scalar quantized indices in the future to use this new
algorithm, giving significant boosts in recall.

Recall gains range from 2% to almost 10% on certain datasets, with
an additional 5-10% indexing cost when indexing with HNSW, compared
with current BBQ.
@benwtrent benwtrent added auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) and removed auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Dec 9, 2024
@benwtrent benwtrent added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Dec 10, 2024
@elasticsearchmachine elasticsearchmachine merged commit ffc5978 into elastic:8.x Dec 10, 2024
15 checks passed
@benwtrent benwtrent deleted the backport/pr-117994 branch December 10, 2024 14:36
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) backport >enhancement :Search Relevance/Vectors Vector search v8.18.0