Even better(er) binary quantization #117994
Conversation
Pinging @elastic/es-search-relevance (Team:Search Relevance)
Hi @benwtrent, I've created a changelog YAML for you.
@@ -48,13 +48,8 @@ class ES816BinaryFlatVectorsScorer implements FlatVectorsScorer {
    public RandomVectorScorerSupplier getRandomVectorScorerSupplier(
        VectorSimilarityFunction similarityFunction,
        KnnVectorValues vectorValues
    ) throws IOException {
removing write path for old codec
    RandomVectorScorerSupplier getRandomVectorScorerSupplier(
        VectorSimilarityFunction similarityFunction,
        ES816BinaryQuantizedVectorsWriter.OffHeapBinarizedQueryVectorValues scoringVectors,
        BinarizedByteVectorValues targetVectors
    ) {
        return new BinarizedRandomVectorScorerSupplier(scoringVectors, targetVectors, similarityFunction);
    }
removing write path for old codec
    @Override
    public String toString() {
        return "ES816BinaryFlatVectorsScorer(nonQuantizedDelegate=" + nonQuantizedDelegate + ")";
    }

    /** Vector scorer supplier over binarized vector values */
removing write path for old codec
-        return new ES816BinaryQuantizedVectorsWriter(scorer, rawVectorFormat.fieldsWriter(state), state);
+        throw new UnsupportedOperationException();
removing write path for old codec
@@ -25,10 +25,8 @@
 import org.apache.lucene.codecs.hnsw.FlatVectorsFormat;
 import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat;
 import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsReader;
-import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsWriter;
removing write path for old codec
/**
 * Copied from Lucene, replace with Lucene's implementation sometime after Lucene 10
 */
public class ES816BinaryQuantizedRWVectorsFormat extends ES816BinaryQuantizedVectorsFormat {
For old codec backward compat testing
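In other words, the shipped ES816 format keeps its read path but refuses to write, while a test-only RW subclass restores writing so backward-compat tests can still build old-format indices. Here is a plain-Java sketch of that split; all types below are stand-ins for illustration, not the real Lucene/Elasticsearch classes:

// Stand-in types only; illustrates the read-only format + RW test subclass
// pattern, not the actual ES816 implementation.
class ReadOnlyVectorsFormat {
    Object fieldsWriter() {
        // Production behavior: segments in the old codec can still be read,
        // but new segments must never be written with it.
        throw new UnsupportedOperationException("old codec is read-only");
    }
}

class RWVectorsFormat extends ReadOnlyVectorsFormat {
    @Override
    Object fieldsWriter() {
        // Lives in the test sources only: restores the write path so tests
        // can create old-format indices and exercise the read path.
        return "writer"; // stand-in for the real writer object
    }
}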
@@ -63,7 +63,7 @@ protected Codec getCodec() {
     return new Lucene100Codec() {
         @Override
         public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
-            return new ES816BinaryQuantizedVectorsFormat();
+            return new ES816BinaryQuantizedRWVectorsFormat();
For old codec backward compat testing
@@ -77,7 +77,7 @@ class ES816BinaryQuantizedVectorsWriter extends FlatVectorsWriter {
     private final List<FieldWriter> fields = new ArrayList<>();
     private final IndexOutput meta, binarizedVectorData;
     private final FlatVectorsWriter rawVectorDelegate;
-    private final ES816BinaryFlatVectorsScorer vectorsScorer;
+    private final ES816BinaryFlatRWVectorsScorer vectorsScorer;
For old codec backward compat testing
import static org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat.DEFAULT_MAX_CONN;
import static org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat.DEFAULT_NUM_MERGE_WORKER;

class ES816HnswBinaryQuantizedRWVectorsFormat extends ES816HnswBinaryQuantizedVectorsFormat {
For old codec backward compat testing
@@ -59,7 +59,7 @@ protected Codec getCodec() {
     return new Lucene100Codec() {
         @Override
         public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
-            return new ES816HnswBinaryQuantizedVectorsFormat();
+            return new ES816HnswBinaryQuantizedRWVectorsFormat();
For old codec backward compat testing
…-binary-quantization
…benwtrent/elasticsearch into feature/even-better-binary-quantization
LGTM
double xe = 0.0;
double e = 0.0;
for (double xi : vector) {
    double xiq = (a + step * Math.round((clamp(xi, a, b) - a) * stepInv));
yeap, this seems about right 😅
LOL, I should indeed add some comments here. We are basically calculating the error of quantizing and then unquantizing and shifting up or down based on the mis-calculation.
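For reference, a self-contained sketch of that round trip. Only the `xiq` line is from the snippet above; how the loop fills in `xe` and `e` is an assumption based on this description, not the exact PR code:

class QuantizationErrorSketch {
    static double clamp(double x, double lo, double hi) {
        return Math.max(lo, Math.min(hi, x));
    }

    // Quantize each component into [a, b] with the given number of levels,
    // immediately dequantize it, and accumulate the resulting error terms.
    static double roundTripError(double[] vector, double a, double b, int levels) {
        double step = (b - a) / (levels - 1);
        double stepInv = 1.0 / step;
        double xe = 0.0; // cross term: value times its reconstruction error
        double e = 0.0;  // squared quantize-then-dequantize error
        for (double xi : vector) {
            double xiq = a + step * Math.round((clamp(xi, a, b) - a) * stepInv);
            xe += xi * (xi - xiq); // kept because the optimizer uses both terms
            e += (xi - xiq) * (xi - xiq);
        }
        // An optimizer can then shift a and b in the direction that shrinks
        // the error: the "shifting up or down" described above.
        return e;
    }
}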
…-binary-quantization
Reviewed tests and the main set of changes against the 816* versions and they LGTM. The math changes are a bit more difficult to review, but I will give them a more thorough go on Monday 😅 (no need to wait, though; please feel free to proceed without my review on that part).
@benwtrent and Tom V., amazing work! It would be nice to add some documentation to the format: it looks like the queries are still quantized to 4 bits?
The on-disk format is very similar. I can add some docs on that to the format.
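For intuition: with 4-bit queries scored against 1-bit documents, the inner product reduces to summing the query's quantized values at the dimensions where the document bit is set. A simplified sketch of that asymmetry (my own names and simplification; the real kernel works on packed bits with popcount-style operations):

// Simplified asymmetric scorer: documents at 1 bit per dimension, queries
// at 4 bits per dimension. Spells out the per-dimension arithmetic that the
// packed-bit implementation computes in bulk.
static long asymmetricDot(byte[] query4Bit, boolean[] docBits) {
    long sum = 0;
    for (int i = 0; i < docBits.length; i++) {
        if (docBits[i]) {
            sum += query4Bit[i]; // quantized query values in [0, 15]
        }
    }
    return sum;
}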
@benwtrent Thanks Ben! Great work!
Similar to Panos, I reviewed the file formats, as I don't follow all the math in the quantizer, but I trust that you and Tom got it right.
💔 Backport failed
You can use sqren/backport to manually backport by running
This measurably improves BBQ by changing the underlying algorithm to an optimized per-vector scalar quantization.
This is a brand new way to quantize vectors. Instead of a single global set of upper and lower quantile bands, the bands are optimized and calculated per individual vector. Additionally, vectors are centered on a common centroid.
This allows an almost 32x reduction in memory and even better recall than before, at the cost of slightly increased indexing time.
Additionally, the new approach generalizes easily to other bit sizes (e.g. 2 bits). While not taken advantage of yet, we may update our scalar quantized indices in the future to use this algorithm, giving significant boosts in recall.
The recall gains range from 2% to almost 10% depending on the dataset, with an additional 5-10% indexing cost when indexing with HNSW compared with current BBQ.
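A minimal sketch of the per-vector idea described above (illustrative only; the shipped quantizer additionally optimizes the interval per vector to minimize reconstruction error, as in the loop discussed earlier, rather than using the raw min/max):

// Illustration only: center on a shared centroid, then derive the
// quantization interval from this vector alone instead of global quantiles.
import java.util.Arrays;

final class PerVectorQuantizerSketch {
    static byte[] quantize(float[] vector, float[] centroid, int bits) {
        int levels = (1 << bits) - 1; // 1 for single-bit docs, 15 for 4-bit queries
        float[] centered = new float[vector.length];
        float lower = Float.POSITIVE_INFINITY, upper = Float.NEGATIVE_INFINITY;
        for (int i = 0; i < vector.length; i++) {
            centered[i] = vector[i] - centroid[i]; // common-centroid centering
            lower = Math.min(lower, centered[i]);
            upper = Math.max(upper, centered[i]);
        }
        // Per-vector interval; the real algorithm refines [lower, upper]
        // per vector to minimize reconstruction error.
        float step = upper > lower ? (upper - lower) / levels : 1f;
        byte[] quantized = new byte[vector.length];
        for (int i = 0; i < vector.length; i++) {
            quantized[i] = (byte) Math.round((centered[i] - lower) / step);
        }
        return quantized;
    }

    public static void main(String[] args) {
        float[] v = { 0.1f, -0.3f, 0.8f, 0.05f };
        float[] c = { 0.0f, 0.0f, 0.2f, 0.1f };
        System.out.println(Arrays.toString(quantize(v, c, 1))); // doc: 1 bit per dim
        System.out.println(Arrays.toString(quantize(v, c, 4))); // query: 4 bits per dim
    }
}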