-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure we return non-negative scores when scoring scalar dot-products #108522
Ensure we return non-negative scores when scoring scalar dot-products #108522
Conversation
Hi @benwtrent, I've created a changelog YAML for you. |
Pinging @elastic/es-search (Team:Search) |
@elasticmachine update branch |
The scorer should be doing the same as that of the lucene one. Why is this necessary? Also, should the lucene one do same (in lucene itself)? |
Since the individual vector values are positive, then is the negativity coming from the offsets or correction values? I think our tests should be updated to cover this. |
This is being fixed because of a failing test. I could add a manual one where vectors are I also have a PR open in Lucene to correct this there. But, here, we have our own scorer implementation that needs handling. One thing I should do here is do an |
…rent/elasticsearch into ensure-positive-scores-dot-product
@ChrisHegarty added a test & assertion that our dot product from the native impl is |
assertThat(scorer.score(0, 1), equalTo(expected)); | ||
assertThat((new VectorScorerSupplierAdapter(scorer)).scorer(0).score(1), equalTo(expected)); | ||
// cosine | ||
expected = 0f; // TODO fix in Lucene: https://github.com/apache/lucene/pull/13356 luceneScore(COSINE, vec1, vec2, 1, -5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah ha. I see this now. Thanks
Thanks, this is helpful.
Ok cool. The point is that our scorer implementation should produce the same result as that of the Lucene implementation. I now see the separate Lucene PR, thanks.
Yeah. Since asserts bloat the bytecode, an alternative is to put the assert in the tests that are closer to the implementation, so in |
@elasticmachine update branch |
closes: #108339