-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add SimilarityRanker to Haystack 2.0 #5923
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some comments. nothing worrying 😊
:param device: torch device (for example, cuda:0, cpu, mps) to limit model inference to a specific device. | ||
""" | ||
torch_and_transformers_import.check() | ||
super().__init__() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to call super()
cc @sjrl |
|
||
def __init__( | ||
self, | ||
model_name_or_path: Union[str, Path] = "cross-encoder/ms-marco-MiniLM-L-6-v2", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just go with model
to be fair, type hints and documentation can do the rest.
Thanks for the work on this! I have a few more comments about potential additional features.
|
@sjrl let's keep track of these enhancement suggestions (actually existing V1 features :-) ), but for now, I propose we integrate the simplest implementation we currently have so we can experiment and build demos. |
PR is awaiting #5945 more specifically, Document unfreezing. Current test failures are expected. |
6a410f9
to
f278786
Compare
@sjrl and @silvanocerza would you give it one last pass please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Only one minor comment about additional test.
Should be gtg now, please have another look @sjrl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad that helped catch a bug!
Why:
Ranking by similarity to a query is essential. Enter
SimilarityRanker
.What:
Added
SimilarityRanker
to therankers
package. It's all about evaluating document relevance based on similarity.How can it be used:
Here's a quick dive into its usage:
How did you test it:
Unit tests are in place. Also, ran the above snippet. Worked as expected.
Notes For Reviewer:
Reviewer, please do a deep dive into the similarity metrics and integration. Let's ensure this fits seamlessly into Haystack's architecture.