Add truncation support in evaluators #2582
Hello! Thanks a bunch for this PR! I've extended it in the following two ways:
Additionally, I've extended the existing matryoshka NLI training scripts to show how to use the EmbeddingSimilarityEvaluator with
(I considered using a separate PR for that, but I think this fits as well.)
I'm planning on merging this today or early tomorrow, with the intention of using your Matryoshka-focused improvements as the headliner for a new v2.7.0 release. I've manually tested each of the evaluators by running an example training script for each of them, but in the long term we definitely want to add proper tests for them. For this PR, I think we're okay without.
|
Wow, great improvements, and thank you for testing the evaluators! Also very excited to see #2449 land
|
Awesome! Also, as for the v2.7.0 release that I mentioned: I think I will postpone that for now. I think this PR should be good to go now though! Thanks a bunch for setting this up :)
|
Hello,
This PR is a follow-up to #2573. As you suggested in this comment, a truncate_dim parameter makes it easy to construct a sequential evaluator.
How has this been tested?
It hasn't 🥴
I tested EmbeddingSimilarityEvaluator in #2573 by running a notebook offline. I'll think about how to test this change soon. Lmk what you think would make for a sufficient test, e.g., offline runs or actual tests in tests/test_evaluator.py.
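To make the idea concrete, here is a rough, library-free sketch of what a per-evaluator truncate_dim could mean and how one evaluator per dimension might be run in sequence. Everything in it is an assumption for illustration: ToySimilarityEvaluator, TOY_EMBEDDINGS, and the helper functions are hypothetical names, the embeddings and gold scores are made up, and none of this is the sentence-transformers implementation. Something of this shape could also serve as a starting point for an offline test.

```python
import math


def truncate(vec, dim):
    """Keep the first `dim` components; None means no truncation."""
    return list(vec) if dim is None else list(vec[:dim])


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm


def ranks(values):
    """Rank values 1..n (no tie handling; sufficient for this toy data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mean = (len(rx) + 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var_x = sum((a - mean) ** 2 for a in rx)
    var_y = sum((b - mean) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


class ToySimilarityEvaluator:
    """Hypothetical stand-in for an evaluator that accepts truncate_dim."""

    def __init__(self, pairs, gold_scores, truncate_dim=None):
        self.pairs = pairs
        self.gold_scores = gold_scores
        self.truncate_dim = truncate_dim

    def __call__(self, encode):
        # Truncate both embeddings before scoring each sentence pair.
        sims = [
            cosine(
                truncate(encode(s1), self.truncate_dim),
                truncate(encode(s2), self.truncate_dim),
            )
            for s1, s2 in self.pairs
        ]
        return spearman(sims, self.gold_scores)


# Made-up 4-dimensional "embeddings" keyed by sentence id (assumption).
TOY_EMBEDDINGS = {
    "a": [1.0, 0.0, 0.2, 0.1],
    "b": [0.9, 0.1, 0.1, 0.3],
    "c": [1.0, 1.0, 0.0, 0.0],
    "d": [1.0, -0.2, 0.5, 0.0],
    "e": [0.1, 1.0, 0.0, 0.5],
    "f": [1.0, -0.5, 0.3, 0.0],
}
pairs = [("a", "b"), ("c", "d"), ("e", "f")]
gold = [0.9, 0.5, 0.1]

# One evaluator per dimension, run in order, like a sequential evaluator.
evaluators = [
    ToySimilarityEvaluator(pairs, gold, truncate_dim=dim)
    for dim in (2, 4, None)
]
scores = {ev.truncate_dim: ev(TOY_EMBEDDINGS.__getitem__) for ev in evaluators}
print(scores)
```

The point of keeping truncation inside each evaluator is that the list of evaluators can share one model and differ only in truncate_dim, so comparing scores across dimensions is a single loop.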