
feat: evaluate openai, cohere and google models in CUREv1 #65

Merged

Conversation

dbuades
Contributor

@dbuades dbuades commented Dec 9, 2024

As a follow-up to #55, this PR evaluates the following models on CUREv1:

  • name: "openai/text-embedding-3-small"
  • name: "openai/text-embedding-3-large"
  • name: "Cohere/Cohere-embed-english-v3.0"
  • name: "Cohere/Cohere-embed-english-light-v3.0"
  • name: "Cohere/Cohere-embed-multilingual-v3.0"
  • name: "Cohere/Cohere-embed-multilingual-light-v3.0"
  • name: "google/text-embedding-005"
  • name: "google/text-multilingual-embedding-002"

These results were obtained using the code from two companion PRs: embeddings-benchmark/mteb#1562 and embeddings-benchmark/mteb#1564.

As a side note, we would also like to run these models on the remaining tasks in the MTEB(Medical) benchmark, but we initially held off due to API costs. Do you have access to credits with these providers that we could use for this purpose? Alternatively, could you run them on your side?

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the results files checker make pre-push.

@Samoed
Collaborator

Samoed commented Dec 9, 2024

@Muennighoff ran the OpenAI models; I think he might have some credits left.

@KennethEnevoldsen KennethEnevoldsen merged commit 5f6f731 into embeddings-benchmark:main Dec 10, 2024
2 checks passed
@Muennighoff
Contributor

@dbuades I've sent you an API key :)

@dbuades
Contributor Author

dbuades commented Dec 10, 2024

Referenced in embeddings-benchmark/mteb#1571 (Leaderboard 2.0: Missing results)

Well received, thank you! 🥳

@dbuades
Contributor Author

dbuades commented Dec 10, 2024

@Muennighoff following up on your comment here, I assume we're fine reusing results from the MTEB(Medical) tasks that already have results for revision 1 of the models, correct?
That way, I'd only need to run the OpenAI models on tasks that don't yet have revision-1 results; by default, mteb would rerun the models on every task now that we're on revision 2.
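The reuse logic described above can be sketched generically. This is a hypothetical helper with an assumed on-disk layout (`results/<revision>/<task>.json`), not mteb's actual result-caching API:

```python
from pathlib import Path

def tasks_to_run(all_tasks, results_dir, revision="1"):
    """Return the tasks that have no stored result file for the given
    model revision, so only those need to be (re)evaluated."""
    results_dir = Path(results_dir)
    pending = []
    for task in all_tasks:
        result_file = results_dir / revision / f"{task}.json"
        if not result_file.exists():
            pending.append(task)
    return pending

# Example: pretend a revision-1 result already exists for CUREv1.
results = Path("results")
(results / "1").mkdir(parents=True, exist_ok=True)
(results / "1" / "CUREv1.json").write_text("{}")

print(tasks_to_run(["CUREv1", "PublicHealthQA"], results))  # ['PublicHealthQA']
```

The same idea is what skipping a rerun amounts to: filter the task list against existing result files before launching any (paid) API evaluations.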

@Muennighoff
Contributor

Yeah, no need to rerun those results, I think!

@dbuades dbuades deleted the feat/CUREv1-closed-models branch December 10, 2024 23:25