
feat: evaluate openai, cohere and google models in CUREv1 #65

Merged

Conversation

dbuades
Contributor

@dbuades dbuades commented Dec 9, 2024

As a follow-up to #55, this PR evaluates the following models on CUREv1:

  • name: "openai/text-embedding-3-small"
  • name: "openai/text-embedding-3-large"
  • name: "Cohere/Cohere-embed-english-v3.0"
  • name: "Cohere/Cohere-embed-english-light-v3.0"
  • name: "Cohere/Cohere-embed-multilingual-v3.0"
  • name: "Cohere/Cohere-embed-multilingual-light-v3.0"
  • name: "google/text-embedding-005"
  • name: "google/text-multilingual-embedding-002"

These results were obtained using the code from two companion PRs: embeddings-benchmark/mteb#1562 and embeddings-benchmark/mteb#1564.

As a side note, we would also like to run these models on the remaining tasks in the MTEB(Medical) benchmark, but we initially held off due to API costs. Do you have access to credits with these providers that we could use for this purpose? Alternatively, could you run them on your side?

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the results files checker make pre-push.

@Samoed
Collaborator

Samoed commented Dec 9, 2024

@Muennighoff ran the OpenAI models; I think he might have some credits left.

@KennethEnevoldsen KennethEnevoldsen merged commit 5f6f731 into embeddings-benchmark:main Dec 10, 2024
2 checks passed
@Muennighoff
Contributor

@dbuades I've sent you an API key :)

@dbuades
Contributor Author

dbuades commented Dec 10, 2024

Referenced in embeddings-benchmark/mteb#1571 (Leaderboard 2.0: Missing results)

Well received, thank you! 🥳

@dbuades
Contributor Author

dbuades commented Dec 10, 2024

@Muennighoff following up on your comment here, I assume we're fine reusing results from the MTEB(Medical) tasks that already have results for revision 1 of the models, correct?
That way, I'd only need to run the OpenAI models on tasks that don't yet have revision-1 results; by default, mteb would rerun the models on every task now that we're on revision 2.
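The reuse logic described above can be sketched generically. This is a hypothetical helper with an assumed on-disk layout (`results/<revision>/<task>.json`), not mteb's actual result-caching API:

```python
from pathlib import Path

def tasks_to_run(all_tasks, results_dir, revision="1"):
    """Return the tasks that have no stored result file for the given
    model revision, so only those need to be (re)evaluated."""
    results_dir = Path(results_dir)
    pending = []
    for task in all_tasks:
        result_file = results_dir / revision / f"{task}.json"
        if not result_file.exists():
            pending.append(task)
    return pending

# Example: pretend a revision-1 result already exists for CUREv1.
results = Path("results")
(results / "1").mkdir(parents=True, exist_ok=True)
(results / "1" / "CUREv1.json").write_text("{}")

print(tasks_to_run(["CUREv1", "PublicHealthQA"], results))  # ['PublicHealthQA']
```

The same idea is what skipping a rerun amounts to: filter the task list against existing result files before launching any (paid) API evaluations.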

@Muennighoff
Contributor

Yeah, no need to rerun those results, I think!

@dbuades dbuades deleted the feat/CUREv1-closed-models branch December 10, 2024 23:25