feat: add new arctic v2.0 models #1574

Merged

dbuades
Contributor

@dbuades dbuades commented Dec 9, 2024

This PR introduces the following two newly released models by the Snowflake team:

  • name: "Snowflake/snowflake-arctic-embed-m-v2.0"
    revision: "f2a7d59d80dfda5b1d14f096f3ce88bb6bf9ebdc"

  • name: "Snowflake/snowflake-arctic-embed-l-v2.0"
    revision: "edc2df7b6c25794b340229ca082e7c78782e6374"

For supported languages, I did my best to translate them to ISO_LANGUAGE_SCRIPT as accurately as possible but there might be some errors. The original list of supported languages can be found in the README.md on their repository.
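To illustrate the kind of translation described above, here is a minimal sketch that maps two-letter codes to the `language_Script` form (an ISO 639-3 language code joined to an ISO 15924 script code, e.g. `eng_Latn`). The table `TWO_LETTER_TO_LANGUAGE_SCRIPT` and the helper `to_language_script` are hypothetical names, and the entries are an illustrative subset, not the actual language list of these models. Codes whose script cannot be inferred from the two-letter form are returned separately for manual review.

```python
# Hypothetical lookup table: two-letter ISO 639-1 code -> "language_Script".
# Illustrative subset only; not the models' actual supported-language list.
TWO_LETTER_TO_LANGUAGE_SCRIPT = {
    "en": "eng_Latn",
    "fr": "fra_Latn",
    "de": "deu_Latn",
    "es": "spa_Latn",
    "ru": "rus_Cyrl",
    "ja": "jpn_Jpan",
}


def to_language_script(codes):
    """Split codes into (converted, unmapped).

    Codes missing from the table are not guessed; they are returned in
    `unmapped` so they can be resolved by hand.
    """
    converted, unmapped = [], []
    for code in codes:
        key = code.strip().lower()
        if key in TWO_LETTER_TO_LANGUAGE_SCRIPT:
            converted.append(TWO_LETTER_TO_LANGUAGE_SCRIPT[key])
        else:
            unmapped.append(code)
    return converted, unmapped


# "zh" is deliberately absent from the table: the two-letter code alone
# does not say which script is meant, so it ends up in `unmapped`.
converted, unmapped = to_language_script(["en", "ru", "zh"])
```

Returning the unmapped codes instead of guessing keeps any ambiguity visible, which is where translation errors are most likely to creep in.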

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

Adding a model checklist

  • I have filled out the ModelMeta object to the extent possible
  • I have ensured that my model can be loaded using
    • mteb.get_model(model_name, revision) and
    • mteb.get_model_meta(model_name, revision)
  • I have tested that the implementation works on a representative set of tasks.
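The two loading checks in the list above could be scripted roughly as follows. This is a sketch, assuming `mteb` is installed and the Hugging Face Hub is reachable; `MODELS`, `check_loadable`, and `check_all` are hypothetical helper names, while the model names and revisions are the ones given in this PR. Note that `mteb.get_model` loads the model itself, which may download the weights.

```python
# Model names and pinned revisions, copied from this PR's description.
MODELS = {
    "Snowflake/snowflake-arctic-embed-m-v2.0": "f2a7d59d80dfda5b1d14f096f3ce88bb6bf9ebdc",
    "Snowflake/snowflake-arctic-embed-l-v2.0": "edc2df7b6c25794b340229ca082e7c78782e6374",
}


def check_loadable(model_name, revision):
    """Load the metadata and the model for one (name, revision) pair."""
    # Imported lazily so that this file can be imported without mteb installed.
    import mteb

    meta = mteb.get_model_meta(model_name, revision=revision)
    model = mteb.get_model(model_name, revision=revision)
    return meta, model


def check_all():
    """Run the check for every model listed in MODELS."""
    for name, revision in MODELS.items():
        meta, _model = check_loadable(name, revision)
        print(meta.name, meta.revision)
```

Calling `check_all()` runs both checks for each model; nothing is downloaded until it is called.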

@dbuades
Contributor Author

dbuades commented Dec 9, 2024

Results for MTEB(Medical) are in embeddings-benchmark/results#66.

@dbuades
Contributor Author

dbuades commented Dec 9, 2024

The test failure is unrelated to this PR; it is a gateway timeout from the Hub. Rerunning the action should fix it 👌

@KennethEnevoldsen
Contributor

> For supported languages, I did my best to translate them to ISO_LANGUAGE_SCRIPT as accurately as possible but there might be some errors. The original list of supported languages can be found in the README.md on their repository.

This is perfectly fine! Sadly, Hugging Face uses two-letter codes, which are not as precise.

@KennethEnevoldsen KennethEnevoldsen merged commit 53756ad into embeddings-benchmark:main Dec 10, 2024
10 checks passed
@dbuades dbuades deleted the feat/arctic-v2-models branch December 10, 2024 20:34