fix: Add namaa MrTydi reranking dataset #1573

Conversation

omarelshehy
Contributor

Why this dataset:

1 - Adds to the reranking tasks that are exclusively for Arabic.
2 - Uses the MrTydi test dataset with generated and human-evaluated negatives.

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

Adding datasets checklist

Reason for dataset addition: ...

  • I have run the following models on the task (adding the results to the PR). These can be run using the mteb -m {model_name} -t {task_name} command.
    • cross-encoder/ms-marco-MiniLM-L-12-v2
    • cross-encoder/stsb-TinyBERT-L-4
  • I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
  • If the dataset is too big (e.g. >2048 examples), consider using self.stratified_subsampling() under dataset_transform()
  • I have filled out the metadata object in the dataset file (find documentation on it here).
  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.
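
For the "neither trivial nor random" check above, it can help to know roughly what a random score looks like for this kind of data. The sketch below is only an illustration, assuming each query has exactly one positive and a fixed number of negatives (the discussion in this PR mentions 4-5 negatives per query) and that the task is scored with mean average precision; `random_baseline_map` is a hypothetical helper, not part of mteb.

```python
from fractions import Fraction


def random_baseline_map(num_negatives: int) -> float:
    """Expected average precision of a uniformly random ranking.

    Assumes exactly one positive and `num_negatives` negatives per query
    (an assumption for illustration, not something mteb enforces).
    """
    n = num_negatives + 1  # total candidates per query
    # The single positive lands at each rank 1..n with probability 1/n,
    # and with one positive the average precision is simply 1/rank.
    return float(sum(Fraction(1, rank) for rank in range(1, n + 1)) / n)


for k in (4, 5):
    print(k, round(random_baseline_map(k), 4))
# prints:
# 4 0.4567
# 5 0.4083
```

Under these assumptions, a model scoring near 0.41-0.46 would be close to random, and one scoring near 1.0 would suggest the negatives are too easy.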

@omarelshehy omarelshehy marked this pull request as ready for review December 9, 2024 17:39
@Samoed
Collaborator

Samoed commented Dec 9, 2024

Does your dataset add new data to the original MrTydi? The original MrTydi is already included in MTEB. Also, could you provide results for these tasks to ensure it's working correctly? It seems like the data is being loaded in a different format than expected.

@omarelshehy
Contributor Author

omarelshehy commented Dec 9, 2024

I might be mistaken, but the MrTydi dataset was included there for retrieval, not reranking. We took the test dataset from MrTydi and added 4-5 negatives to each query and positive pair (which the original doesn't have). For the formatting, I relied on the structure of similar reranking datasets. Here are the results for the two models listed in the PR description:
NamaaMrTydiReranking_ms-marco-MiniLM.json
NamaaMrTydiReranking_stsb_TinyBERT.json
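
To make the construction described above concrete, here is a minimal sketch of what one reranking sample could look like and how it might be scored. The `query` / `positive` / `negative` field names are an assumption about the layout of similar reranking datasets, the English toy texts are invented for illustration, and `build_sample`, `average_precision` and `token_overlap` are hypothetical helpers, not mteb functions.

```python
from typing import Callable, Dict, List


def build_sample(query: str, positives: List[str], negatives: List[str]) -> Dict:
    """Bundle one query with its positive passage(s) and added negatives.

    The field names are assumed, not taken from the dataset in this PR.
    """
    return {"query": query, "positive": list(positives), "negative": list(negatives)}


def average_precision(sample: Dict, score: Callable[[str, str], float]) -> float:
    """Rank all candidates by `score` (higher is better) and compute AP."""
    candidates = [(doc, True) for doc in sample["positive"]]
    candidates += [(doc, False) for doc in sample["negative"]]
    ranked = sorted(candidates, key=lambda c: score(sample["query"], c[0]), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            total += hits / rank  # precision at each positive's rank
    return total / hits if hits else 0.0


def token_overlap(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: number of shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


sample = build_sample(
    "capital of egypt",
    positives=["cairo is the capital of egypt"],
    negatives=[
        "alexandria is a port city",
        "the nile flows north",
        "egypt is in africa",
        "giza has pyramids",
    ],
)
print(average_precision(sample, token_overlap))  # 1.0, the positive is ranked first
```

Averaging this value over all queries gives mean average precision; a real evaluation would replace `token_overlap` with the model under test.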

@Samoed
Collaborator

Samoed commented Dec 9, 2024

Ah, yes. You are right

Contributor

@KennethEnevoldsen KennethEnevoldsen left a comment

The metadata seems to be lacking a bit; I have suggested some updates.

mteb/tasks/Reranking/ara/NamaaMrTydiReranking.py (8 review comments, all outdated and resolved)
@omarelshehy omarelshehy requested a review from Samoed December 11, 2024 21:49
@KennethEnevoldsen KennethEnevoldsen changed the title Add namaa MrTydi reranking dataset fix: Add namaa MrTydi reranking dataset Dec 11, 2024
@KennethEnevoldsen KennethEnevoldsen enabled auto-merge (squash) December 11, 2024 23:09
@KennethEnevoldsen KennethEnevoldsen merged commit 7b9b3c9 into embeddings-benchmark:main Dec 11, 2024
10 checks passed