Create SQuAD metric README.md #3873
Proposal for a metrics card structure (with an example based on the SQuAD metric). @thomwolf @lhoestq @douwekiela @lewtun -- feel free to comment on structure or content (it's an initial draft, so I realize there's stuff missing!).
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Very cool!
Oh, one last thing I almost forgot: I think I would add an "Examples" section with examples of inputs and outputs, in particular an example giving maximal values, an example giving minimal values, and maybe a standard example from SQuAD. What do you think?
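For illustration, here is a rough sketch of what the maximal, minimal, and mixed examples could look like. It assumes SQuAD v1-style scoring (lowercasing, stripping punctuation and English articles, token-level F1, best score over the reference answers). The helper names and the sample answers are hypothetical, and this is not the metric's actual code.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def score(predictions, references):
    """Average the best-over-references scores, reported on a 0-100 scale."""
    em_total = f1_total = 0.0
    for pred, refs in zip(predictions, references):
        em_total += max(exact_match(pred, ref) for ref in refs)
        f1_total += max(token_f1(pred, ref) for ref in refs)
    n = len(predictions)
    return {"exact_match": 100.0 * em_total / n, "f1": 100.0 * f1_total / n}


# Maximal values: the prediction matches a reference answer.
max_score = score(["1976"], [["1976"]])
# Minimal values: the prediction shares no tokens with the reference.
min_score = score(["1999"], [["1976"]])
# Mixed case: two of three predictions are correct, one is entirely wrong.
mixed_score = score(
    ["1976", "The Beatles", "blue"],
    [["1976"], ["Beatles"], ["green"]],
)
```

Under these assumptions `max_score` is 100.0 on both keys, `min_score` is 0.0 on both, and `mixed_score` is about 66.67 on both, since "The Beatles" normalizes to the same string as "Beatles".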
updating with @thomwolf 's suggestions
Cool, thank you! I also imagine that we could have a widget on the website for people to play with the metric somehow.
Co-authored-by: Quentin Lhoest <[email protected]>
Updating structure as per @lhoestq 's suggestion
Co-authored-by: lewtun <[email protected]>
This is very cool and will really help people get a better understanding of how our metrics work!
I've left some tiny comments, but otherwise this looks great to me :)
metrics/squad/README.md
Outdated
{'exact_match': 66.66666666666667, 'f1': 66.66666666666667}

## Limitations and bias
This metric works only with the [SQuAD v.1 dataset](https://huggingface.co/datasets/squad) -- it will not work with any other dataset formats.
Isn't it more accurate to say that this metric only works for datasets that have the same schema / format as SQuAD v1?
Well, it's technically a dataset-specific metric, because there are ids that are internal to SQuAD, but you're right that they don't really get checked.
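To make the schema point concrete, here is a small sketch. It assumes the `id` field is only used to pair a prediction with its reference and is never validated against SQuAD itself, as discussed above. The field names follow the SQuAD v1 format; the `pair_by_id` helper and the sample ids are hypothetical.

```python
def pair_by_id(predictions, references):
    """Pair each reference's answer texts with the prediction sharing its id."""
    text_by_id = {p["id"]: p["prediction_text"] for p in predictions}
    return [
        (text_by_id[ref["id"]], ref["answers"]["text"])
        for ref in references
        if ref["id"] in text_by_id
    ]


# The ids are arbitrary strings here, not SQuAD ids: only the shape matters.
predictions = [{"id": "my-dataset-0", "prediction_text": "Paris"}]
references = [
    {
        "id": "my-dataset-0",
        "answers": {"text": ["Paris", "Paris, France"], "answer_start": [0, 0]},
    }
]

pairs = pair_by_id(predictions, references)
```

Under that assumption, any dataset whose records follow the same schema could be scored, which is the distinction the comment above is drawing.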
changing the code formatting as per @lewtun 's comments
Feel free to merge if it's all good for you :)