Skeleton benchmark 1.0 #399
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
As we discussed in the standup, let's go further with this. In particular, I'm hoping to see most or all of the bullet points from issue #398.
Oh, sorry, I didn't realize you had already bumped the modelgauge version in here when I started in on a PR for that. Let's get this merged and then maybe drop my PR if it's duplicative.
Before I dive in to review, could you say how many of #398's bullet points are in this PR?
Looks great! Thanks for going the distance.
The primary difference between 0.5 and 1.0 seems to be the inclusion of additional languages. WG1 says scores from different languages should not be aggregated, so I envision each language being its own benchmark. This will require some refactoring of the modelgauge hazards as well.
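
To make the "one benchmark per language" idea concrete, here is a rough sketch. It does not use the real modelbench or modelgauge APIs: `LanguageBenchmark`, `build_benchmarks`, the hazard names, and the plain-mean scoring are all hypothetical, chosen only to show that aggregation stays inside a single language.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LanguageBenchmark:
    """Hypothetical: one benchmark per language, so scores never mix."""

    language: str
    hazard_scores: dict  # hazard name -> score in [0, 1] (illustrative only)

    def score(self) -> float:
        # Aggregate only across the hazards of this one language.
        # A plain mean is an assumption, not the project's scoring rule.
        return mean(self.hazard_scores.values())


def build_benchmarks(scores_by_language: dict) -> dict:
    # Build one independent benchmark per language.
    # Deliberately no cross-language aggregate is computed anywhere.
    return {
        lang: LanguageBenchmark(lang, dict(hazards))
        for lang, hazards in scores_by_language.items()
    }


# Made-up placeholder numbers, not real results.
benchmarks = build_benchmarks(
    {
        "en": {"hazard_a": 0.8, "hazard_b": 0.6},
        "fr": {"hazard_a": 0.5, "hazard_b": 0.7},
    }
)
```

Each entry in `benchmarks` reports its own `score()`, and nothing combines "en" with "fr", which is the property WG1 is asking for.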