Pull requests: stanford-crfm/helm

Pull requests list

Refactor FLEURS audio scenario to ASR task
#3287 opened Jan 22, 2025 by ImKeTT
Add Legal Opinion Sentiment Classification scenario
#3286 opened Jan 22, 2025 by yifanmai
Adding IMDB_PTBR Scenario
#3284 opened Jan 22, 2025 by thallysonjsa
Include multiple annotators for WildBench
#3283 opened Jan 22, 2025 by liamjxu
Switch table_benchmark wikitq to use 1 shot instead of 5
#3280 opened Jan 21, 2025 by yifanmai
Use original instance IDs in IFEval
#3275 opened Jan 15, 2025 by yifanmai
MedHelm: Add VQA-RAD scenario and specs
#3246 opened Dec 24, 2024 by sashimono-san
IBM Enterprise Scenarios
#3064 opened Oct 16, 2024 by yifanmai Draft
Documentation: Evaluation run lifecycle
#2506 opened Mar 25, 2024 by yifanmai
Remove AdapterSpec from metrics
#2244 opened Jan 17, 2024 by yifanmai Draft
Numeracy scenario update
#1978 opened Nov 2, 2023 by friedeggs