A collection of machine translation benchmarks. Currently it includes:
- WMT news translation
- WMT21 European Low Resource Languages
- TICO-19 translation benchmark
- Test sets from the Tatoeba translation challenge
- FLORES1 data sets
- FLORES-101/FLORES-200 dev and devtest
- Multi30k test sets

License information:
- TICO-19 is publicly available under a Creative Commons CC0 license.
- FLORES-101 and FLORES-200 are licensed under CC-BY-SA.