| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Machine Translation | WMT En-De 2014 (test) | BLEU 31.69 | 379 |
| Machine Translation | WMT En-Fr 2014 (test) | BLEU 67.88 | 237 |
| Machine Translation | WMT English-German 2014 (test) | BLEU 31.4 | 136 |
| Machine Translation | WMT 2014 (test) | BLEU 45.6 | 100 |
| Machine Translation | WMT En-De '14 | BLEU 30.15 | 89 |
| Machine Translation | WMT Ro-En 2016 (test) | BLEU 37.8 | 82 |
| Machine Translation | WMT14 En-De newstest2014 (test) | BLEU 30.4 | 65 |
| Unconditional Text Generation | EMNLP 2017 WMT News | Perplexity 36.11 | 64 |
| Machine Translation | WMT De-En 14 (test) | BLEU 34.19 | 59 |
| Machine Translation | WMT 2016 (test) | BLEU 41.37 | 58 |
| Machine Translation | WMT16 English-German (test) | BLEU 41.2 | 58 |
| Machine Translation | WMT16 EN-RO (test) | BLEU 39.1 | 56 |
| Machine Translation | WMT en-fr 14 | BLEU Score 45 | 56 |
| Machine Translation Evaluation | WMT Metrics Shared Task 2024 | SPA 85.5 | 52 |
| Machine Translation | WMT24++ v1.0 (test) | XCOMET Score 90 | 49 |
| Machine Translation | WMT En-De 2017 (test) | BLEU Score 0.307 | 46 |
| Machine Translation | WMT En-Fr newstest 2014 (test) | BLEU 43.4 | 46 |
| Machine Translation | WMT En-De (newstest2014) | BLEU 31.26 | 43 |
| Machine Translation | WMT En-Fr 2014 | BLEU 43.8 | 42 |
| Machine Translation | WMT English-French 2014 (test) | BLEU 45.6 | 41 |
| Judge Alignment | WMT ZH-EN | Pairwise Accuracy 62.4 | 40 |
| Machine Translation | WMT En-Ro 2016 (test) | BLEU 35.4 | 39 |
| Machine Translation | WMT14 English-French (newstest2014) | BLEU 45.9 | 39 |
| Machine Translation | WMT16 German-English (test) | BLEU 40.6 | 39 |
| Machine Translation | WMT En-De 2019 (test) | SacreBLEU 93 | 37 |