| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| PAWS-X (test) | Sigmoid Head | BCE | 0.325 | 36 | 4d ago |
| WMT en-de 22 | COMET Kiwi | Pearson R | 0.722 | 15 | 4d ago |
| WMT 24 | Sigmoid Head | Pearson Correlation | 0.606 | 12 | 4d ago |
| Mandarin (zh-CN) dialect sentences (test) | NANO | Success Rate | 84 | 11 | 4d ago |
| Portuguese (pt-BR) dialect sentences (test) | NANO | Success Rate | 86 | 11 | 4d ago |
| CoNLL 2014 (test) | GRECO | Pearson Correlation (rho) | 0.445 | 10 | 4d ago |
| En-Ml | ALOPE-RL | Pearson r | 0.583 | 9 | 4d ago |
| WMT en-es 24 | Sigmoid Head | Pearson Correlation | 0.672 | 8 | 4d ago |
| WMT en-de 24 | Sigmoid Head | Pearson Correlation | 0.606 | 8 | 4d ago |
| ParaCrawl | COMET Kiwi | Pearson Correlation | 0.537 | 8 | 4d ago |
| PAWS-X | Sigmoid Head | BCE | 0.883 | 8 | 4d ago |
| WMT 2020 (test) | NANO | QE Score (en-cs) | 71.8 | 6 | 4d ago |
| WMT en-ru 24 | COMET Kiwi | Pearson R | 0.625 | 4 | 4d ago |
| WMT en-cn 24 | Sigmoid Head | Pearson Correlation | 0.537 | 4 | 4d ago |
| WMT en-ko 24 | COMET Kiwi | Pearson Correlation | 0.626 | 4 | 4d ago |
| WMT en-it 24 | COMET Kiwi | Pearson R | 0.588 | 4 | 4d ago |
| WMT en-nl 24 | COMET Kiwi | Pearson Correlation | 0.552 | 4 | 4d ago |
| WMT en-pt 24 | COMET Kiwi | Pearson Correlation | 0.533 | 4 | 4d ago |
| WMT en-fr 24 | COMET Kiwi | Pearson Correlation | 0.53 | 4 | 4d ago |
| BioMQM | Tower | QE Score (en-pt) | 0.067 | 4 | 4d ago |
| TruthfulQA | | BCE | 1.674 | 4 | 4d ago |
| GSM8k | Sigmoid Head | BCE | 0.642 | 4 | 4d ago |
| TruthfulQA (test) | Sigmoid Head | BCE | 1.698 | 3 | 4d ago |
| GSM8K (test) | Sigmoid Head | BCE Loss | 0.642 | 3 | 4d ago |
| En-Ml (test) | ALOPE-RL | Pearson r | 0.334 | 3 | 4d ago |