| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| PAWS-X (test) | Sigmoid Head | BCE | 0.325 | 36 | 1mo ago |
| WMT en-de 22 | COMET Kiwi | Pearson R | 0.722 | 15 | 1mo ago |
| WMT 24 | Sigmoid Head | Pearson Correlation | 0.606 | 12 | 1mo ago |
| Mandarin (zh-CN) dialect sentences (test) | NANO | Success Rate | 84 | 11 | 1mo ago |
| Portuguese (pt-BR) dialect sentences (test) | NANO | Success Rate | 86 | 11 | 1mo ago |
| CoNLL 2014 (test) | GRECO | Pearson Correlation (rho) | 0.445 | 10 | 1mo ago |
| En-Ml | ALOPE-RL | Pearson r | 0.583 | 9 | 1mo ago |
| WMT en-es 24 | Sigmoid Head | Pearson Correlation | 0.672 | 8 | 1mo ago |
| WMT en-de 24 | Sigmoid Head | Pearson Correlation | 0.606 | 8 | 1mo ago |
| ParaCrawl | COMET Kiwi | Pearson Correlation | 0.537 | 8 | 1mo ago |
| PAWS-X | Sigmoid Head | BCE | 0.883 | 8 | 1mo ago |
| English-Indic QE Tourism domain | | QE Score (En-Hi) | 0.737 | 6 | 1mo ago |
| Legal domain English-Indic QE | | En-Ta QE Score | 0.749 | 6 | 1mo ago |
| Healthcare domain English-Indic QE | | QE Score (En-Hi) | 0.611 | 6 | 1mo ago |
| English-Indic QE General domain | | QE Score (En-Hi) | 0.563 | 6 | 1mo ago |
| WMT 2020 (test) | NANO | QE Score (en-cs) | 71.8 | 6 | 1mo ago |
| WMT RU-EN 2021 (test) | DAG 1 | Pearson Correlation | 47.16 | 5 | 22d ago |
| WMT RO-EN 2021 (test) | DAG 2 | Pearson Correlation | 84.4 | 5 | 22d ago |
| WMT EN-ZH 2021 (test) | DAG 2 | Pearson Correlation | 0.366 | 5 | 22d ago |
| WMT EN-DE 2021 (test) | DAG 1 | Pearson Correlation | 51.9 | 5 | 22d ago |
| Surrey Low-Resource dataset (Overall) | TransQuest | Spearman Correlation | 0.592 | 4 | 1mo ago |
| WMT en-ru 24 | COMET Kiwi | Pearson R | 0.625 | 4 | 1mo ago |
| WMT en-cn 24 | Sigmoid Head | Pearson Correlation | 0.537 | 4 | 1mo ago |
| WMT en-ko 24 | COMET Kiwi | Pearson Correlation | 0.626 | 4 | 1mo ago |
| WMT en-it 24 | COMET Kiwi | Pearson R | 0.588 | 4 | 1mo ago |