| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| WMT Metrics Shared Task 2024 | MBR-SCORESIM | SPA | 85.5 | 52 | 4d ago |
| WMT MQM Segment-level 22 | MetricsX-XXL | Score (En-De) | 60.1 | 19 | 4d ago |
| WMT MQM System-level 22 | EAPrompt | Overall Score | 91.2 | 19 | 4d ago |
| WMT segment-level 2019 (test) | BERTScore | Pearson R | 44.5 | 19 | 4d ago |
| TAC summary-level 2008-2011 (test) | FrugalScore | Pearson Correlation (Pyramid) | 67.3 | 19 | 4d ago |
| WMT MQM 2022 (test) | Remedy-R | Accuracy (System, 3 LPs) | 91.6 | 16 | 4d ago |
| WMT 2023 (test) | Distribution-Calibrated Aggregation | MAE (EN→DE) | 0.588 | 12 | 4d ago |
| MSLC OOD 24 | XCOMET | MT Empty Score | 73.79 | 12 | 4d ago |
| WMT17 (test) | ParaBLEU | Kendall Tau | 0.653 | 12 | 4d ago |
| WMT 2019 (test) | BARTSCORE-PROMPT | de-en | 0.238 | 10 | 4d ago |
| WMT Domain 21 | | Correlation | 0.65 | 5 | 4d ago |
| WMT De→En Top 30% 2019 | | Pearson Correlation (\|r\|) | 0.883 | 5 | 4d ago |
| WMT De→En 2019 (All) | DA-BERTScore | Pearson Correlation (\|r\|) | 0.951 | 5 | 4d ago |
| WMT En→De 2019 (Top 30%) | DA-BERTScore | Pearson Correlation (\|r\|) | 0.974 | 5 | 4d ago |
| WMT En→De 2019 (all) | DA-BERTScore | \|r\| | 0.991 | 5 | 4d ago |