Machine Translation Meta-evaluation on WMT En-De Metrics Shared Task (System-Level) 2023 (test)
[Chart: Accuracy by date. Highlighted result: RATE, 98.5 Accuracy (Jan 12, 2026). Selectable metrics: Accuracy, Pearson Correlation Coefficient (r).]
Updated 3mo ago
Evaluation Results
Method     Type                  Date     Accuracy  Pearson Correlation Coefficient (r)
RATE       LLM-as-a-Judge        2026.01  98.5      99
COMET      Reference-based M...  2026.01  97        99
GEMBA-MQM  LLM-as-a-Judge        2026.01  97        97.3
M-MAD      LLM-as-a-Judge        2026.01  97        97.9
EAPrompt   LLM-as-a-Judge        2026.01  93.9      96.2
BLEU       Reference-based M...  2026.01  89.4      91.7