
Span-level machine translation error detection on MQM EN-ZH annotations 2024 (test)

35.95 Precision

MQM #2

Updated 4mo ago

Evaluation Results

Method	Links	Precision	Recall	F1
MQM #2 2026.03		35.95	25.68	29.96
Sonnet 4.5 2026.03		34.31	21.57	26.49
MQM #1 2026.03		29.88	25.49	27.51
Haiku 4.5 2026.03		29.88	15.15	20.11
gpt-oss 120b 2026.03		28.89	21.42	24.6
Qwen3 235b 2026.03		28.88	30.92	29.87
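
The three numeric values in each row are consistent with precision, recall, and F1, in that order. The first value matches the precision reported at the top of the page, and the third matches the harmonic mean of the first two. The sketch below is only a sanity check under that assumption. The helper name `f1_score` and the `rows` list are illustrative and not part of the benchmark's tooling.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same 0-100 scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, first value, second value, third value) copied from the table above.
# Assumption: the values are precision, recall, and F1 respectively.
rows = [
    ("MQM #2", 35.95, 25.68, 29.96),
    ("Sonnet 4.5", 34.31, 21.57, 26.49),
    ("MQM #1", 29.88, 25.49, 27.51),
    ("Haiku 4.5", 29.88, 15.15, 20.11),
    ("gpt-oss 120b", 28.89, 21.42, 24.6),
    ("Qwen3 235b", 28.88, 30.92, 29.87),
]

for method, p, r, reported in rows:
    print(f"{method}: computed F1 = {f1_score(p, r):.2f}, reported = {reported}")
```

Ranking by this third value rather than by precision would change the order: Qwen3 235b, sixth by precision, comes within 0.1 of MQM #2 because its recall is the highest in the table.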