Unbabel's Participation in the WMT20 Metrics Shared Task

About

We present the contribution of the Unbabel team to the WMT 2020 Shared Task on Metrics. We intend to participate on the segment-level, document-level and system-level tracks on all language pairs, as well as the 'QE as a Metric' track. Accordingly, we illustrate results of our models in these tracks with reference to test sets from the previous year. Our submissions build upon the recently proposed COMET framework: We train several estimator models to regress on different human-generated quality scores and a novel ranking model trained on relative ranks obtained from Direct Assessments. We also propose a simple technique for converting segment-level predictions into a document-level score. Overall, our systems achieve strong results for all language pairs on previous test sets and in many cases set a new state-of-the-art.

Ricardo Rei, Craig Stewart, Catarina Farinha, Alon Lavie • 2020

Related benchmarks

Task | Dataset | Metric | Result | Rank
Hallucination Detection | German-English MT All hallucinations | AUC | 70.2 | 8
Hallucination Detection | German-English MT Fully detached hallucinations | AUC | 0.661 | 8
Machine Translation Hallucination Mitigation | German-English MT Hallucination Subsets | F. Score | 59 | 6
Machine Translation Reranking | MT Hallucination German-English (test) | COMET (F) | -0.21 | 6