Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation

About

We propose Dynamic Meta-Metrics (DMM), a framework for machine translation evaluation that learns source-sentence conditioned combinations of existing metrics. Rather than relying on a single static ensemble or language-specific weighting, DMM adapts the metric combination based on properties of the source segment. We study hard conditioning, which fits an interpretable combiner per cluster, and an exploratory soft-conditioned extension whose weights vary continuously with source-cluster responsibilities. We evaluate DMM on the WMT Metrics Shared Task data across multiple language pairs using pairwise agreement measures at the system and segment levels. Across settings, MLP-based combinations outperform linear and Gaussian process-based ensembles, and introducing soft conditioning yields gains over linear models.
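The difference between hard and soft conditioning can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each segment already has a vector of base-metric scores, a source-cluster id, and a row of cluster responsibilities (how these are obtained is not specified here), and it uses a plain least-squares linear combiner. The function names `fit_hard`, `predict_hard`, and `predict_soft` are hypothetical, and blending per-cluster weights by responsibility is only one possible reading of "weights vary continuously with source-cluster responsibilities".

```python
import numpy as np


def _with_bias(X):
    # Append a constant column so each combiner also learns an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_hard(X, y, cluster_ids, n_clusters):
    """Fit one linear metric combiner per source cluster (hard conditioning).

    X: (n, d) base-metric scores, y: (n,) human scores,
    cluster_ids: (n,) integer cluster of each source segment (assumed given).
    Returns an (n_clusters, d + 1) array of weights, bias last.
    """
    A = _with_bias(X)
    weights = np.zeros((n_clusters, A.shape[1]))
    for k in range(n_clusters):
        mask = cluster_ids == k
        # Least-squares fit restricted to the segments in cluster k.
        weights[k], *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
    return weights


def predict_hard(X, cluster_ids, weights):
    # Each segment is scored with the weights of its own cluster only.
    return np.einsum("ij,ij->i", _with_bias(X), weights[cluster_ids])


def predict_soft(X, responsibilities, weights):
    # Assumption: per-segment weights are a responsibility-weighted blend
    # of the per-cluster weights; responsibilities is (n, n_clusters).
    blended = responsibilities @ weights
    return np.einsum("ij,ij->i", _with_bias(X), blended)
```

With one-hot responsibilities the soft prediction reduces to the hard one, which makes the hard case a special case of this sketch.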

Luke Zhang, Justin Vasselli, Aditya Khan, York Hay Ng, En-Shiun Annie Lee • 2026

Related benchmarks

Task	Dataset	Result
Machine Translation Meta-evaluation	WMT EN-CS 2025	Acc*Eq 61.4	17
Machine Translation Meta-evaluation	WMT EN-ZH 2025	Acc*Eq 56.8	17
Machine Translation Meta-evaluation	WMT EN-JA 2025	Acc*Eq 57.3	17
Machine Translation Meta-evaluation	WMT EN-UK 2025	Acc*Eq 0.557	17
