Thinking Slow about Latency Evaluation for Simultaneous Machine Translation

About

Simultaneous machine translation attempts to translate a source sentence before the speaker has finished it, with applications to the translation of spoken language for live streaming and conversation. Since simultaneous systems trade quality for reduced latency, an effective and interpretable latency metric is crucial. We introduce a variant of the recently proposed Average Lagging (AL) metric, which we call Differentiable Average Lagging (DAL). It distinguishes itself by being differentiable and internally consistent with its underlying mathematical model.
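
The abstract does not state the formulas, so the following is only a minimal sketch of how AL and DAL are commonly written, not the authors' reference implementation. It assumes `g[i]` is the number of source tokens read before emitting target token `i`, and that the ideal pace is `src_len / len(g)` source tokens per target token. The function names are hypothetical. The sketch uses plain Python floats and makes no attempt at automatic differentiation.

```python
from typing import Sequence


def average_lagging(g: Sequence[float], src_len: int) -> float:
    """AL sketch: average lag up to the first target token that has seen the whole source."""
    tgt_len = len(g)
    step = src_len / tgt_len  # assumed ideal pace: source tokens per target token
    # tau is the 1-based index of the first target token whose delay covers the full source.
    tau = next((i for i, d in enumerate(g, start=1) if d >= src_len), tgt_len)
    return sum(g[i] - i * step for i in range(tau)) / tau


def differentiable_average_lagging(g: Sequence[float], src_len: int) -> float:
    """DAL sketch: each target token costs at least `step`, and the sum runs over all tokens."""
    tgt_len = len(g)
    step = src_len / tgt_len
    total = 0.0
    prev = None
    for i, delay in enumerate(g):
        # Adjusted delay: never less than the previous adjusted delay plus one ideal step.
        adjusted = delay if prev is None else max(delay, prev + step)
        total += adjusted - i * step
        prev = adjusted
    return total / tgt_len


if __name__ == "__main__":
    delays = [1, 1, 4, 4]  # hypothetical read schedule for a 4-token source
    print(average_lagging(delays, src_len=4))
    print(differentiable_average_lagging(delays, src_len=4))
```

Under these assumptions, the two functions agree on a wait-1 schedule and on a read-everything-first schedule, and differ when target tokens are emitted in bursts, since DAL charges a minimum cost for every emitted token instead of truncating the sum.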

Colin Cherry, George Foster • 2019

Related benchmarks

Task	Dataset	Result
Latency Metric Evaluation	IWSLT tst-COMMON (w/o degenerate simultaneous policy) 2022 2023	--	2
Latency Metric Evaluation	IWSLT En-De tst-COMMON w/o degenerate 2022/2023	--	2
Latency Metric Accuracy Evaluation	Long-form SimulST All language pairs	--	1
Latency Metric Evaluation	IWSLT tst-COMMON All system pairs 2022 2023 (All)	--	1
Latency Metric Evaluation	IWSLT tst-COMMON En-Zh w/o degenerate 2022 2023	--	1
Latency Metric Evaluation	IWSLT tst-COMMON 2022/2023 (Same Team w/o degenerate)	--	1
