
Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

About

Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL). In this paper we highlight that, despite its widespread adoption, AL provides underestimated scores for systems that generate longer predictions compared to the corresponding references. We also show that this problem has practical relevance, as recent SimulST systems have indeed a tendency to over-generate. As a solution, we propose LAAL (Length-Adaptive Average Lagging), a modified version of the metric that takes into account the over-generation phenomenon and allows for unbiased evaluation of both under-/over-generating systems.

Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Latency Metric Evaluation | IWSLT tst-COMMON (w/o degenerate simultaneous policy) 2022 2023 | -- | 2 |
| Latency Metric Evaluation | IWSLT En-De tst-COMMON w/o degenerate 2022/2023 | -- | 2 |
| Latency Metric Accuracy Evaluation | Long-form SimulST All language pairs | -- | 1 |
| Latency Metric Evaluation | IWSLT tst-COMMON All system pairs 2022 2023 (All) | -- | 1 |
| Latency Metric Evaluation | IWSLT tst-COMMON En-Zh w/o degenerate 2022 2023 | -- | 1 |
| Latency Metric Evaluation | IWSLT tst-COMMON 2022/2023 (Same Team w/o degenerate) | -- | 1 |
