Thinking Slow about Latency Evaluation for Simultaneous Machine Translation
About
Simultaneous machine translation attempts to translate a source sentence before it is finished being spoken, with applications to translation of spoken language for live streaming and conversation. Since simultaneous systems trade quality to reduce latency, having an effective and interpretable latency metric is crucial. We introduce a variant of the recently proposed Average Lagging (AL) metric, which we call Differentiable Average Lagging (DAL). It distinguishes itself by being differentiable and internally consistent with its underlying mathematical model.
Colin Cherry, George Foster • 2019
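The abstract names AL and DAL without stating them. As a rough illustration, the sketch below follows the definitions as they are commonly written in the simultaneous-translation literature. This is an assumption, since this page does not give the formulas. Here g(i) is the number of source tokens read before target token i is written, and γ = |y|/|x|. AL averages g(i) - (i-1)/γ up to the first target token written after the whole source has been read. DAL averages the same quantity over all target tokens, after replacing g(i) with g'(i) = max(g(i), g'(i-1) + 1/γ) so that each written token costs at least 1/γ. The function names and the `delays` argument are hypothetical.

```python
from typing import Sequence


def _check(delays: Sequence[float], src_len: int) -> None:
    if not delays or src_len <= 0:
        raise ValueError("need at least one target token and a non-empty source")


def average_lagging(delays: Sequence[float], src_len: int) -> float:
    """AL sketch: delays[i] is the number of source tokens read before
    target token i+1 is written (g(i) in the usual notation)."""
    _check(delays, src_len)
    tgt_len = len(delays)
    gamma = tgt_len / src_len
    # tau: 1-based index of the first target token written after the full
    # source was read. Falling back to tgt_len when no delay reaches src_len
    # is an assumption of this sketch.
    tau = next(
        (i for i, g in enumerate(delays, start=1) if g >= src_len), tgt_len
    )
    return sum(delays[i - 1] - (i - 1) / gamma for i in range(1, tau + 1)) / tau


def differentiable_average_lagging(delays: Sequence[float], src_len: int) -> float:
    """DAL sketch: no cutoff index; instead each target token is forced to
    lag at least 1/gamma behind the previous one."""
    _check(delays, src_len)
    tgt_len = len(delays)
    gamma = tgt_len / src_len
    total = 0.0
    prev = 0.0
    for i, g in enumerate(delays):  # i is the 0-based index, i.e. (i - 1) in 1-based notation
        g_prime = g if i == 0 else max(g, prev + 1.0 / gamma)
        total += g_prime - i / gamma
        prev = g_prime
    return total / tgt_len
```

For a wait-3 policy with five source and five target tokens (`delays = [3, 4, 5, 5, 5]`), both functions return 3.0. For a policy that reads the whole source first (`delays = [5, 5, 5, 5, 5]`), both return 5.0.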
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Latency Metric Evaluation | IWSLT tst-COMMON (w/o degenerate simultaneous policy) 2022 2023 | -- | 2 |
| Latency Metric Evaluation | IWSLT En-De tst-COMMON w/o degenerate 2022/2023 | -- | 2 |
| Latency Metric Accuracy Evaluation | Long-form SimulST All language pairs | -- | 1 |
| Latency Metric Evaluation | IWSLT tst-COMMON All system pairs 2022 2023 (All) | -- | 1 |
| Latency Metric Evaluation | IWSLT tst-COMMON En-Zh w/o degenerate 2022 2023 | -- | 1 |
| Latency Metric Evaluation | IWSLT tst-COMMON 2022/2023 (Same Team w/o degenerate) | -- | 1 |