Non-Autoregressive Machine Translation with Latent Alignments
About
This paper presents two strong methods, CTC and Imputer, for non-autoregressive machine translation that model latent alignments with dynamic programming. We revisit CTC for machine translation and demonstrate that, contrary to what prior work indicates, a simple CTC model can achieve state-of-the-art results for single-step non-autoregressive machine translation. In addition, we adapt the Imputer model for non-autoregressive machine translation and demonstrate that Imputer with just 4 generation steps can match the performance of an autoregressive Transformer baseline. Our latent alignment models are simpler than many existing non-autoregressive translation baselines; for example, we require neither target length prediction nor re-scoring with an autoregressive model. On the competitive WMT'14 En$\rightarrow$De task, our CTC model achieves 25.7 BLEU with a single generation step, while Imputer achieves 27.5 BLEU with 2 generation steps and 28.0 BLEU with 4 generation steps. This compares favourably to the autoregressive Transformer baseline at 27.8 BLEU.
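In CTC-based generation, the model emits one token (or a blank) per output position, producing a latent alignment that is collapsed into the final translation by merging consecutive repeats and dropping blanks. A minimal sketch of this standard CTC collapse step (the function name and blank symbol are illustrative, not from the paper):

```python
def ctc_collapse(alignment, blank="<b>"):
    """Collapse a CTC alignment into an output sequence:
    first merge consecutive repeated tokens, then drop blank tokens."""
    output = []
    prev = None
    for token in alignment:
        if token != prev and token != blank:  # new non-blank token
            output.append(token)
        prev = token
    return output

# Example: a 6-position alignment collapses to a 2-token output.
# ["<b>", "wir", "wir", "<b>", "gehen", "gehen"] -> ["wir", "gehen"]
```

Because repeats are merged before blanks are removed, a blank between two identical tokens preserves both copies, which is how CTC represents doubled words.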
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Machine Translation | WMT En-De 2014 (test) | BLEU 25.8 | 379 |
| Machine Translation | WMT Ro-En 2016 (test) | BLEU 31.7 | 82 |
| Machine Translation | WMT14 En-De newstest2014 (test) | BLEU 25.8 | 65 |
| Machine Translation | WMT De-En 14 (test) | BLEU 28.4 | 59 |
| Machine Translation | WMT16 EN-RO (test) | BLEU 32.3 | 56 |
| Machine Translation | WMT En-Ro 2016 (test) | BLEU 33.28 | 39 |
| Machine Translation | WMT14 DE-EN (test) | BLEU 28.4 | 28 |
| Machine Translation | WMT16 Ro-En (test) | BLEU 31.7 | 27 |
| Machine Translation | WMT'16 En-Ro (test) | BLEU 32.3 | 18 |
| Machine Translation | WMT De⇒En 2021 (test) | BLEU 30.52 | 12 |