Non-Autoregressive Machine Translation with Latent Alignments
About
This paper presents two strong methods, CTC and Imputer, for non-autoregressive machine translation that model latent alignments with dynamic programming. We revisit CTC for machine translation and demonstrate that, contrary to what prior work indicates, a simple CTC model can achieve state-of-the-art results for single-step non-autoregressive machine translation. In addition, we adapt the Imputer model for non-autoregressive machine translation and demonstrate that Imputer with just 4 generation steps can match the performance of an autoregressive Transformer baseline. Our latent alignment models are simpler than many existing non-autoregressive translation baselines; for example, we require neither target length prediction nor re-scoring with an autoregressive model. On the competitive WMT'14 En$\rightarrow$De task, our CTC model achieves 25.7 BLEU with a single generation step, while Imputer achieves 27.5 BLEU with 2 generation steps and 28.0 BLEU with 4 generation steps. This compares favourably to the autoregressive Transformer baseline at 27.8 BLEU.
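In CTC-based generation, the model emits one token (or a blank) per output position, producing a latent alignment that is collapsed into the final translation by merging consecutive repeats and dropping blanks. A minimal sketch of this standard CTC collapse step (the function name and blank symbol are illustrative, not from the paper):

```python
def ctc_collapse(alignment, blank="<b>"):
    """Collapse a CTC alignment into an output sequence:
    first merge consecutive repeated tokens, then drop blank tokens."""
    output = []
    prev = None
    for token in alignment:
        if token != prev and token != blank:  # new non-blank token
            output.append(token)
        prev = token
    return output

# Example: a 6-position alignment collapses to a 2-token output.
# ["<b>", "wir", "wir", "<b>", "gehen", "gehen"] -> ["wir", "gehen"]
```

Because repeats are merged before blanks are removed, a blank between two identical tokens preserves both copies, which is how CTC represents doubled words.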
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Machine Translation | WMT En-De 2014 (test) | BLEU 25.8 | 379 |
| Machine Translation | WMT Ro-En 2016 (test) | BLEU 31.7 | 82 |
| Machine Translation | WMT14 En-De newstest2014 (test) | BLEU 25.8 | 65 |
| Machine Translation | WMT De-En 14 (test) | BLEU 28.4 | 59 |
| Machine Translation | WMT16 EN-RO (test) | BLEU 32.3 | 56 |
| Machine Translation | WMT En-Ro 2016 (test) | BLEU 33.28 | 39 |
| Machine Translation | WMT14 DE-EN (test) | BLEU 28.4 | 28 |
| Machine Translation | WMT16 Ro-En (test) | BLEU 31.7 | 27 |
| Machine Translation | WMT'16 En-Ro (test) | BLEU 32.3 | 18 |
| Machine Translation | WMT De⇒En 2021 (test) | BLEU 30.52 | 12 |