End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
About
Autoregressive decoding is the only part of sequence-to-sequence models that prevents them from massive parallelization at inference time. Non-autoregressive models enable the decoder to generate all output symbols independently in parallel. We present a novel non-autoregressive architecture based on connectionist temporal classification and evaluate it on the task of neural machine translation. Unlike other non-autoregressive methods which operate in several steps, our model can be trained end-to-end. We conduct experiments on the WMT English-Romanian and English-German datasets. Our models achieve a significant speedup over the autoregressive models, keeping the translation quality comparable to other non-autoregressive models.
Jindřich Libovický, Jindřich Helcl • 2018
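The abstract does not spell out how connectionist temporal classification (CTC) turns independently predicted symbols into a translation. As a rough illustration only, the sketch below applies the standard CTC output rule: pick a symbol at every position independently, merge consecutive repeats, then drop the blank symbol. The function names, the blank id `0`, and the plain-list score format are assumptions made for this example and are not taken from the paper.

```python
from itertools import groupby

BLANK = 0  # assumed id of the CTC blank symbol (hypothetical choice)


def ctc_collapse(frame_ids, blank=BLANK):
    """Merge consecutive repeated symbols, then drop blanks (standard CTC rule)."""
    return [sym for sym, _ in groupby(frame_ids) if sym != blank]


def greedy_decode(scores, blank=BLANK):
    """Choose the best symbol at each position independently, then collapse.

    `scores` is a list of per-position score lists, one score per symbol id.
    """
    # No position depends on another position's choice, so these argmax
    # operations could all be computed in parallel.
    best = [max(range(len(row)), key=row.__getitem__) for row in scores]
    return ctc_collapse(best, blank)
```

For example, `ctc_collapse([0, 3, 3, 0, 3, 5, 5, 0])` returns `[3, 3, 5]`: the two adjacent `3`s merge into one, while the `3` after the blank is kept as a separate output symbol.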
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Machine Translation | WMT En-De 2014 (test) | BLEU 16.56 | 379 |
| Machine Translation | WMT En-De '14 | BLEU 17.68 | 89 |
| Machine Translation | WMT Ro-En 2016 (test) | BLEU 24.67 | 82 |
| Machine Translation | WMT14 En-De newstest2014 (test) | BLEU 16.56 | 65 |
| Machine Translation | WMT De-En 14 (test) | BLEU 18.64 | 59 |
| Machine Translation | WMT16 EN-RO (test) | BLEU 19.54 | 56 |
| Machine Translation | WMT De-En 14 | BLEU 19.8 | 33 |
| Machine Translation | WMT Ro-En '16 | BLEU 24.71 | 28 |
| Machine Translation | WMT EN-RO 2016 | BLEU 19.93 | 28 |
| Machine Translation | WMT14 DE-EN (test) | BLEU 18.64 | 28 |