A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation
About
Existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental question: can neural machine translation generate a character sequence without any explicit segmentation? To answer this question, we evaluate an attention-based encoder-decoder with a subword-level encoder and a character-level decoder on four language pairs (En-Cs, En-De, En-Ru and En-Fi) using the parallel corpora from WMT'15. Our experiments show that the models with a character-level decoder outperform the ones with a subword-level decoder on all four language pairs. Furthermore, ensembles of neural models with a character-level decoder outperform the state-of-the-art non-neural machine translation systems on En-Cs, En-De and En-Fi, and perform comparably on En-Ru.
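The core contrast is in how the two sides of the model segment text: the source side uses subword units, while the target side is modelled as a raw character sequence, so no explicit word segmentation is needed when generating. A minimal sketch of the two tokenization schemes, using a hypothetical greedy longest-match segmenter as a stand-in for a real subword algorithm such as BPE (the vocabulary below is illustrative, not from the paper):

```python
def subword_tokens(sentence, vocab):
    """Greedy longest-match subword segmentation (a simplified stand-in for BPE).

    Falls back to single characters when no vocabulary entry matches,
    so every string can be segmented.
    """
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest remaining substring first, shrinking until a match.
        for j in range(len(sentence), i, -1):
            piece = sentence[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


def char_tokens(sentence):
    """Character-level view of the target: every character is its own symbol."""
    return list(sentence)


# Toy subword vocabulary (illustrative only).
vocab = {"low", "er", "lo", "we"}
print(subword_tokens("lower", vocab))  # ['low', 'er']
print(char_tokens("der"))              # ['d', 'e', 'r']
```

The decoder then predicts one character at a time from the second representation, sidestepping the out-of-vocabulary problem that fixed word-level vocabularies have.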
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Machine Translation | WMT14 En-De newstest2014 (test) | BLEU | 23.04 | 65 |
| Machine Translation | WMT En-De (newstest2014) | BLEU | 21.33 | 43 |
| Machine Translation | WMT newstest 2015 (test) | BLEU | 23.45 | 31 |
| Machine Translation | newstest En-De 2015 (test) | BLEU | 25.44 | 11 |
| Machine Translation | newstest En-De 2013 (dev) | BLEU | 23.05 | 10 |
| Machine Translation | newstest En-Cs 2014 (test) | BLEU | 22.15 | 7 |
| Machine Translation | newstest En-Cs 2015 (test2) | BLEU | 18.93 | 7 |
| Machine Translation | newstest En-Ru 2014 (test) | BLEU | 29.37 | 7 |
| Machine Translation | newstest En-Fi 2015 (test) | BLEU | 13.48 | 7 |
| Machine Translation | newstest En-Ru 2015 (test) | BLEU | 23.75 | 7 |