Dense Information Flow for Neural Machine Translation

About

Recently, neural machine translation (NMT) has achieved remarkable progress by introducing well-designed deep neural networks into its encoder-decoder framework. From the optimization perspective, most of these deep architectures adopt residual connections to improve learning performance for both the encoder and the decoder, and apply advanced attention connections as well. Inspired by the success of the DenseNet model in computer vision problems, in this paper we propose a densely connected NMT architecture (DenseNMT) that trains more efficiently. The proposed DenseNMT not only allows dense connections in creating new features for both the encoder and the decoder, but also uses a dense attention structure to improve attention quality. Our experiments on multiple datasets show that the DenseNMT structure is more competitive and efficient.

Yanyao Shen, Xu Tan, Di He, Tao Qin, Tie-Yan Liu • 2018
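
The dense connection mentioned in the abstract follows the DenseNet idea: each layer reads the concatenation of the block input and the outputs of all earlier layers, rather than only the previous layer's output. The sketch below illustrates that connectivity pattern only and is not the authors' implementation. The names `dense_block`, `num_layers`, and `growth` are hypothetical, the weights are random placeholders, and a plain linear map with ReLU stands in for whatever layer type the paper actually uses.

```python
import random


def linear(x, weights):
    """Apply a bias-free linear map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in x]


def dense_block(x, num_layers, growth, rng):
    """DenseNet-style block (illustrative assumption, not the paper's code).

    Each layer takes the concatenation of the block input and every
    earlier layer's output, and contributes `growth` new features.
    """
    features = [x]
    for _ in range(num_layers):
        # Concatenate everything produced so far.
        concat = [v for f in features for v in f]
        # Placeholder random weights; a trained model would learn these.
        weights = [[rng.uniform(-0.1, 0.1) for _ in concat]
                   for _ in range(growth)]
        features.append(relu(linear(concat, weights)))
    # In this sketch the block returns the concatenation of all features.
    return [v for f in features for v in f]


rng = random.Random(0)
x = [0.5, -1.0, 0.25, 2.0]
out = dense_block(x, num_layers=3, growth=2, rng=rng)
print(len(out))  # 4 input features + 3 layers * 2 new features = 10
```

A residual connection would instead add each layer's output to its input, keeping the feature width fixed. With dense connections the width grows by `growth` per layer, and the original input is still directly visible to every later layer.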

Related benchmarks

| Task | Dataset | Result (BLEU) | Rank |
| --- | --- | --- | --- |
| Machine Translation | WMT En-De 2014 (test) | 25.5 | 379 |
| Machine Translation | WMT English-German 2014 (test) | 25.52 | 136 |
| Machine Translation | IWSLT German-to-English '14 (test) | 32.26 | 110 |
| Machine Translation | IWSLT Turkish-English (tst2011) | 23.33 | 10 |
| Machine Translation | IWSLT Turkish-English 2012 (test) | 24.65 | 10 |
| Machine Translation | IWSLT Turkish-English (tst2013) | 24.92 | 10 |
| Machine Translation | IWSLT Turkish-English (tst2014) | 24.54 | 10 |
| Machine Translation | Turkish-English | 24.36 | 3 |
