Neural Machine Translation by Jointly Learning to Align and Translate

About

Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
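The (soft-)search described above can be illustrated with a minimal sketch. The abstract itself does not give a scoring formula, so the additive form used here, score_j = v_a · tanh(W_a s + U_a h_j), is an assumption, as are all names (soft_align, decoder_state, annotations, W_a, U_a, v_a) and the toy values. It uses only the Python standard library and is not the authors' implementation.

```python
import math


def soft_align(decoder_state, annotations, W_a, U_a, v_a):
    """Return (weights, context) for one target position.

    Assumed scoring form: score_j = v_a . tanh(W_a s + U_a h_j).
    weights = softmax(scores); context = sum_j weights[j] * h_j.
    """
    def matvec(matrix, vector):
        # Plain matrix-vector product on nested lists.
        return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

    projected_state = matvec(W_a, decoder_state)

    # One unnormalised relevance score per source position.
    scores = []
    for h in annotations:
        projected_h = matvec(U_a, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(v * t for v, t in zip(v_a, hidden)))

    # Numerically stable softmax: every source position keeps some weight,
    # so no hard segment of the source is ever selected.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted average of the source annotations.
    dim = len(annotations[0])
    context = [sum(w * h[k] for w, h in zip(weights, annotations))
               for k in range(dim)]
    return weights, context


# Toy usage: three source positions with 2-dimensional annotations.
weights, context = soft_align(
    decoder_state=[0.5, -0.2],
    annotations=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    W_a=[[0.1, 0.2], [0.3, 0.4]],
    U_a=[[0.5, -0.5], [0.2, 0.1]],
    v_a=[1.0, -1.0],
)
print(weights, context)
```

Because the context vector is recomputed for every target word, the decoder in this sketch is not limited to a single fixed-length summary of the source sentence, which is the bottleneck the abstract conjectures.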

Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio • 2014

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Multivariate Forecasting	ETTh1	MSE	0.991	830
Multivariate Time-series Forecasting	ETTm1	MSE	0.444	686
Multivariate long-term series forecasting	ETTh2	MSE	1.552	445
Long-term time-series forecasting	ETTh1 (test)	MSE	0.114	410
Hallucination Detection	TriviaQA (test)	AUC-ROC	42	243
Machine Translation	WMT En-Fr 2014 (test)	BLEU	28.45	237
Machine Translation	IWSLT De-En 2014 (test)	BLEU	29.98	146
Multimodal Machine Translation	Multi30K (test)	BLEU-4	33.7	139
Speech Recognition	WSJ (92-eval)	WER	16	131
Scene Text Recognition	SVT 647 (test)	Accuracy	85.9	101

Showing 10 of 133 rows
