
Improving the Transformer Translation Model with Document-Level Context

About

Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena that are problematic for Transformer remains a challenge. In this work, we extend the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder. As large-scale document-level parallel corpora are usually not available, we introduce a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora. Experiments on the NIST Chinese-English datasets and the IWSLT French-English datasets show that our approach significantly improves over Transformer.
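The abstract describes two ideas: a context encoder whose output is merged into the sentence-level encoder and decoder, and two-step training that pretrains on sentence-level data before tuning the new context parameters on document-level data. The sketch below illustrates one plausible form of such an integration, a multi-head attention over context states combined with the sentence states through a learned gate. It is a minimal illustration only: the class and variable names, the exact gating formula, and the freezing snippet are assumptions for exposition, not the authors' released code.

import torch
import torch.nn as nn

class GatedContextAttention(nn.Module):
    """Merges document-level context into sentence-level hidden states
    via attention plus a per-position sigmoid gate (illustrative sketch)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.context_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, h: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        # h:   (batch, src_len, d_model) states of the current sentence
        # ctx: (batch, ctx_len, d_model) states of the document-level context
        c, _ = self.context_attn(query=h, key=ctx, value=ctx)
        lam = torch.sigmoid(self.gate(torch.cat([h, c], dim=-1)))  # gate in [0, 1]
        return lam * h + (1.0 - lam) * c  # convex mix of sentence and context

if __name__ == "__main__":
    layer = GatedContextAttention(d_model=512, n_heads=8)
    h = torch.randn(2, 10, 512)    # current sentence
    ctx = torch.randn(2, 30, 512)  # preceding sentences in the document
    print(layer(h, ctx).shape)     # torch.Size([2, 10, 512])

    # Two-step training, per the abstract: first train a standard Transformer
    # on abundant sentence-level data, then freeze those parameters and train
    # only the context modules (here, `layer`) on the limited document-level
    # corpus. A hypothetical base layer stands in for the pretrained model:
    base = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
    for p in base.parameters():
        p.requires_grad = False  # step 2 updates only the context parameters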

Jiacheng Zhang, Huanbo Luan, Maosong Sun, Feifei Zhai, Jingfang Xu, Min Zhang, Yang Liu • 2018

Related benchmarks

Task                                 Dataset                          Result             Rank
Long-form Question Answering         ELI5                             --                 32
Machine Translation                  BConTrasT De=>En (test)          BLEU 60.08         28
Machine Translation                  BMELD Ch=>En (test)              BLEU 22.26         28
Machine Translation                  BMELD En=>Ch (test)              BLEU 27.13         28
En-De Chat Translation               BConTrasT (test)                 BLEU 58.94         16
Document-Level Machine Translation   IWSLT Fr-En 2010 (test)          BLEU 36.85         15
Machine Translation                  NIST Zh-En sacreBLEU (test)      sacreBLEU 47.28    6
Machine Translation                  IWSLT En-De sacreBLEU (test)     sacreBLEU 28.74    6
Knowledge Grounded Dialogue          Wizard of Wikipedia              F1 Score 34.61     6
