Generative Imagination Elevates Machine Translation

About

Text and images share common semantics. Given a sentence in a source language, does depicting the visual scene help translation into a target language? Existing multimodal neural machine translation (MNMT) methods require (source sentence, target sentence, image) triplets for training and (source sentence, image) pairs for inference. In this paper, we propose ImagiT, a novel machine translation method via visual imagination. ImagiT first learns to generate a visual representation from the source sentence, and then uses both the source sentence and the "imagined representation" to produce a target translation. Unlike previous methods, it needs only the source sentence at inference time. Experiments demonstrate that ImagiT benefits from visual imagination and significantly outperforms text-only neural machine translation baselines. Further analysis reveals that the imagination process in ImagiT helps fill in missing information when a degradation strategy is applied.
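
The inference-time data flow described above can be sketched as follows. This is a toy illustration only, not the authors' implementation: `embed_source`, `imagine`, `fuse`, and `decoder_input` are hypothetical names, the "encoder" and "imagination" steps are fixed arithmetic stand-ins for learned networks, and the fixed fusion weight `gate` is an assumption. The one point it is meant to show is that no image is passed in at inference.

```python
from typing import List

Vector = List[float]


def embed_source(tokens: List[str], dim: int = 4) -> Vector:
    """Toy text encoder (stand-in for a learned one): deterministic character sums."""
    vec = [0.0] * dim
    for tok in tokens:
        for i, ch in enumerate(tok):
            vec[i % dim] += ord(ch) / 1000.0
    n = max(len(tokens), 1)
    return [v / n for v in vec]


def imagine(text_repr: Vector) -> Vector:
    """Hypothetical 'imagination' step: derive a pseudo-visual vector from text alone.

    In the paper this mapping is learned; here it is a fixed toy transform.
    """
    return [2.0 * v + 0.1 for v in text_repr]


def fuse(text_repr: Vector, visual_repr: Vector, gate: float = 0.5) -> Vector:
    """Blend the text and imagined representations (assumed fixed-weight fusion)."""
    return [(1.0 - gate) * t + gate * v for t, v in zip(text_repr, visual_repr)]


def decoder_input(source_tokens: List[str]) -> Vector:
    """Build what a decoder would consume. Only the source sentence is required."""
    text_repr = embed_source(source_tokens)
    imagined = imagine(text_repr)
    return fuse(text_repr, imagined)


if __name__ == "__main__":
    print(decoder_input(["a", "dog", "runs"]))
```

By contrast, an MNMT-style `decoder_input` in this sketch would take a real image feature vector as a second argument in place of the output of `imagine`.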

Quanyu Long, Mingxuan Wang, Lei Li • 2020

Related benchmarks

Task | Dataset | Result | Rank
Multimodal Machine Translation | Multi30K (test) | BLEU-4 59.9 | 139
Multimodal Machine Translation (English-German) | Multi30K 2016 (test) | BLEU 38.5 | 52
Multimodal Machine Translation | Multi30k En-De 2017 (test) | METEOR 52.4 | 45
Multimodal Machine Translation | Multi30k En-Fr 2017 (test) | METEOR 68.3 | 31
Machine Translation | Multi30k En→Fr v1 2017 (test) | BLEU 52.4 | 30
Multimodal Machine Translation | Multi30k En-Fr 2016 (test) | METEOR 74 | 30
Machine Translation (En-Fr) | Multi30K 2016 (test) | METEOR 74 | 18
Multimodal Machine Translation | MSCOCO Ambiguous EN-DE (test) | BLEU 28.7 | 13
Machine Translation (En-De) | Multi30K MSCOCO | BLEU 28.7 | 12
Machine Translation (En-Fr) | Multi30K MSCOCO | BLEU 45.3 | 9
Showing 10 of 12 rows
