
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

About

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual multi-granularity texts. This unified view helps us better understand the existing methods for learning cross-lingual representations. More importantly, inspired by the framework, we propose a new pre-training task based on contrastive learning. Specifically, we regard a bilingual sentence pair as two views of the same meaning and encourage their encoded representations to be more similar than the negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at https://aka.ms/infoxlm.

Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, Ming Zhou • 2020
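The abstract describes the contrastive pre-training task only in prose, so here is a minimal, hypothetical sketch of an InfoNCE-style loss over parallel sentence pairs in the spirit of that description. The function name xlco_loss, the temperature value, and the random encoder stand-ins are illustrative assumptions, not the authors' released implementation; see https://aka.ms/infoxlm for the official code.

```python
# Sketch (not the paper's code) of a contrastive loss over bilingual pairs:
# a translation pair (x, y) is treated as two views of the same meaning, and
# an InfoNCE-style objective pulls their encodings together while pushing
# them away from in-batch negatives.
import torch
import torch.nn.functional as F


def xlco_loss(src_repr: torch.Tensor, tgt_repr: torch.Tensor,
              temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE loss over a batch of parallel sentence representations.

    src_repr, tgt_repr: (batch, dim) encodings of the two sides of the
    bilingual pairs; row i of each tensor is a translation pair.
    """
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    # Similarity of every source sentence to every target sentence;
    # the diagonal holds the true translation pairs (positives).
    logits = src @ tgt.t() / temperature
    labels = torch.arange(src.size(0), device=src.device)
    # Cross-entropy with the diagonal as the target class is the standard
    # InfoNCE estimator, a lower bound on the mutual information between
    # the two views.
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    # Stand-in encodings from any sentence encoder (random for illustration).
    src = torch.randn(8, 768)  # e.g. English sentences
    tgt = torch.randn(8, 768)  # their translations
    print(xlco_loss(src, tgt).item())
```

With a batch of N pairs, this loss lower-bounds the mutual information between the two views by log N minus the loss value, which is the sense in which contrastive training "maximizes mutual information" between translations.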

Related benchmarks

Task | Dataset | Metric | Result | Rank
Natural Language Inference | XNLI (test) | Average Accuracy | 81.4 | 167
Cross-lingual Language Understanding | XTREME | XNLI Accuracy | 81.4 | 38
Natural Language Inference | XNLI 1.0 (test) | Accuracy | 81.4 | 38
Question Answering | MLQA (test) | F1 Score | 73.6 | 35
Cross-lingual Question Answering | MLQA v1.0 (test) | F1 (es) | 75.1 | 34
Semantic Entity Recognition | FUNSD | EN Score | 68.52 | 31
Relation Extraction | FUNSD | EN Performance Score | 36.99 | 16
Cross-lingual Sentence Retrieval | Tatoeba (parallel, 14 language pairs) | -- | -- | 14
Relation Extraction | XFUND v1.0 (test) | FUNSD Score | 0.3679 | 12
Semantic Entity Recognition | XFUND | Accuracy (ZH) | 88.68 | 12

(Showing 10 of 20 rows.)

Other info

Code: https://aka.ms/infoxlm
