XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
About
In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection and translation replaced token detection. In addition, we pretrain the model, named XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks at a much lower computation cost. Moreover, analysis shows that XLM-E tends to obtain better cross-lingual transferability.
Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei • 2021
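Both pre-training tasks are ELECTRA-style replaced token detection (RTD) objectives: a small generator fills in masked tokens, and a discriminator predicts for every position whether the token was replaced. Multilingual RTD applies this to monolingual text from many languages; translation RTD applies it to a sentence concatenated with its translation, so the discriminator can use the other language as context. The sketch below illustrates the two objectives in toy PyTorch code; the vocabulary size, tiny encoders, helper names (`TinyEncoder`, `rtd_step`), and the λ = 50 loss weight borrowed from ELECTRA are illustrative assumptions, not the paper's actual architecture or code.

```python
import torch
import torch.nn as nn

VOCAB_SIZE, HIDDEN, MASK_ID = 1000, 64, 3  # toy sizes; the real model is far larger


class TinyEncoder(nn.Module):
    """Stand-in for a Transformer encoder (embeddings + a single encoder layer)."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
        layer = nn.TransformerEncoderLayer(HIDDEN, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, ids):
        return self.encoder(self.embed(ids))


class Generator(nn.Module):
    """Small masked language model that proposes replacements for masked tokens."""

    def __init__(self):
        super().__init__()
        self.encoder, self.lm_head = TinyEncoder(), nn.Linear(HIDDEN, VOCAB_SIZE)

    def forward(self, ids):
        return self.lm_head(self.encoder(ids))  # (batch, seq, vocab) logits


class Discriminator(nn.Module):
    """Predicts, per token, whether it was replaced by the generator."""

    def __init__(self):
        super().__init__()
        self.encoder, self.rtd_head = TinyEncoder(), nn.Linear(HIDDEN, 1)

    def forward(self, ids):
        return self.rtd_head(self.encoder(ids)).squeeze(-1)  # (batch, seq) logits


def rtd_step(generator, discriminator, ids, mask_prob=0.15):
    """One replaced-token-detection step: mask, fill with generator samples, detect."""
    masked = torch.rand(ids.shape) < mask_prob
    gen_input = ids.masked_fill(masked, MASK_ID)
    gen_logits = generator(gen_input)
    # Sample replacements from the generator (no gradient flows through sampling).
    samples = torch.distributions.Categorical(logits=gen_logits.detach()).sample()
    corrupted = torch.where(masked, samples, ids)
    # Label each token as replaced (1) or original (0); sampling the true token counts as original.
    labels = (corrupted != ids).float()
    mlm_loss = nn.functional.cross_entropy(gen_logits[masked], ids[masked])
    rtd_loss = nn.functional.binary_cross_entropy_with_logits(
        discriminator(corrupted), labels
    )
    return mlm_loss, rtd_loss


# Multilingual RTD: monolingual sentences drawn from many languages.
mono = torch.randint(4, VOCAB_SIZE, (2, 16))
# Translation RTD: a sentence and its translation concatenated into one sequence.
src, tgt = torch.randint(4, VOCAB_SIZE, (2, 16)), torch.randint(4, VOCAB_SIZE, (2, 16))
pair = torch.cat([src, tgt], dim=1)

gen, disc = Generator(), Discriminator()
for batch in (mono, pair):
    mlm_loss, rtd_loss = rtd_step(gen, disc, batch)
    (mlm_loss + 50 * rtd_loss).backward()  # lambda-weighted sum, as in ELECTRA
```

Because the discriminator only solves a binary per-token classification problem rather than predicting over the full vocabulary, every position contributes to the loss, which is the source of the computation savings the abstract refers to.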
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Natural Language Inference | XNLI (test) | Average Accuracy | 83.7 | 167 |
| Cross-lingual Language Understanding | XTREME | XNLI Accuracy | 76.6 | 38 |
| Question Answering | MLQA (test) | F1 Score | 76.2 | 35 |
| Cross-lingual sentence retrieval | Tatoeba (14 parallel language pairs) | Accuracy | 80.6 | 14 |
| Word Alignment | EuroParl en-de, en-fr, en-hi, en-ro; WPT2003, WPT2005 | AER (en-de) | 16.49 | 12 |
| Cross-lingual sentence retrieval (en → xx) | Tatoeba-36 | Accuracy@1 | 68.6 | 11 |
| Cross-lingual sentence retrieval (xx → en) | Tatoeba-36 | Average Accuracy@1 | 67.3 | 11 |
| Cross-lingual Transfer | XTREME (test) | MLQA | 19.2 | 6 |