XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
About
In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection and translation replaced token detection. In addition, we pretrain the model, named XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks at a much lower computation cost. Moreover, analysis shows that XLM-E tends to obtain better cross-lingual transferability.
Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei • 2021
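Both pre-training tasks are ELECTRA-style replaced token detection (RTD) objectives: a small generator fills in masked tokens, and a discriminator predicts for every position whether the token was replaced. Multilingual RTD applies this to monolingual text from many languages; translation RTD applies it to a sentence concatenated with its translation, so the discriminator can use the other language as context. The sketch below illustrates the two objectives in toy PyTorch code; the vocabulary size, tiny encoders, helper names (`TinyEncoder`, `rtd_step`), and the λ = 50 loss weight borrowed from ELECTRA are illustrative assumptions, not the paper's actual architecture or code.

```python
import torch
import torch.nn as nn

VOCAB_SIZE, HIDDEN, MASK_ID = 1000, 64, 3  # toy sizes; the real model is far larger


class TinyEncoder(nn.Module):
    """Stand-in for a Transformer encoder (embeddings + a single encoder layer)."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
        layer = nn.TransformerEncoderLayer(HIDDEN, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, ids):
        return self.encoder(self.embed(ids))


class Generator(nn.Module):
    """Small masked language model that proposes replacements for masked tokens."""

    def __init__(self):
        super().__init__()
        self.encoder, self.lm_head = TinyEncoder(), nn.Linear(HIDDEN, VOCAB_SIZE)

    def forward(self, ids):
        return self.lm_head(self.encoder(ids))  # (batch, seq, vocab) logits


class Discriminator(nn.Module):
    """Predicts, per token, whether it was replaced by the generator."""

    def __init__(self):
        super().__init__()
        self.encoder, self.rtd_head = TinyEncoder(), nn.Linear(HIDDEN, 1)

    def forward(self, ids):
        return self.rtd_head(self.encoder(ids)).squeeze(-1)  # (batch, seq) logits


def rtd_step(generator, discriminator, ids, mask_prob=0.15):
    """One replaced-token-detection step: mask, fill with generator samples, detect."""
    masked = torch.rand(ids.shape) < mask_prob
    gen_input = ids.masked_fill(masked, MASK_ID)
    gen_logits = generator(gen_input)
    # Sample replacements from the generator (no gradient flows through sampling).
    samples = torch.distributions.Categorical(logits=gen_logits.detach()).sample()
    corrupted = torch.where(masked, samples, ids)
    # Label each token as replaced (1) or original (0); sampling the true token counts as original.
    labels = (corrupted != ids).float()
    mlm_loss = nn.functional.cross_entropy(gen_logits[masked], ids[masked])
    rtd_loss = nn.functional.binary_cross_entropy_with_logits(
        discriminator(corrupted), labels
    )
    return mlm_loss, rtd_loss


# Multilingual RTD: monolingual sentences drawn from many languages.
mono = torch.randint(4, VOCAB_SIZE, (2, 16))
# Translation RTD: a sentence and its translation concatenated into one sequence.
src, tgt = torch.randint(4, VOCAB_SIZE, (2, 16)), torch.randint(4, VOCAB_SIZE, (2, 16))
pair = torch.cat([src, tgt], dim=1)

gen, disc = Generator(), Discriminator()
for batch in (mono, pair):
    mlm_loss, rtd_loss = rtd_step(gen, disc, batch)
    (mlm_loss + 50 * rtd_loss).backward()  # lambda-weighted sum, as in ELECTRA
```

Because the discriminator only solves a binary per-token classification problem rather than predicting over the full vocabulary, every position contributes to the loss, which is the source of the computation savings the abstract refers to.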
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Natural Language Inference | XNLI (test) | Average Accuracy | 83.7 | 167 |
| Cross-lingual Language Understanding | XTREME | XNLI Accuracy | 76.6 | 38 |
| Question Answering | MLQA (test) | F1 Score | 76.2 | 35 |
| Cross-lingual sentence retrieval | Tatoeba (14 parallel language pairs) | Accuracy | 80.6 | 14 |
| Word Alignment | EuroParl en-de, en-fr, en-hi, en-ro; WPT2003, WPT2005 | AER (en-de) | 16.49 | 12 |
| Cross-lingual sentence retrieval (en → xx) | Tatoeba-36 | Accuracy@1 | 68.6 | 11 |
| Cross-lingual sentence retrieval (xx → en) | Tatoeba-36 | Average Accuracy@1 | 67.3 | 11 |
| Cross-lingual Transfer | XTREME (test) | MLQA | 19.2 | 6 |