
Soft Language Clustering for Multilingual Model Pre-training

About

Multilingual pre-trained language models have demonstrated impressive (zero-shot) cross-lingual transfer abilities; however, their performance is hindered when the target language is typologically distant from the source languages or when pre-training data is limited in size. In this paper, we propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally. XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods. On the XTREME tasks, including text classification, sequence labeling, question answering, and sentence retrieval, both base- and large-size language models pre-trained with our proposed method exhibit consistent performance improvement. Furthermore, XLM-P provides substantial advantages for low-resource languages in unsupervised sentence retrieval, and for target languages that differ greatly from the source language in cross-lingual transfer.
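The core idea of "contextually retrieving prompts" can be illustrated with a small sketch. Assuming the model keeps a pool of learnable prompt vectors with associated keys (the names, sizes, and mean-pooling step below are illustrative assumptions, not the paper's exact architecture), each instance softly attends over the pool and the resulting mixture of prompt tokens is prepended to its encoder input:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- chosen for illustration, not from the paper.
d_model, n_prompts, prompt_len = 16, 4, 2

# Pool of learnable prompts: one retrieval key and prompt_len token
# embeddings per prompt.
prompt_keys = rng.standard_normal((n_prompts, d_model))
prompt_values = rng.standard_normal((n_prompts, prompt_len, d_model))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def retrieve_prompt(instance_repr):
    """Softly retrieve a prompt conditioned on the instance representation."""
    scores = prompt_keys @ instance_repr      # similarity to each prompt key
    weights = softmax(scores)                 # soft (cluster) assignment
    # Weighted mixture of prompt token embeddings: shape (prompt_len, d_model)
    return np.einsum("k,kld->ld", weights, prompt_values), weights

def encode_with_prompt(token_embeddings):
    """Prepend the retrieved soft prompt to the instance's token embeddings."""
    instance_repr = token_embeddings.mean(axis=0)  # simple mean pooling
    prompt, weights = retrieve_prompt(instance_repr)
    return np.concatenate([prompt, token_embeddings], axis=0), weights

tokens = rng.standard_normal((5, d_model))    # a 5-token instance
augmented, weights = encode_with_prompt(tokens)
print(augmented.shape)                        # (7, 16): prompt_len + 5 tokens
```

Because the assignment is soft rather than hard, instances from different languages can share prompts to the extent that their representations are similar, which is one way to read the "soft language clustering" in the title.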

Jiali Zeng, Yufan Jiang, Yongjing Yin, Yi Jing, Fandong Meng, Binghuai Lin, Yunbo Cao, Jie Zhou • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Cross-lingual Language Understanding | XTREME | XNLI Accuracy: 81.2 | 38 |
| Cross-lingual sentence retrieval | Tatoeba (14 language pairs) | Accuracy: 77.2 | 14 |
| Cross-lingual sentence retrieval (en → xx) | Tatoeba-36 | Accuracy@1: 76.4 | 11 |
| Cross-lingual sentence retrieval (xx → en) | Tatoeba-36 | Average Accuracy@1: 69 | 11 |

Other info

Code
