Multilingual LLMs are Better Cross-lingual In-context Learners with Alignment

About

In-context learning (ICL) unfolds as large language models become capable of inferring test labels conditioned on a few labeled samples without any gradient update. ICL-enabled large language models provide a promising step forward toward bypassing recurrent annotation costs in a low-resource setting. Yet, only a handful of past studies have explored ICL in a cross-lingual setting, in which the need for transferring label-knowledge from a high-resource language to a low-resource one is immensely crucial. To bridge the gap, we provide the first in-depth analysis of ICL for cross-lingual text classification. We find that the prevalent mode of selecting random input-label pairs to construct the prompt-context is severely limited in the case of cross-lingual ICL, primarily due to the lack of alignment in the input as well as the output spaces. To mitigate this, we propose a novel prompt construction strategy -- Cross-lingual In-context Source-Target Alignment (X-InSTA). With an injected coherence in the semantics of the input examples and a task-based alignment across the source and target languages, X-InSTA is able to outperform random prompt selection by a large margin across three different tasks using 44 different cross-lingual pairs.
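The abstract contrasts random selection of input-label pairs with a prompt built from semantically coherent examples plus a task-level alignment between source and target languages. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: `char_trigrams` is a toy stand-in for a multilingual sentence encoder, and `select_demonstrations`, `build_prompt`, the `Review:`/`Sentiment:` template, and the aligner sentence are all hypothetical names and formats chosen for illustration.

```python
import math
from collections import Counter


def char_trigrams(text):
    """Toy embedding (assumption): character trigram counts.

    A real cross-lingual setup would likely need a multilingual
    sentence encoder here instead.
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_demonstrations(pool, target_text, k, embed=char_trigrams):
    """Pick the k source-language examples most similar to the target input.

    `pool` is a list of (text, label) pairs. This replaces random
    selection with similarity-based selection.
    """
    target_vec = embed(target_text)
    ranked = sorted(
        pool,
        key=lambda example: cosine(embed(example[0]), target_vec),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(demos, target_text, aligner):
    """Join demonstrations, a task-alignment sentence, and the test input."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in demos]
    blocks.append(aligner)  # hypothetical bridge between the two label spaces
    blocks.append(f"Review: {target_text}\nSentiment:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    source_pool = [
        ("the battery died after a week", "negative"),
        ("great sound quality", "positive"),
        ("battery life is excellent", "positive"),
    ]
    target = "la batería es excelente"
    demos = select_demonstrations(source_pool, target, k=2)
    # Hypothetical aligner sentence; the exact wording is an assumption.
    aligner = 'In Spanish, "negativo" means negative and "positivo" means positive.'
    print(build_prompt(demos, target, aligner))
```

Passing the encoder in as the `embed` argument keeps the selection logic independent of how similarity is measured, so the toy trigram vectors can be swapped for a real encoder without touching the rest of the sketch.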

Eshaan Tanwar, Subhabrata Dutta, Manish Borthakur, Tanmoy Chakraborty • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Sentiment Classification | Multilingual Amazon Reviews Corpus (MARC) English (en) (test) | Macro F1 | 0.857 | 24 |
| Sentiment Classification | Multilingual Amazon Reviews Corpus (MARC) Spanish (es) (test) | Macro F1 | 90.6 | 24 |
| Sentiment Classification | Multilingual Amazon Reviews Corpus (MARC) French (fr) (test) | Macro F1 | 87.5 | 24 |
| Sentiment Classification | Multilingual Amazon Reviews Corpus (MARC) Japanese (ja) (test) | Macro F1 | 85.1 | 24 |
| Sentiment Classification | Multilingual Amazon Reviews Corpus (MARC) German (de) (test) | Macro F1 | 38.2 | 24 |
| Sentiment Classification | Multilingual Amazon Reviews Corpus (MARC) Chinese (zh) (test) | Macro F1 | 34.8 | 24 |
| Sentiment Classification | CLS | Accuracy (de) | 58.8 | 16 |
| Hate Speech Detection | HatEval English | Macro F1 | 26.9 | 8 |
| Hate Speech Detection | HatEval Spanish | Macro F1 | 0.542 | 4 |
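
Most of the benchmark rows report Macro F1, the unweighted mean of the per-class F1 scores. As a reminder of what that metric measures, here is a minimal sketch; `macro_f1` is an illustrative helper written for this page, not code from the paper. It returns a fraction in [0, 1], and multiplying by 100 gives the percentage form that leaderboards also commonly use.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); guard against an empty denominator.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gold = ["pos", "pos", "neg", "neg"]
    pred = ["pos", "neg", "neg", "neg"]
    print(f"macro F1 = {macro_f1(gold, pred):.3f}")
```

Because every class counts equally, a model that ignores a minority class is penalized more heavily than it would be under plain accuracy.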
