Controlling Language Confusion in Multilingual LLMs

About

Large language models often suffer from language confusion, a phenomenon in which responses are partially or entirely generated in unintended languages. This critically degrades the user experience, especially in low-resource settings. We hypothesize that this issue stems from limitations in conventional fine-tuning objectives, such as supervised learning, which optimize the likelihood of correct tokens without explicitly penalizing undesired outputs such as cross-lingual mixing. Analysis of loss trajectories during pretraining further reveals that models fail to distinguish between monolingual and language-mixed texts, highlighting the absence of inherent pressure to avoid such confusion. In this work, we apply ORPO, which adds penalties for unwanted output styles to standard SFT, effectively suppressing language-confused generations. ORPO maintains strong language consistency, even under high decoding temperatures, while preserving general QA performance. Our findings suggest that incorporating appropriate penalty terms can effectively mitigate language confusion in multilingual models, particularly in low-resource scenarios.
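The abstract does not spell out the objective, so the following is only a minimal sketch of the general shape of an ORPO-style loss as it is commonly formulated: the usual SFT negative log-likelihood on the preferred response plus a weighted odds-ratio penalty against a dispreferred one. Here it is an assumption that the preferred response is a monolingual answer and the dispreferred one a language-mixed answer. The function names, the weight `lam`, and the numeric inputs are illustrative and not taken from the paper.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp).

    avg_logp is the length-normalized (per-token average) log-probability
    of a response; it must be strictly negative so that p < 1.
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(avg_logp_chosen: float, avg_logp_rejected: float, lam: float = 0.1) -> float:
    """SFT loss on the chosen response plus a weighted odds-ratio penalty.

    lam is a hypothetical weight chosen for illustration only.
    """
    # Standard SFT term: negative average log-likelihood of the chosen response.
    sft = -avg_logp_chosen
    # Log odds ratio between the chosen and the rejected response.
    log_or = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(z)) written as log(1 + exp(-z)).
    penalty = math.log1p(math.exp(-log_or))
    return sft + lam * penalty


# Illustrative values: a monolingual answer (chosen) vs. a language-mixed one (rejected).
loss_confused = orpo_loss(avg_logp_chosen=-0.5, avg_logp_rejected=-0.5)
loss_separated = orpo_loss(avg_logp_chosen=-0.5, avg_logp_rejected=-2.0)
print(loss_confused > loss_separated)
```

With the SFT term alone, both calls would give the same loss, since the likelihood of the chosen response is identical. The penalty term is what lowers the loss when the model assigns less probability to the mixed-language response, which is the kind of explicit pressure the abstract argues plain SFT lacks.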

Nahyun Lee, Yeongseo Woo, Hyunwoo Ko, Guijin Son • 2025

Related benchmarks

Task                               Dataset         Result          Rank
Reasoning                          BBH             Accuracy 49.22  726
Multitask Language Understanding   MMLU            Accuracy 51.58  520
Graduate-level Question Answering  GPQA            Accuracy 30.52  215
Math Word Problem Solving          GSM8K           Accuracy 78.09  158
Mathematical Problem Solving       MATH            Accuracy 43.78  75
Instruction Following              MIF en          Accuracy 66     10
Instruction Following              MIF (target)    Accuracy 43.82  10
Multitask Language Understanding   MMMLU (target)  RPR 57.86       5
Science Q&A                        ARC-C en        Accuracy 82.99  5
Science Question Answering         GPQA EN         Accuracy 31.52  5
Showing 10 of 27 rows
