
SASFT: Sparse Autoencoder-guided Supervised Finetuning to Mitigate Unexpected Code-Switching in LLMs

About

Large Language Models (LLMs) have impressive multilingual capabilities, but they suffer from unexpected code-switching, also known as language mixing, in which the model switches to an unexpected language in its response. This problem hurts readability and degrades the usability of model responses. However, existing work on this issue lacks a mechanistic analysis and shows limited effectiveness. In this paper, we first provide an in-depth analysis of unexpected code-switching using sparse autoencoders and find that when LLMs switch to a language, the features of that language exhibit excessive pre-activation values. Based on these findings, we propose Sparse Autoencoder-guided Supervised Finetuning (SASFT), which teaches LLMs to maintain appropriate pre-activation values of specific language features during training. Experiments on five models across three languages demonstrate that SASFT consistently reduces unexpected code-switching by more than 50% compared to standard supervised fine-tuning, with complete elimination in one case. Moreover, SASFT maintains or even improves the models' performance on six multilingual benchmarks, showing its effectiveness in addressing code-switching while preserving multilingual capabilities. The code and data are available at https://github.com/Aatrox103/SASFT.
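The training idea described above can be sketched numerically: a sparse autoencoder's pre-activation for a feature is a linear map of the hidden state, and the auxiliary objective penalizes that value whenever it exceeds a threshold. The following is a minimal numpy sketch under assumed toy shapes; the hinge form of the penalty, the threshold `tau`, and the weight `lam` are illustrative assumptions, not the paper's exact formulation or hyperparameters.

```python
import numpy as np

def sae_preactivation(h, W_enc, b_enc):
    """SAE feature pre-activations for hidden states h (n_tokens x d_model):
    h @ W_enc.T + b_enc, i.e. the encoder output before the nonlinearity."""
    return h @ W_enc.T + b_enc

def language_feature_penalty(h, W_enc, b_enc, feature_idx, tau):
    """Hinge penalty on one language feature's pre-activation.

    Each token pays max(0, pre_activation - tau), so the loss is zero
    while the feature stays below the threshold tau."""
    pre = sae_preactivation(h, W_enc, b_enc)[:, feature_idx]
    return np.maximum(0.0, pre - tau).mean()

def sasft_style_loss(ce_loss, h, W_enc, b_enc, feature_idx, tau, lam):
    """Standard SFT cross-entropy plus the auxiliary penalty, weighted by lam."""
    return ce_loss + lam * language_feature_penalty(h, W_enc, b_enc, feature_idx, tau)

# Toy example: 2 tokens, 2-dim hidden states, identity SAE encoder.
h = np.array([[1.0, 0.0],
              [0.0, 2.0]])
W_enc = np.eye(2)
b_enc = np.zeros(2)

# Feature 0 pre-activations are [1.0, 0.0]; with tau=0.5 only the first
# token is penalized: mean(max(0, [0.5, -0.5])) = 0.25.
penalty = language_feature_penalty(h, W_enc, b_enc, feature_idx=0, tau=0.5)
total = sasft_style_loss(2.0, h, W_enc, b_enc, feature_idx=0, tau=0.5, lam=1.0)
print(penalty, total)  # 0.25 2.25
```

In an actual fine-tuning loop the same quantities would be computed on the framework's tensors so the penalty's gradient flows back into the model; the numpy version only illustrates the arithmetic of the objective.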

Boyi Deng, Yu Wan, Baosong Yang, Fei Huang, Wenjie Wang, Fuli Feng • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Commonsense Reasoning | HellaSwag | Accuracy | 45.68 | 1891 |
| Instruction Following | IFEval | IFEval Accuracy | 35.27 | 625 |
| Code Generation | HumanEval (test) | -- | -- | 506 |
| Multitask Language Understanding | MMLU | Accuracy | 50.28 | 413 |
| General Knowledge | MMLU | MMLU General Knowledge Accuracy | 51.36 | 234 |
| Mathematical Reasoning | MGSM | Accuracy | 63.92 | 194 |
| Logical Reasoning | LogiQA | LogiQA Accuracy | 44.5 | 181 |
| Logical Reasoning | LogiQA (test) | Accuracy | 42.62 | 151 |
| Massive Multitask Language Understanding | MMLU | Accuracy | 52.88 | 117 |
| Language Understanding | MMLU | MMLU Score | 49.6 | 98 |
Showing 10 of 20 rows
