Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning

About

The surge in Large Language Models (LLMs) has revolutionized natural language processing, but fine-tuning them for specific tasks often encounters challenges in balancing performance and preserving general instruction-following abilities. In this paper, we posit that the distribution gap between task datasets and the LLMs serves as the primary underlying cause. To address the problem, we introduce Self-Distillation Fine-Tuning (SDFT), a novel approach that bridges the distribution gap by guiding fine-tuning with a distilled dataset generated by the model itself to match its original distribution. Experimental results on the Llama-2-chat model across various benchmarks demonstrate that SDFT effectively mitigates catastrophic forgetting while achieving comparable or superior performance on downstream tasks compared to vanilla fine-tuning. Moreover, SDFT demonstrates the potential to maintain the helpfulness and safety alignment of LLMs. Our code is available at https://github.com/sail-sg/sdft.
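The core idea above is to have the seed model rewrite each task answer in its own words before fine-tuning on it. A minimal sketch of that data-generation loop is below; the template wording, function names, and the `generate` callable are illustrative assumptions, not the repository's exact code.

```python
# Self-distillation sketch: replace each task response with one produced by
# the seed model itself, so fine-tuning targets stay close to the model's
# own output distribution. The prompt template is a hypothetical example.

DISTILL_TEMPLATE = (
    "Below is an instruction and a reference answer. "
    "Rewrite the answer in your own words while keeping it correct.\n\n"
    "Instruction: {instruction}\nReference answer: {response}\n"
)

def build_distilled_dataset(task_data, generate):
    """task_data: list of {"instruction": ..., "response": ...} examples.
    generate: a callable that maps a prompt string to the seed LM's output."""
    distilled = []
    for example in task_data:
        prompt = DISTILL_TEMPLATE.format(**example)
        distilled_response = generate(prompt)  # seed-model generation step
        distilled.append({"instruction": example["instruction"],
                          "response": distilled_response})
    return distilled

# Usage with a stand-in generator; a real setup would call the seed LLM here.
toy_data = [{"instruction": "Add 2 and 3.", "response": "5"}]
distilled = build_distilled_dataset(
    toy_data, lambda p: "The sum of 2 and 3 is 5.")
print(distilled)
```

Vanilla fine-tuning would then proceed unchanged, simply using the distilled responses as supervision targets instead of the originals.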

Zhaorui Yang, Tianyu Pang, Haozhe Feng, Han Wang, Wei Chen, Minfeng Zhu, Qian Liu• 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy | 34.4 | 1362 |
| Code Generation | HumanEval | Pass@1 | 18.3 | 1036 |
| Question Answering | ARC Challenge | Accuracy | 82.42 | 906 |
| Mathematical Reasoning | MATH | Accuracy | 7.34 | 882 |
| Multi-task Language Understanding | MMLU | Accuracy | 57.67 | 876 |
| Language Understanding | MMLU | Accuracy | 84.13 | 825 |
| Reasoning | BBH | Accuracy | 71.01 | 672 |
| Mathematical Reasoning | SVAMP | Accuracy | 18.2 | 403 |
| Question Answering | ARC Easy | Normalized Acc | 90.91 | 389 |
| Reading Comprehension | RACE-high | Accuracy | 74.59 | 295 |

Showing 10 of 29 rows.
