Diversity in Large Language Models under Supervised Fine-Tuning

About

Supervised Fine-Tuning (SFT) is essential for aligning Large Language Models (LLMs) with user intent, yet it is believed to suppress generative diversity. Although this reduction is frequently referenced, it has rarely been tested formally. Multiple prior methods address LLM expressiveness on its own, and their differing perspectives suggest that deeper investigation could yield further improvements. In this study, we attribute the decline to two primary drivers: the neglect of low-frequency patterns within fine-tuning datasets and the forgetting of preexisting knowledge. Motivated by our theoretical analysis, we develop Tempered Focal (TOFU) loss, a novel objective that addresses both challenges simultaneously. Our extensive evaluation confirms at scale that generation breadth narrows after SFT and supports our hypothesis about its causes. Across multiple models and benchmarks, we demonstrate that TOFU enhances output diversity while preserving high response quality, offering a principled approach to SFT.
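
As a rough illustration of the kind of objective the name "Tempered Focal" suggests, the sketch below applies the standard focal loss to a temperature-scaled softmax for a single token. The abstract does not state the TOFU formula, so the function names, the `gamma` and `temperature` parameters, and the way the two ideas are combined here are assumptions made for illustration and should not be read as the authors' definition.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_token_loss(logits, target):
    """Plain cross-entropy for one token: -log p(target)."""
    return -math.log(softmax(logits)[target])


def focal_token_loss(logits, target, gamma=2.0, temperature=1.0):
    """Focal-style loss for one token: -(1 - p)^gamma * log(p).

    Assumption: p comes from a temperature-scaled softmax. This is a
    generic focal loss with a temperature, not the paper's TOFU objective.
    """
    p = softmax(logits, temperature)[target]
    return -((1.0 - p) ** gamma) * math.log(p)


if __name__ == "__main__":
    confident = [5.0, 0.0, 0.0]   # model already favors the target (index 0)
    uncertain = [0.0, 0.0, 0.0]   # model is undecided
    for name, logits in [("confident", confident), ("uncertain", uncertain)]:
        ce = cross_entropy_token_loss(logits, 0)
        fl = focal_token_loss(logits, 0)
        print(name, "cross-entropy:", ce, "focal:", fl)
```

With `gamma=0` and `temperature=1` the focal term reduces to ordinary cross-entropy. Larger `gamma` shrinks the loss on tokens the model already predicts confidently, so the remaining gradient signal comes mostly from harder, less frequent targets.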

Roman Klypa, Oleksandr Cherednichenko • 2026

Related benchmarks

Task	Dataset	Metric	Value	(unlabeled)
Multi-task Language Understanding	MMLU	MMLU Accuracy	75.2	456
Reasoning	ARC	Accuracy	89	269
Safety Evaluation	HarmBench	ASR	79.2	153
Text Generation	NOVELTYBENCH	Diversity	9.5	81
Safety Evaluation	Malicious Instruct	ASR	52.4	44
Creative Writing	Alpaca SFT Short Stories	Self-BLEU (Diversity)	10.8	36
Instruction Following	Alpaca SFT Small Prompts	Self-BLEU	29.3	36
Math Reasoning	MATH500	Coverage	86	9
Math Reasoning	Minerva	Coverage	39.3	9
Math Reasoning	GSM8K	Coverage	88	9
