PRISM: Preference-Aware Influence Function Based Data Selection Method for Efficient Fine-Tuning

About

As LLMs continue to scale up, training efficiency depends increasingly on effective data utilization. Data selection addresses this by allocating the limited training budget to high-value examples that best facilitate the model's target behavior. Most existing approaches define the target behavior via a set of target examples and score candidate training data by their estimated influence on these examples. However, such methods treat all target examples as equally important, ignoring the varying relevance of individual examples to model optimization. Specifically, target examples that align closely with the model's inherent behavior deliver stronger supervisory signals, whereas discrepant examples yield only weak and ineffective local guidance. We propose PRISM, a Preference-aware Influence function based Data Selection Method. It leverages model preference to assign weights to target examples and builds a preference-aware target direction. PRISM evaluates candidate training samples according to their influence on this direction, and prioritizes the data budget for samples that effectively drive the model toward the expected target behavior. Theoretical analysis shows that the preference-weighted construction yields a better first-order gradient direction for boosting target preference than uniform aggregation. Extensive experiments covering diverse model architectures and parameter scales demonstrate that PRISM achieves better performance in efficient fine-tuning and safety-aligned supervised fine-tuning rectification. The results indicate that accurate characterization of target behavior is central to cost-effective data selection.
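The selection idea described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that per-example gradients are already available as NumPy arrays, that model preference is summarized by one scalar score per target example (for instance a log-likelihood), that weights come from a softmax over those scores, and that influence is approximated to first order by a gradient dot product. The function names `preference_weights`, `target_direction`, and `select_top_k`, and the `temperature` parameter, are hypothetical.

```python
import numpy as np

def preference_weights(pref_scores, temperature=1.0):
    """Softmax over per-target preference scores.

    Assumption: a higher score means the model prefers that target more.
    """
    z = np.asarray(pref_scores, dtype=float) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def target_direction(target_grads, weights):
    """Preference-weighted sum of target gradients, shape (d,)."""
    return weights @ target_grads

def select_top_k(candidate_grads, direction, k):
    """Score candidates by gradient dot product and keep the k highest."""
    scores = candidate_grads @ direction
    order = np.argsort(-scores, kind="stable")
    return order[:k], scores

# Toy data (made up for illustration): two targets, three candidates, d = 2.
target_grads = np.array([[1.0, 0.0], [0.0, 1.0]])
pref_scores = np.array([2.0, 0.0])  # the model prefers the first target
candidate_grads = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])

w = preference_weights(pref_scores)
direction = target_direction(target_grads, w)
chosen, scores = select_top_k(candidate_grads, direction, k=1)
```

With uniform weights the first two candidates would tie in this toy setup. The preference weighting breaks the tie in favor of the candidate aligned with the preferred target.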

Qihao Lin, Guanxu Chen, Dongrui Liu, Jing Shao • 2026

Related benchmarks

Task	Dataset	Result
Dishonesty Evaluation	Mistake math (test)	Benchmark Dishonesty: 44.16	96
Multi-task Language Understanding	MMLU	MMLU Score: 54.38	86
Data Ranking	Mistake math	AUROC: 0.79	84
Dishonesty Evaluation	Insecure code (test)	Benchmark Dishonesty: 48.91	32
Dishonesty Evaluation	Mistake medical (test)	Dishonesty Accuracy: 55.84	32
Data Ranking	Insecure code	AUROC: 0.71	28
Data Ranking	Mistake medical	AUROC: 63	28
Mathematical Problem Solving	MATH 500	MATH-500 Accuracy: 6.33	22
Hard Reasoning Tasks	BBH	--	12
