
Skill-Aware Data Selection and Fine-Tuning for Data-Efficient Reasoning Distillation

About

Large reasoning models such as DeepSeek-R1 and their distilled variants achieve strong performance on complex reasoning tasks. Yet, distilling these models often demands large-scale data for supervised fine-tuning (SFT), motivating the pursuit of data-efficient training methods. To address this, we propose a skill-centric distillation framework that efficiently transfers reasoning ability to weaker models with two components: (1) Skill-based data selection, which prioritizes examples targeting the student model's weaker skills, and (2) Skill-aware fine-tuning, which encourages explicit skill decomposition during problem solving. With only 1,000 training examples selected from a 100K teacher-generated corpus, our method surpasses random SFT baselines by +1.6% on Qwen3-4B and +1.4% on Qwen3-8B across five mathematical reasoning benchmarks. Further analysis confirms that these gains concentrate on skills emphasized during training, highlighting the effectiveness of skill-centric training for efficient reasoning distillation.

Lechen Zhang, Yunxiang Zhang, Wei Hu, Lu Wang • 2026
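
The abstract states the selection criterion (prioritize examples that hit the student's weaker skills) but not its mechanics. Below is a minimal sketch of what such selection could look like, assuming each teacher-generated example is tagged with the skills it exercises and that the student's per-skill accuracy has already been estimated on a probe set. The function `select_by_weak_skills`, the record layout, and the neutral prior for unseen skills are all illustrative assumptions, not the paper's implementation.

```python
def select_by_weak_skills(corpus, student_skill_acc, k=1000):
    """Rank teacher-generated examples by how much they exercise
    the student's weakest skills, then keep the top-k.

    corpus: list of dicts, assumed to carry a "skills" tag list, e.g.
        {"problem": ..., "solution": ..., "skills": ["algebra", ...]}
    student_skill_acc: dict mapping skill name -> student accuracy
        in [0, 1], measured on a held-out probe set (assumed given).
    """
    def weakness(skill):
        # Lower student accuracy -> higher priority; skills never
        # probed default to 0.5, a neutral prior (an assumption).
        return 1.0 - student_skill_acc.get(skill, 0.5)

    def score(example):
        skills = example.get("skills", [])
        if not skills:
            return 0.0
        # Average weakness over the skills the example exercises.
        return sum(weakness(s) for s in skills) / len(skills)

    ranked = sorted(corpus, key=score, reverse=True)
    return ranked[:k]


# Hypothetical usage: the tags and accuracies are invented for illustration.
corpus = [
    {"problem": "p1", "solution": "s1", "skills": ["modular_arithmetic"]},
    {"problem": "p2", "solution": "s2", "skills": ["geometry", "algebra"]},
]
student_skill_acc = {"modular_arithmetic": 0.31, "geometry": 0.78, "algebra": 0.66}
subset = select_by_weak_skills(corpus, student_skill_acc, k=1)
print(subset[0]["problem"])  # -> "p1": it targets the weakest skill
```

Under this reading, the 1,000-example budget corresponds to `k=1000` applied to the 100K teacher-generated corpus; how the paper actually tags skills and scores weakness may differ.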

Related benchmarks

| Task | Dataset | Accuracy (%) | Rank |
| --- | --- | --- | --- |
| Mathematical Reasoning | AIME 2025 | 50.8 | 227 |
| Mathematical Reasoning | AMC 23 | 91.9 | 198 |
| Mathematical Reasoning | MATH L5 | 85.3 | 86 |
| Mathematical Reasoning | AIME 2024 | 64.6 | 25 |
