
Skill-Aware Data Selection and Fine-Tuning for Data-Efficient Reasoning Distillation

About

Large reasoning models such as DeepSeek-R1 and their distilled variants achieve strong performance on complex reasoning tasks. Yet, distilling these models often demands large-scale data for supervised fine-tuning (SFT), motivating the pursuit of data-efficient training methods. To address this, we propose a skill-centric distillation framework that efficiently transfers reasoning ability to weaker models with two components: (1) Skill-based data selection, which prioritizes examples targeting the student model's weaker skills, and (2) Skill-aware fine-tuning, which encourages explicit skill decomposition during problem solving. With only 1,000 training examples selected from a 100K teacher-generated corpus, our method surpasses random SFT baselines by +1.6% on Qwen3-4B and +1.4% on Qwen3-8B across five mathematical reasoning benchmarks. Further analysis confirms that these gains concentrate on skills emphasized during training, highlighting the effectiveness of skill-centric training for efficient reasoning distillation.

Lechen Zhang, Yunxiang Zhang, Wei Hu, Lu Wang • 2026
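
The abstract states the selection criterion (prioritize examples that hit the student's weaker skills) but not its mechanics. Below is a minimal sketch of what such selection could look like, assuming each teacher-generated example is tagged with the skills it exercises and that the student's per-skill accuracy has already been estimated on a probe set. The function `select_by_weak_skills`, the record layout, and the neutral prior for unseen skills are all illustrative assumptions, not the paper's implementation.

```python
def select_by_weak_skills(corpus, student_skill_acc, k=1000):
    """Rank teacher-generated examples by how much they exercise
    the student's weakest skills, then keep the top-k.

    corpus: list of dicts, assumed to carry a "skills" tag list, e.g.
        {"problem": ..., "solution": ..., "skills": ["algebra", ...]}
    student_skill_acc: dict mapping skill name -> student accuracy
        in [0, 1], measured on a held-out probe set (assumed given).
    """
    def weakness(skill):
        # Lower student accuracy -> higher priority; skills never
        # probed default to 0.5, a neutral prior (an assumption).
        return 1.0 - student_skill_acc.get(skill, 0.5)

    def score(example):
        skills = example.get("skills", [])
        if not skills:
            return 0.0
        # Average weakness over the skills the example exercises.
        return sum(weakness(s) for s in skills) / len(skills)

    ranked = sorted(corpus, key=score, reverse=True)
    return ranked[:k]


# Hypothetical usage: the tags and accuracies are invented for illustration.
corpus = [
    {"problem": "p1", "solution": "s1", "skills": ["modular_arithmetic"]},
    {"problem": "p2", "solution": "s2", "skills": ["geometry", "algebra"]},
]
student_skill_acc = {"modular_arithmetic": 0.31, "geometry": 0.78, "algebra": 0.66}
subset = select_by_weak_skills(corpus, student_skill_acc, k=1)
print(subset[0]["problem"])  # -> "p1": it targets the weakest skill
```

Under this reading, the 1,000-example budget corresponds to `k=1000` applied to the 100K teacher-generated corpus; how the paper actually tags skills and scores weakness may differ.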

Related benchmarks

| Task | Dataset | Accuracy (%) | Rank |
| --- | --- | --- | --- |
| Mathematical Reasoning | AIME 2025 | 50.8 | 227 |
| Mathematical Reasoning | AMC 23 | 91.9 | 198 |
| Mathematical Reasoning | MATH L5 | 85.3 | 86 |
| Mathematical Reasoning | AIME 2024 | 64.6 | 25 |
