
Distilling LLM Reasoning into Graph of Concept Predictors

About

Deploying Large Language Models (LLMs) for discriminative workloads is often limited by inference latency, compute, and API costs at scale. Active distillation reduces these costs by querying an LLM oracle to train compact discriminative students, but most pipelines distill only final labels, discarding intermediate reasoning signals and offering limited diagnostics of what reasoning is missing and where errors arise. We propose Graph of Concept Predictors (GCP), a reasoning-aware active distillation framework that externalizes the teacher's decision process as a directed acyclic graph and mirrors it with modular concept predictors in the student. GCP enhances sample efficiency through a graph-aware acquisition strategy that targets uncertainty and disagreement at critical reasoning nodes. Additionally, it improves training stability and efficiency by performing targeted sub-module retraining, which attributes downstream loss to specific concept predictors and updates only the most influential modules. Experiments on eight NLP classification benchmarks demonstrate that GCP enhances performance under limited annotation budgets while yielding more interpretable and controllable training dynamics. Code is available at: https://github.com/Ziyang-Yu/GCP.
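To make the abstract's idea concrete, here is a minimal, hypothetical sketch of a concept-predictor DAG with a graph-aware acquisition score. All names (`ConceptNode`, `acquisition_score`, the toy sentiment/sarcasm/label graph, the criticality weights) are illustrative assumptions, not the paper's actual implementation; a toy hash-based scorer stands in for trained concept predictors.

```python
import math
from dataclasses import dataclass, field

@dataclass
class ConceptNode:
    """One node of the student's concept-predictor DAG (illustrative)."""
    name: str
    parents: list = field(default_factory=list)

    def predict_proba(self, x):
        # Toy deterministic stand-in for a trained concept predictor;
        # returns a two-class probability distribution for input x.
        h = (hash((self.name, x)) % 100) / 100.0
        return [h, 1.0 - h]

def entropy(p):
    """Shannon entropy of a probability distribution (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0)

def acquisition_score(graph, x, critical_weights):
    """Graph-aware acquisition (assumed form): sum per-node predictive
    entropy, weighted by how critical each reasoning node is."""
    return sum(critical_weights.get(n.name, 1.0) * entropy(n.predict_proba(x))
               for n in graph)

# Hypothetical DAG mirroring a teacher's reasoning chain:
# sentiment -> sarcasm -> final label.
graph = [
    ConceptNode("sentiment"),
    ConceptNode("sarcasm", parents=["sentiment"]),
    ConceptNode("label", parents=["sentiment", "sarcasm"]),
]
pool = ["example_1", "example_2", "example_3"]   # unlabeled candidates
weights = {"label": 2.0}  # assumed: the final decision node is most critical
best = max(pool, key=lambda x: acquisition_score(graph, x, weights))
```

Under this sketch, the example with the highest weighted node-level uncertainty would be sent to the LLM oracle next, concentrating the annotation budget on inputs where critical reasoning nodes disagree or are unsure.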

Ziyang Yu, Liang Zhao • 2026

Related benchmarks

Task                | Dataset    | Metric   | Result | Rank
Text Classification | AG-News    | Accuracy | 84.73  | 248
Text Classification | IMDB       | Accuracy | 88.94  | 107
Text Classification | AMAZON     | Accuracy | 91.62  | 37
Text Classification | MNLI       | Accuracy | 62.47  | 32
Text Classification | Yelp       | Accuracy | 94.21  | 21
Text Classification | SemEval    | --       | --     | 17
Text Classification | GoEmotions | Accuracy | 27.89  | 9
Text Classification | MIMIC-III  | Accuracy | 86.32  | 9
