
AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent

About

While large language model (LLM) multi-agent systems achieve superior reasoning performance through iterative debate, practical deployment is limited by their high computational cost and error propagation. This paper proposes AgentArk, a novel framework that distills multi-agent dynamics into the weights of a single model, effectively transforming explicit test-time interactions into implicit model capabilities. This equips a single agent with the intelligence of multi-agent systems while remaining computationally efficient. Specifically, we investigate three hierarchical distillation strategies across various models, tasks, scales, and scenarios: reasoning-enhanced fine-tuning, trajectory-based augmentation, and process-aware distillation. By shifting the burden of computation from inference to training, the distilled models preserve the efficiency of a single agent while exhibiting the strong reasoning and self-correction of multiple agents. They further demonstrate enhanced robustness and generalization across diverse reasoning tasks. We hope this work can shed light on future research on efficient and robust multi-agent development. Our code is at https://github.com/AIFrontierLab/AgentArk.
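The trajectory-based augmentation strategy mentioned above can be illustrated with a minimal sketch: a multi-agent debate trajectory is flattened into a single supervised fine-tuning example, so a single model can learn to imitate the interaction implicitly. The function name, record layout, and formatting below are illustrative assumptions, not the paper's actual pipeline.

```python
# Hypothetical sketch of trajectory-based augmentation: flatten a
# multi-agent debate into one (prompt, completion) fine-tuning record.
# All names and the record layout are assumptions for illustration.

def debate_to_sft_example(question, rounds, final_answer):
    """Turn a debate trajectory into a single training example.

    `rounds` is a list of per-round dicts mapping agent name -> utterance.
    The completion interleaves the debate as a reasoning trace, ending
    with the consensus answer, so explicit test-time interaction becomes
    an implicit capability learned at training time.
    """
    trace_lines = []
    for i, round_msgs in enumerate(rounds, start=1):
        for agent, msg in round_msgs.items():
            trace_lines.append(f"[Round {i} | {agent}] {msg}")
    completion = "\n".join(trace_lines) + f"\nFinal answer: {final_answer}"
    return {"prompt": question, "completion": completion}

example = debate_to_sft_example(
    "What is 12 * 7?",
    [{"Agent A": "12 * 7 = 84.", "Agent B": "I agree: 84."}],
    "84",
)
```

A dataset of such records could then be fed to any standard supervised fine-tuning loop, shifting the computational burden from inference to training as the abstract describes.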

Yinyi Luo, Yiqiao Jin, Weichen Yu, Mengqi Zhang, Srijan Kumar, Xiaoxiao Li, Weijie Xu, Xin Chen, Jindong Wang • 2026

Related benchmarks

Task                        Dataset            Result  Rank
Mathematical Reasoning      GSM8K (test)       --      797
Mathematical Reasoning      MATH               --      643
Mathematical Reasoning      MATH               --      535
Mathematical Reasoning      GSM8K              --      358
Medical Question Answering  MedMCQA            --      253
Mathematical Reasoning      GSM8K              --      171
Medical Question Answering  MedMCQA (test)     --      134
Mathematical Reasoning      MetaMathQA         --      54
Mathematical Reasoning      MetaMathQA (test)  --      26
Mathematical Reasoning      MATH (test)        --      18

Other info

GitHub
