Weak-Link Optimization for Multi-Agent Reasoning and Collaboration

About

LLM-driven multi-agent frameworks address complex reasoning tasks through multi-role collaboration. However, existing approaches often suffer from reasoning instability, where individual agent errors are amplified through collaboration, undermining overall performance. Current research mainly focuses on enhancing high-capability agents or suppressing unreliable outputs to improve framework effectiveness, while systematic identification and reinforcement of performance-limiting agents receive less attention. To address this gap, we propose WORC, a Weak-link Optimization framework for multi-agent Reasoning and Collaboration, grounded in the weak-link principle. WORC follows a two-stage workflow. In the weak agent localization stage, task features are constructed, and a meta-learning-based weight predictor trained on optimal configurations identified by swarm intelligence algorithms (SIAs) enables zero-shot mapping from these features to agent performance weights, where the agent with the lowest predicted weight is identified as the weak agent. In the weak-link optimization stage, an uncertainty-driven allocation strategy assigns additional reasoning budgets to weak agents, with lower predicted weights leading to larger repeated-sampling quotas to compensate for reliability deficiencies. Experimental results show that WORC achieves an average accuracy of 82.2% on reasoning benchmarks while improving framework stability and cross-architecture generalization, suggesting that compensating for weak links, rather than reinforcing strengths alone, enhances the robustness of multi-agent systems.
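
The two stages described above can be illustrated with a minimal sketch. The abstract does not state the exact allocation rule, so the linear rule below (extra samples proportional to each agent's gap from the highest predicted weight) is an assumption made for illustration, not the authors' method. The function names `weakest_agent` and `allocate_sampling_budget`, the `base_samples` parameter, and the agent names are all hypothetical; the predicted weights are taken as given rather than produced by a trained predictor.

```python
def weakest_agent(weights):
    """Return the agent with the lowest predicted weight (the weak link)."""
    return min(weights, key=weights.get)


def allocate_sampling_budget(weights, total_extra_samples, base_samples=1):
    """Split an extra repeated-sampling budget across agents.

    Assumed rule (not from the paper): each agent's share of the extra
    budget is proportional to its deficit, i.e. the gap between its
    predicted weight and the highest predicted weight.
    """
    if not weights:
        raise ValueError("weights must not be empty")
    if total_extra_samples < 0:
        raise ValueError("total_extra_samples must be non-negative")

    max_weight = max(weights.values())
    deficits = {agent: max_weight - w for agent, w in weights.items()}
    total_deficit = sum(deficits.values())

    # Every agent samples at least base_samples times.
    quotas = {agent: base_samples for agent in weights}
    if total_deficit == 0:
        # All agents look equally reliable; nothing to compensate.
        return quotas

    # Fractional shares, then largest-remainder rounding so the integer
    # quotas add up to exactly total_extra_samples.
    raw = {a: total_extra_samples * d / total_deficit for a, d in deficits.items()}
    for agent, share in raw.items():
        quotas[agent] += int(share)
    leftover = max(0, total_extra_samples - sum(int(s) for s in raw.values()))
    by_remainder = sorted(raw, key=lambda a: raw[a] - int(raw[a]), reverse=True)
    for agent in by_remainder[:leftover]:
        quotas[agent] += 1
    return quotas


# Hypothetical predicted weights for three agent roles.
predicted = {"planner": 0.75, "solver": 0.5, "verifier": 0.25}
print(weakest_agent(predicted))
print(allocate_sampling_budget(predicted, total_extra_samples=6))
```

In this sketch the agent with the lowest weight receives the largest quota, matching the qualitative behavior the abstract describes; any monotone decreasing mapping from weight to quota would serve the same purpose.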

Haoyu Bian, Chaoning Zhang, Jiaquan Zhang, Xingyao Li, Yuanfang Guo, Wei Dong, Yang Yang • 2026

Related benchmarks

Task	Dataset	Result
Reasoning	BBH	Accuracy 86.9	770
General Knowledge Reasoning	MMLU CF	Accuracy 71.7	64
Mathematical Reasoning	MATH	Exact Match Accuracy 87	39
Multi-hop Question Answering	HotpotQA	F1 Score 83.2	18
Mathematical Reasoning	GSM8K	Exact Match Accuracy 95.9	9
Long-context Reasoning	LongBench	F1 Score 68.4	9

