
AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning

About

While Multi-Agent Systems (MAS) excel in complex reasoning, they suffer from the cascading impact of erroneous information from individual agents. Current solutions often resort to rigid structural engineering or expensive fine-tuning, limiting their adaptability. We propose AgentDropoutV2 (ADv2), a test-time rectify-or-reject pruning framework that dynamically optimizes MAS information flow. Acting as an active firewall, ADv2 intercepts agent outputs and employs a retrieval-augmented rectifier to iteratively correct errors. This rectification is guided by an indicator pool, which is constructed offline by distilling error patterns from historical MAS failure trajectories. Irreparable outputs are subsequently pruned to prevent error propagation. Empirical results demonstrate that ADv2 significantly boosts performance on both fixed and dynamic MAS frameworks, achieving average accuracy gains of 6.39 and 2.28 percentage points on extensive math and code benchmarks, respectively. Furthermore, ADv2 exhibits remarkable adaptivity, dynamically modulating rectification efforts based on task difficulty to resolve a wide spectrum of error patterns. Our code is released at https://github.com/TonySY2/AgentDropoutV2.
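The rectify-or-reject loop described above can be illustrated with a minimal sketch. This is not the released implementation; the names `Indicator`, `retrieve_indicators`, `rectify_or_reject`, and `toy_rectifier` are hypothetical, the indicator pool is hand-written rather than distilled from failure trajectories, and the rectifier is a toy function standing in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Indicator:
    """Hypothetical error pattern; in the paper these are distilled offline."""
    name: str
    detect: Callable[[str], bool]  # True if the output exhibits this pattern
    hint: str                      # feedback passed to the rectifier


def retrieve_indicators(output: str, pool: List[Indicator]) -> List[Indicator]:
    """Stand-in for retrieval: return the indicators this output triggers."""
    return [ind for ind in pool if ind.detect(output)]


def rectify_or_reject(
    output: str,
    pool: List[Indicator],
    rectify: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return a clean (possibly rectified) output, or None if it is pruned."""
    for _ in range(max_rounds):
        triggered = retrieve_indicators(output, pool)
        if not triggered:
            return output  # passes the firewall
        output = rectify(output, [ind.hint for ind in triggered])
    # Check the result of the last rectification round before giving up.
    return None if retrieve_indicators(output, pool) else output


# Assumed toy pool: two hand-written patterns for illustration only.
POOL = [
    Indicator("missing_answer", lambda o: "Answer:" not in o,
              "State the final answer explicitly."),
    Indicator("unresolved_step", lambda o: "???" in o,
              "Fill in the skipped step."),
]


def toy_rectifier(output: str, hints: List[str]) -> str:
    """Toy stand-in for an LLM rectifier; it can only add a missing answer."""
    if any("final answer" in h for h in hints):
        return output + " Answer: 4"
    return output


agent_outputs = ["Answer: 4", "2 + 2 = 4.", "2 + ??? = 4. Answer: 4"]
results = [rectify_or_reject(o, POOL, toy_rectifier) for o in agent_outputs]
# Only outputs that survive are forwarded to downstream agents.
forwarded = [r for r in results if r is not None]
```

In this sketch the first output passes unchanged, the second is repaired in one round, and the third is pruned because the toy rectifier cannot resolve its triggered indicator within `max_rounds`.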

Yutong Wang, Siyuan Xiong, Xuebo Liu, Wenkang Zhou, Liang Ding, Miao Zhang, Min Zhang • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Mathematical Reasoning | AIME 2024 | Accuracy 43.33 | 220 |
| Mathematical Reasoning | AIME 2025 | Accuracy 26.67 | 214 |
| Mathematical Reasoning | AMC 23 | Accuracy 70 | 113 |
| Mathematical Reasoning | AQUA | Accuracy 87.01 | 45 |
| Mathematical Reasoning | AMC23 | Accuracy 77.5 | 38 |
| Code Generation | CodeContests | Accuracy 7.27 | 30 |
| Mathematical Reasoning | OlymMATH Easy | Pass@1 32.5 | 16 |
| Mathematical Reasoning | OlymMATH Hard | Accuracy (OlymMATH Hard) 23.75 | 13 |
| Code Generation | LiveCodeBench | Accuracy 32 | 8 |
| Mathematical Reasoning | OlymB | Accuracy 53.93 | 3 |
Showing 10 of 12 rows
