
Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning

About

Multiagent collaboration has emerged as a promising framework for enhancing the reasoning capabilities of large language models (LLMs). While it improves reasoning, the approach introduces substantial computational overhead from iterative agent interactions. Moreover, engaging in unnecessary debate increases the risk of generating erroneous responses. To address these challenges, we propose Debate Only When Necessary (DOWN), an adaptive multiagent debate framework that activates debate selectively, based on the confidence score of each agent's initial response. Debate is triggered only for queries that require further deliberation; during debate, agents refine their outputs by referencing peer responses and their associated confidence scores. Benchmark evaluations show that DOWN improves efficiency by up to six times while matching or exceeding the performance of existing methods. Further analysis indicates that DOWN effectively mitigates the risk of error propagation arising from unnecessary debate. These findings demonstrate the effectiveness of our approach in delivering high-performing LLM systems at a lower computational cost.
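The gating mechanism the abstract describes can be sketched in a few lines: answer once, skip the debate entirely when every agent is already confident, and otherwise run refinement rounds in which agents see peer answers and confidence scores. This is a minimal illustrative sketch, not the paper's implementation; the `Response` type, the `threshold=0.9` gate, and the stub agents are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Response:
    agent_id: int
    answer: str
    confidence: float  # assumed normalized to [0, 1], e.g. an answer probability

# An agent maps (query, peer responses) -> its own response.
Agent = Callable[[str, List["Response"]], "Response"]

def debate_only_when_necessary(
    agents: List[Agent],
    query: str,
    threshold: float = 0.9,  # hypothetical confidence gate, not the paper's value
    max_rounds: int = 2,
) -> Response:
    """Confidence-gated multiagent debate, sketched after the DOWN abstract."""
    # Round 0: each agent answers independently, with a confidence score.
    responses = [agent(query, []) for agent in agents]

    # Gate: if every agent clears the threshold, skip the debate entirely.
    if all(r.confidence >= threshold for r in responses):
        return max(responses, key=lambda r: r.confidence)

    # Debate: agents revise their answers, seeing peers' answers and confidences.
    for _ in range(max_rounds):
        responses = [agent(query, responses) for agent in agents]
    return max(responses, key=lambda r: r.confidence)

# --- Toy demo with stub agents (no real LLM calls) ---
debate_calls = {"n": 0}

def make_stub_agent(agent_id: int, confidence: float) -> Agent:
    def agent(query: str, peers: List[Response]) -> Response:
        if peers:  # a non-empty peer list means this is a debate round
            debate_calls["n"] += 1
        return Response(agent_id, f"answer from agent {agent_id}", confidence)
    return agent

confident = [make_stub_agent(i, 0.95) for i in range(3)]
result = debate_only_when_necessary(confident, "2 + 2 = ?")
# All three stub agents clear the 0.9 gate, so no debate rounds run.
```

The efficiency claim follows directly from the gate: on queries where initial confidence is high, the cost is one response per agent instead of one per agent per debate round.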

Sugyeong Eo, Hyeonseok Moon, Evelyn Hayoon Zi, Chanjun Park, Heuiseok Lim • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Algebraic Reasoning | AQUA | Accuracy | 84.65 | 61 |
| Graduate-Level Reasoning | GPQA | Accuracy | 51.01 | 41 |
| Multitask Language Understanding | MMLU | Accuracy | 84.06 | 34 |
| Aggregate Reasoning Evaluation | Multi-dataset Reasoning Suite | Average Accuracy | 80.09 | 12 |
| Commonsense Reasoning | CommonQA | Accuracy | 84.11 | 12 |
