
GroupDebate: Enhancing the Efficiency of Multi-Agent Debate Using Group Discussion

About

In recent years, Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse NLP tasks. Extensive research has explored how to enhance their logical reasoning abilities, including Chain-of-Thought, Chain-of-Thought with Self-Consistency, Tree-of-Thoughts, and multi-agent debate. In multi-agent debates, significant performance improvements can be achieved by increasing the number of agents and debate rounds. However, this escalation drastically raises the token cost of debates, limiting the scalability of the multi-agent debate technique. To better harness the advantages of multi-agent debates on logical reasoning tasks, this paper proposes a method that significantly reduces token cost in multi-agent debates: all agents are divided into multiple debate groups, agents debate within their respective groups, and interim debate results are shared between groups. Comparative experiments across multiple datasets demonstrate that this method reduces total tokens by up to 51.7% during debates while potentially improving accuracy by as much as 25%. Our method significantly enhances both the performance and the efficiency of interactions in multi-agent debate.
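The grouped-debate scheme described above can be sketched in code. The following is a minimal, hypothetical illustration, not the authors' implementation: real agents would be LLM calls, whereas here `agent_answer` is a stub, and the group sizes, round count, and consensus rule are assumptions made for the example.

```python
import random

def agent_answer(agent_id, question, shared_summaries):
    """Stand-in for an LLM call. A real agent would reason over the
    question plus the summaries shared by other groups; this stub just
    leans toward the consensus of the shared summaries, if any."""
    if shared_summaries:
        return max(set(shared_summaries), key=shared_summaries.count)
    return random.choice(["A", "B"])

def group_debate(question, num_agents=6, num_groups=3, num_rounds=2):
    # Divide all agents into debate groups (round-robin assignment).
    groups = [list(range(g, num_agents, num_groups)) for g in range(num_groups)]
    summaries = []  # interim debate results shared between groups
    for _ in range(num_rounds):
        new_summaries = []
        for group in groups:
            # Agents debate only within their own group; instead of every
            # agent's full transcript, they see the other groups' compact
            # summaries -- this is what cuts the token cost.
            answers = [agent_answer(a, question, summaries) for a in group]
            group_consensus = max(set(answers), key=answers.count)
            new_summaries.append(group_consensus)
        summaries = new_summaries
    # Final answer: majority vote over the group-level summaries.
    return max(set(summaries), key=summaries.count)
```

The key design point is that inter-group communication is limited to short interim summaries rather than full debate histories, so the context fed to each agent grows with the number of groups, not the total number of agents.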

Tongxuan Liu, Xingyu Wang, Weizhe Huang, Wenjiang Xu, Yuting Zeng, Lei Jiang, Hailong Yang, Jing Li • 2024

Related benchmarks

Task                                | Dataset       | Metric                          | Result  | Rank
------------------------------------|---------------|---------------------------------|---------|-----
Question Answering                  | ARC Challenge | --                              | --      | 749
Mathematical Reasoning              | MATH          | Accuracy                        | 50.3    | 643
Long-context Language Understanding | LongBench     | M-Avg                           | 55.97   | 219
Science Question Answering          | ARC-C         | --                              | --      | 127
Graduate-level Question Answering   | GPQA          | Accuracy                        | 34.2    | 114
Question Answering                  | SQuAD         | Exact Match                     | 90.33   | 50
Language Understanding              | MMLU          | RA                              | 74      | 31
Multitask Language Understanding    | MMLU-Pro      | RA                              | 51.67   | 16
Long-context Understanding          | LongBench     | Average Context Length (tokens) | 4.48e+5 | 16
Mathematical Reasoning              | MATH          | Avg Context Length (tokens)     | 4.22e+3 | 16

Showing 10 of 16 rows
