
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs

About

Large Language Models (LLMs) still struggle with natural language reasoning tasks. Motivated by the society of minds (Minsky, 1988), we propose ReConcile, a multi-model multi-agent framework designed as a round table conference among diverse LLM agents. ReConcile enhances collaborative reasoning between LLM agents via multiple rounds of discussion, learning to convince other agents to improve their answers, and employing a confidence-weighted voting mechanism that leads to a better consensus. In each round, ReConcile initiates discussion between agents via a 'discussion prompt' that consists of (a) grouped answers and explanations generated by each agent in the previous round, (b) their confidence scores, and (c) demonstrations of answer-rectifying human explanations, used for convincing other agents. Experiments on seven benchmarks demonstrate that ReConcile significantly improves LLMs' reasoning -- both individually and as a team -- surpassing prior single-agent and multi-agent baselines by up to 11.4% and even outperforming GPT-4 on three datasets. ReConcile also flexibly incorporates different combinations of agents, including API-based, open-source, and domain-specific models, leading to an 8% improvement on MATH. Finally, we analyze the individual components of ReConcile, demonstrating that the diversity originating from different models is critical to its superior performance. Code: https://github.com/dinobby/ReConcile
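The confidence-weighted voting step mentioned above can be illustrated with a minimal sketch. It assumes each agent returns an answer and a confidence in [0, 1], and that raw confidences are summed per answer; the function name and the input format are hypothetical, and the paper's actual implementation (in the linked repository) may rescale or calibrate confidences before voting.

```python
from collections import defaultdict


def confidence_weighted_vote(responses):
    """Pick the answer with the largest summed confidence.

    `responses` is a list of (answer, confidence) pairs, one per agent.
    This is an illustrative assumption, not the paper's exact procedure.
    """
    if not responses:
        raise ValueError("need at least one agent response")
    totals = defaultdict(float)
    for answer, confidence in responses:
        totals[answer] += confidence
    # Ties resolve to the answer that was seen first.
    return max(totals, key=totals.get)


# One confident agent can outweigh two unsure agents, unlike a plain majority vote.
votes = [("A", 0.95), ("B", 0.4), ("B", 0.4)]
winner = confidence_weighted_vote(votes)
```

In this example a simple majority vote would choose "B", while the weighted vote favors the single high-confidence answer.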

Justin Chih-Yao Chen, Swarnadeep Saha, Mohit Bansal • 2023

Related benchmarks

Task                                 Dataset        Result          Rank
Question Answering                   ARC Challenge  --              906
Mathematical Reasoning               GSM8K (test)   Accuracy 89.8   900
Mathematical Reasoning               MATH           Accuracy 50.7   882
Medical Question Answering           MedMCQA        Accuracy 52     346
Code Generation                      MBPP (test)    Pass@1 77.2     298
Long-context Language Understanding  LongBench      M-Avg 52.55     292
Mathematical Reasoning               AIME 2025      Accuracy 70     227
Visual Question Answering            A-OKVQA        Acc 65.5        202
Science Question Answering           ARC-C          --              193
Graduate-level Question Answering    GPQA           Accuracy 30.8   184
Showing 10 of 62 rows
