From Debate to Deliberation: Structured Collective Reasoning with Typed Epistemic Acts

About

Multi-agent LLM systems increasingly tackle complex reasoning, yet their interaction patterns remain limited to voting, unstructured debate, or pipeline orchestration. None model deliberation: a phased process where differentiated participants exchange typed reasoning moves, preserve disagreements, and converge on accountable outcomes. We introduce Deliberative Collective Intelligence (DCI), specifying four reasoning archetypes, 14 typed epistemic acts, a shared workspace, and DCI-CF, a convergent flow algorithm that guarantees termination with a structured decision packet containing the selected option, residual objections, minority report, and reopen conditions. We evaluate on 45 tasks across seven domains using Gemini 2.5 Flash. On non-routine tasks (n=40), DCI significantly improves over unstructured debate (+0.95, 95% CI [+0.41, +1.54]). DCI excels on hidden-profile tasks requiring perspective integration (9.56, highest of any system on any domain) while failing on routine decisions (5.39), confirming task-dependence. DCI produces 100% structured decision packets and 98% minority reports, artifacts absent from all baselines. However, DCI consumes ~62x single-agent tokens, and single-agent generation outperforms DCI on overall quality. DCI's contribution is not that more agents are better, but that consequential decisions benefit from deliberative structure when process accountability justifies the cost.

Sunil Prakash • 2026
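
The abstract names four components of the structured decision packet: the selected option, residual objections, a minority report, and reopen conditions. A minimal Python sketch of such a packet follows. The class name, field names, field types, and the `missing_fields` helper are illustrative assumptions and are not taken from the paper; in particular, treating an empty field as "missing" is only one possible reading of packet completeness.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DecisionPacket:
    """Hypothetical container for the four packet components named in the abstract."""

    selected_option: str
    residual_objections: List[str] = field(default_factory=list)
    minority_report: str = ""
    reopen_conditions: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of components that are empty.

        Assumption: an empty string or empty list counts as missing.
        """
        names = (
            "selected_option",
            "residual_objections",
            "minority_report",
            "reopen_conditions",
        )
        return [name for name in names if not getattr(self, name)]


# Example values below are invented for illustration only.
packet = DecisionPacket(
    selected_option="Adopt option B",
    residual_objections=["Migration cost may be underestimated"],
    minority_report="One participant still prefers option A.",
    reopen_conditions=["Revisit if migration exceeds two sprints"],
)
print(packet.missing_fields())
```

Under these assumptions, a packet with all four components populated reports no missing fields, while a packet holding only a selected option reports the other three.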

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Decision Making | Deliberative decision-making tasks n=45 (overall) | Mean Tokens | 2.38e+5 | 5 |
| Hidden-Profile Integration | DCI Evaluation Suite Hidden-Prof | Quality Score | 9.56 | 5 |
| Process Artifact Analysis | Deliberative Decision-Making Evaluation Set | Decision Packet Completeness | 100 | 5 |
| Late-Evidence Analysis | DCI Evaluation Suite Late-Evid. | Quality Score | 9.24 | 5 |
| Policy Analysis | DCI Evaluation Suite Policy | Quality Score | 8.55 | 5 |
| Risk Assessment | DCI Evaluation Suite Risk | Quality Score | 8.48 | 5 |
| Disagreement Handling | DCI Evaluation Suite Disagree | Quality Score | 8.15 | 5 |
| Reasoning evaluation | Full task set (n=45) | Overall Score | 8.24 | 5 |
| Software Architecture | DCI Evaluation Suite Arch. | Quality Score | 8.13 | 5 |
| Routine Task Management | DCI Evaluation Suite Routine | Quality Score | 5.39 | 5 |
