
Debating Truth: Debate-driven Claim Verification with Multiple Large Language Model Agents

About

State-of-the-art single-agent claim verification methods struggle with complex claims that require nuanced analysis of multifaceted evidence. Inspired by real-world professional fact-checkers, we propose DebateCV, the first debate-driven claim verification framework powered by multiple LLM agents. In DebateCV, two Debaters argue opposing stances to surface subtle errors in single-agent assessments. A decisive Moderator is then required to weigh the evidential strength of conflicting arguments to deliver an accurate verdict. Yet, zero-shot Moderators are biased toward neutral judgments, and no datasets exist for training them. To bridge this gap, we propose Debate-SFT, a post-training framework that leverages synthetic data to enhance agents' ability to effectively adjudicate debates for claim verification. Results show that our methods surpass state-of-the-art non-debate approaches in both accuracy (across various evidence conditions) and justification quality.
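The debate flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Agent` type, the `run_debate` and `format_transcript` helpers, the prompt wording, the stance phrasing, and the default of two rounds are all hypothetical, and the agents are plain functions standing in for LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical agent type: any function from a prompt string to a reply string.
# A real system would wrap an LLM call here; that wiring is deliberately omitted.
Agent = Callable[[str], str]


@dataclass
class Turn:
    speaker: str   # "affirmative" or "negative"
    argument: str  # the text the debater produced on this turn


def format_transcript(turns: List[Turn]) -> str:
    """Render the debate so far as plain text, one turn per line."""
    return "\n".join(f"{t.speaker}: {t.argument}" for t in turns)


def run_debate(claim: str, evidence: str, affirmative: Agent, negative: Agent,
               moderator: Agent, rounds: int = 2) -> Tuple[str, List[Turn]]:
    """Alternate two opposing debaters, then ask the moderator for a verdict."""
    # Stance wording is an assumption made for illustration only.
    debaters = (
        ("affirmative", affirmative, "the claim is true"),
        ("negative", negative, "the claim is false"),
    )
    transcript: List[Turn] = []
    for _ in range(rounds):
        for name, agent, stance in debaters:
            prompt = (
                f"Claim: {claim}\nEvidence: {evidence}\n"
                f"Debate so far:\n{format_transcript(transcript)}\n"
                f"Argue that {stance}."
            )
            transcript.append(Turn(name, agent(prompt)))
    # The moderator sees the full debate and returns the final verdict text.
    verdict = moderator(
        f"Claim: {claim}\nEvidence: {evidence}\n"
        f"Debate:\n{format_transcript(transcript)}\n"
        "Weigh the evidential strength of both sides and give a verdict."
    )
    return verdict.strip(), transcript
```

Passing the agents in as plain callables keeps the sketch independent of any particular model or API; a post-trained Moderator of the kind Debate-SFT targets would simply be a different function supplied as `moderator`.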

Haorui He, Yupeng Li, Dacheng Wen, Yang Chen, Reynold Cheng, Donglong Chen, Francis C. M. Lau • 2025

Related benchmarks

Task                              Dataset                                              Metric    Result  Rank
Claim Verification                AVeriTeC Golden (dev)                                Accuracy  83.4    28
Claim Verification                AVeriTeC Retrieved (H) (dev)                         Accuracy  72.8    28
Claim Verification                AVeriTeC Retrieved (I) (dev)                         Accuracy  73.6    28
Justification Quality Evaluation  AVeriTeC Retrieved (H) 50 correctly verified claims  MOS       3.67    6
