REFLEX: Self-Refining Explainable Fact-Checking via Verdict-Anchored Style Control

About

The prevalence of fake news on social media demands automated fact-checking systems that provide accurate verdicts with faithful explanations. However, existing large language model (LLM)-based approaches ignore deceptive misinformation styles in LLM-generated explanations, resulting in unfaithful rationales that can mislead human judgments. They also rely heavily on external knowledge sources, which introduces hallucinations and high latency that undermine the reliability and responsiveness crucial for real-time use. To address these challenges, we propose REason-guided Fact-checking with Latent EXplanations (REFLEX), a self-refining paradigm that explicitly controls reasoning style anchored on the verdict. REFLEX uses self-disagreement veracity signals between the backbone model and its fine-tuned variant to construct steering vectors, naturally disentangling fact from style. Experiments on real-world datasets show that REFLEX achieves state-of-the-art performance with LLaMA-series models using only 465 self-refined samples. Moreover, owing to its transferability, REFLEX yields up to a 7.54% gain on in-the-wild data. Our results further demonstrate that our method effectively mitigates faithful hallucination, thereby guiding the model toward more accurate verdicts than previous work in explainable fact-checking.
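The abstract does not say how the steering vectors are actually built, so the following is only a minimal sketch of one generic recipe (difference-of-means activation steering) applied to disagreement cases. Everything here is an assumption made for illustration: the function names `select_disagreements`, `steering_vector`, and `steer` are hypothetical, the activations are plain Python lists standing in for hidden states, and the scaling factor `alpha` is not taken from the paper.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def select_disagreements(
    backbone_preds: Sequence[str],
    finetuned_preds: Sequence[str],
    gold: Sequence[str],
) -> Tuple[List[int], List[int]]:
    """Return indices where the two models disagree, split by which one is right.

    Assumption: "self-disagreement" is read here as the backbone and its
    fine-tuned variant predicting different verdicts for the same claim.
    """
    finetuned_right: List[int] = []
    backbone_right: List[int] = []
    for i, (b, f, g) in enumerate(zip(backbone_preds, finetuned_preds, gold)):
        if b == f:
            continue  # the models agree, so there is no disagreement signal
        if f == g:
            finetuned_right.append(i)
        elif b == g:
            backbone_right.append(i)
    return finetuned_right, backbone_right


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized activation vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    dim = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def steering_vector(
    pos_acts: Sequence[Sequence[float]],
    neg_acts: Sequence[Sequence[float]],
) -> Vector:
    """Difference of mean activations between two groups of samples."""
    pos_mean = mean_vector(pos_acts)
    neg_mean = mean_vector(neg_acts)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def steer(hidden: Sequence[float], vector: Sequence[float], alpha: float = 1.0) -> Vector:
    """Shift a hidden state along the steering direction (alpha is hypothetical)."""
    return [h + alpha * v for h, v in zip(hidden, vector)]
```

In a real model the activations would come from a chosen layer's hidden states rather than hand-written lists, and the layer and the value of `alpha` would need to be tuned; none of those choices are stated in the abstract.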

Chuyi Kong, Gao Wei, Jing Ma, Hongzhan Lin, Yuxi Sun • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Veracity Prediction | LIAR RAW | Macro F1: 50.59 | 32 |
| Veracity Prediction | RAWFC | Macro F1 Score: 64.99 | 26 |
| Explanation quality evaluation | LIAR-RAW (test) | ChatGPT Meaningfulness Score: 1.65 | 7 |
| Explanation quality evaluation | RAW-FC | M Score: 1.9 | 7 |
| Explanation quality evaluation | LIAR RAW | Meaningfulness Score: 1.9 | 7 |
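For readers unfamiliar with the Macro F1 metric reported for veracity prediction, here is a small sketch of how it is commonly computed: the F1 score is calculated per class and then averaged with equal weight per class. The function name `macro_f1` and the toy labels are illustrative only, and it is an assumption that the listed values are this quantity expressed as a percentage.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every label seen in either sequence."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one sample")

    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy two-class example (labels are made up for illustration).
gold = ["true", "true", "false", "false"]
pred = ["true", "false", "false", "false"]
score = macro_f1(gold, pred)
```

Because every class counts equally, a model that does well only on the majority verdict class scores lower on Macro F1 than on plain accuracy, which matters for fact-checking datasets with imbalanced labels.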
