Entailed Opinion Matters: Improving the Fact-Checking Performance of Language Models by Relying on their Entailment Ability
About
Automated fact-checking has been a challenging task for the research community. Prior work has explored various strategies, such as end-to-end training, retrieval-augmented generation, and prompt engineering, to build robust fact-checking systems. However, the accuracy of these systems has not been high enough for real-world deployment. In contrast, we propose a new learning paradigm in which evidence classifications and entailed justifications produced by generative language models (GLMs) are used to train encoder-only language models (ELMs). We conducted a rigorous set of experiments, comparing our approach with recent works as well as various prompting and fine-tuning strategies. Additionally, we performed ablation studies, error analysis, a quality analysis of model explanations, and a domain generalisation study to provide a comprehensive understanding of our approach.
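The paradigm above can be sketched as a simple data-construction step: a GLM first labels each evidence piece and writes an entailed justification, and these outputs are packed together with the claim into the text on which an ELM classifier is fine-tuned. The function and special tokens below are purely illustrative assumptions, not the released implementation:

```python
def build_elm_input(claim, evidence_verdicts, justification):
    """Pack the claim, per-evidence GLM verdicts, and the GLM's entailed
    justification into a single input string for an encoder-only model.
    Special tokens ([CLAIM], [EVIDENCE], ...) are hypothetical markers."""
    evidence_part = " ".join(
        f"[EVIDENCE] {text} [VERDICT] {verdict}"
        for text, verdict in evidence_verdicts
    )
    return f"[CLAIM] {claim} {evidence_part} [JUSTIFICATION] {justification}"


example = build_elm_input(
    "The Eiffel Tower is in Berlin.",
    [("The Eiffel Tower is located in Paris, France.", "refutes")],
    "The evidence places the tower in Paris, so the claim is false.",
)
print(example)
```

In a full pipeline, strings like `example` (paired with gold veracity labels) would form the fine-tuning set for the ELM, letting the smaller encoder benefit from the GLM's entailment reasoning without needing the GLM at inference time.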
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Veracity Prediction | RAWFC (test) | -- | -- | 28 |
| Veracity Prediction | Factify-2 | Macro F1 | 78 | 13 |
| Veracity Prediction | Mocheg | Macro F1 | 57 | 13 |
| Veracity Prediction | VERITE | Macro F1 | 64 | 13 |
| Veracity Prediction | X-Fact (test) | Macro F1 | 46 | 13 |
| Veracity Prediction | Ru22fact (test) | Macro F1 | 73 | 13 |
| Veracity Prediction | LIAR RAW | -- | -- | 6 |