GraphCheck: Breaking Long-Term Text Barriers with Extracted Knowledge Graph-Powered Fact-Checking
About
Large language models (LLMs) are widely used, but they often generate subtle factual errors, especially in long-form text. These errors can be critical in specialized domains such as medicine. Existing fact-checking methods that rely on grounding documents face two main challenges: (1) they struggle to understand complex multi-hop relations in long documents, often overlooking subtle factual errors; (2) most specialized methods rely on pairwise comparisons, requiring multiple model calls and incurring high resource and computational costs. To address these challenges, we propose GraphCheck, a fact-checking framework that uses extracted knowledge graphs to enhance text representation. Graph Neural Networks further process these graphs as a soft prompt, enabling LLMs to incorporate structured knowledge more effectively. Enhanced with graph-based reasoning, GraphCheck captures multi-hop reasoning chains that are often overlooked by existing methods, enabling precise and efficient fact-checking in a single inference call. Experimental results on seven benchmarks spanning both general and medical domains demonstrate up to a 7.1% overall improvement over baseline models. Notably, GraphCheck outperforms existing specialized fact-checkers and achieves performance comparable to state-of-the-art LLMs, such as DeepSeek-V3 and OpenAI-o1, with significantly fewer parameters.
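To make the graph-as-soft-prompt idea concrete, here is a minimal sketch (not the authors' implementation): triples extracted from a claim form a knowledge graph, one round of mean-aggregation message passing updates the node embeddings, and the result is pooled into a single vector standing in for the soft prompt that would be prepended to the LLM's input embeddings. The triples, 2-d embeddings, and single-layer GNN are illustrative assumptions.

```python
# Sketch only: the real GraphCheck GNN, triple extractor, and projection into
# the LLM embedding space are more elaborate; everything below is a toy stand-in.
from collections import defaultdict

def message_pass(nodes, edges, emb):
    """One GNN layer: each node averages its own and its neighbours' embeddings."""
    neigh = defaultdict(list)
    for head, _rel, tail in edges:   # treat edges as undirected for this sketch
        neigh[head].append(tail)
        neigh[tail].append(head)
    out = {}
    for n in nodes:
        vecs = [emb[n]] + [emb[m] for m in neigh[n]]
        out[n] = [sum(col) / len(vecs) for col in zip(*vecs)]
    return out

def soft_prompt(emb):
    """Mean-pool node embeddings into one vector (stand-in for the soft prompt)."""
    vecs = list(emb.values())
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# Toy triples extracted from a medical claim, with hypothetical 2-d embeddings
edges = [("aspirin", "treats", "headache"), ("aspirin", "is_a", "NSAID")]
nodes = ["aspirin", "headache", "NSAID"]
emb = {"aspirin": [1.0, 0.0], "headache": [0.0, 1.0], "NSAID": [1.0, 1.0]}

updated = message_pass(nodes, edges, emb)
prompt_vec = soft_prompt(updated)   # would be prepended to the LLM input
```

Stacking several such layers is what lets information propagate along multi-hop chains in the graph, which is the property the framework relies on for catching multi-hop factual errors in a single inference call.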
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Fact Checking | PubHealth | Balanced Accuracy | 73.6 | 26 |
| Fact Checking | COVID-Fact | Balanced Accuracy | 71.5 | 22 |
| Scientific Fact Verification | SciFact | -- | -- | 16 |
| Fact Checking | ExpertQA | Balanced Accuracy | 60.3 | 15 |
| Fact Checking | SciFact | Balanced Accuracy | 89.4 | 15 |
| Fact Checking | AggreFact CNN | Balanced Accuracy | 66.5 | 15 |
| Fact Checking | SummEval | Balanced Accuracy | 71.0 | 15 |
| Fact Checking | Average across General and Medical Domains | Overall Average | 71.1 | 15 |
| Fact Checking | AggreFact Xsum | Balanced Accuracy | 72.9 | 15 |
| Fact Checking | Reveal | Balanced Accuracy | 89.7 | 7 |