
Hallucination Detection in LLMs with Topological Divergence on Attention Graphs

About

Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models (LLMs). We introduce TOHA, a TOpology-based HAllucination detector for the RAG setting, which leverages a topological divergence metric to quantify the structural properties of graphs induced by attention matrices. Examining the topological divergence between prompt and response subgraphs reveals a consistent pattern: higher divergence values in specific attention heads correlate with hallucinated outputs, independent of the dataset. Extensive experiments, including evaluation on question answering and summarization tasks, show that our approach achieves state-of-the-art or competitive results on several benchmarks while requiring minimal annotated data and computational resources. Our findings suggest that analyzing the topological structure of attention matrices can serve as an efficient and robust indicator of factual reliability in LLMs.
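To make the idea concrete, here is a minimal toy sketch of one way to score a single attention head: treat the head's attention matrix as a weighted graph over tokens (distance = 1 − symmetrized attention), restrict it to the response tokens, and summarize the subgraph's structure by the total weight of its minimum spanning tree, a standard proxy for 0-dimensional topological features. This is an illustrative assumption-laden simplification, not the paper's exact divergence metric; the function names (`head_divergence`, `mst_weight`) and the MST-based score are ours.

```python
import numpy as np

def mst_weight(dist):
    """Total edge weight of a minimum spanning tree (Prim's algorithm)
    over a dense, symmetric distance matrix. The MST edge weights are a
    common proxy for 0-dimensional persistent-homology lifetimes."""
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()          # cheapest known edge into the tree, per node
    total = 0.0
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        total += candidates[j]
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return total

def head_divergence(attn, prompt_len):
    """Toy topological score for one attention head (NOT TOHA's metric):
    MST weight of the response-token subgraph, with distances derived
    from symmetrized attention weights."""
    sym = (attn + attn.T) / 2.0        # make the graph undirected
    dist = 1.0 - sym                   # strong attention -> short distance
    resp = dist[prompt_len:, prompt_len:]
    return float(mst_weight(resp))

# Usage: a random row-stochastic matrix standing in for one head's attention.
rng = np.random.default_rng(0)
raw = rng.random((8, 8))
attn = raw / raw.sum(axis=1, keepdims=True)
score = head_divergence(attn, prompt_len=4)
```

In the paper's setting, such per-head scores would be computed for the heads whose divergence best separates grounded from hallucinated responses, and aggregated into a detector; this sketch only shows the graph-construction step.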

Alexandra Bazarova, Aleksandr Yugay, Andrey Shulga, Alina Ermilova, Andrei Volodichev, Konstantin Polev, Julia Belikova, Rauf Parchiev, Dmitry Simakov, Maxim Savchenko, Andrey Savchenko, Serguei Barannikov, Alexey Zaytsev• 2025

Related benchmarks

Task                      Dataset      Metric   Result   Rank
Hallucination Detection   TriviaQA     AUROC    0.874    438
Hallucination Detection   TruthfulQA   AUROC    0.811    102
Hallucination Detection   GSM8K        AUROC    0.845    93
Hallucination Detection   NQ-Open      AUROC    0.818    61
Hallucination Detection   HaluEvalQA   AUROC    0.881    28
Hallucination Detection   SQuAD v2     AUROC    0.787    28
Hallucination Detection   UMWP         AUROC    0.872    28
