
HalluZig: Hallucination Detection using Zigzag Persistence

About

The factual reliability of Large Language Models (LLMs) remains a critical barrier to their adoption in high-stakes domains due to their propensity to hallucinate. Current detection methods often rely on surface-level signals from the model's output, overlooking the failures that occur within the model's internal reasoning process. In this paper, we introduce a new paradigm for hallucination detection by analyzing the dynamic topology of the evolution of the model's layer-wise attention. We model the sequence of attention matrices as a zigzag graph filtration and use zigzag persistence, a tool from Topological Data Analysis, to extract a topological signature. Our core hypothesis is that factual and hallucinated generations exhibit distinct topological signatures. We validate our framework, HalluZig, on multiple benchmarks, demonstrating that it outperforms strong baselines. Furthermore, our analysis reveals that these topological signatures generalize across different models and that hallucination detection is possible using structural signatures from only partial network depth.
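To make the pipeline in the abstract concrete, here is a simplified sketch: each layer's attention matrix is thresholded into an undirected token graph, and a per-layer topological feature is extracted. As a stand-in for full zigzag persistence (which tracks births and deaths of features across the alternating sequence of layer graphs and their unions, typically via a library such as Dionysus), this sketch computes only the 0-dimensional Betti number, i.e. the count of connected components. All function names, the threshold `tau`, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import random

def attention_to_graph(attn, tau=0.1):
    """Threshold a square attention matrix (list of lists) into an undirected
    graph: keep edge (i, j) when either direction's weight exceeds tau."""
    n = len(attn)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if attn[i][j] > tau or attn[j][i] > tau:
                edges.add((i, j))
    return n, edges

def betti_0(n, edges):
    """0-dimensional Betti number (connected components) via union-find."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    return len({find(v) for v in range(n)})

def layerwise_signature(attn_stack, tau=0.1):
    """Per-layer component counts: a crude scalar signature per layer, standing
    in for the richer zigzag barcode over adjacent-layer union graphs."""
    return [betti_0(*attention_to_graph(a, tau)) for a in attn_stack]

# Toy example: 3 "layers" of random attention over 6 tokens (illustrative only).
random.seed(0)
layers = [[[random.random() * 0.3 for _ in range(6)] for _ in range(6)]
          for _ in range(3)]
print(layerwise_signature(layers, tau=0.2))
```

In the paper's setting, the resulting signature vector (or barcode statistics) would feed a classifier separating factual from hallucinated generations; the fragmentation or consolidation of the token graph across depth is the structural signal being exploited.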

Shreyas N. Samaga, Gilberto Gonzalez Arroyo, Tamal K. Dey • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Hallucination Detection | TruthfulQA (test) | AUROC | 73.3 | 91 |
| Hallucination Detection | NQ-Open | AUROC | 0.73 | 27 |
| Hallucination Detection | FAVA Annotated Dataset | AUROC | 83.28 | 16 |
| Hallucination Detection | RAGTruth Summarization (Llama-2-7b) | AUROC | 73.37 | 4 |
| Hallucination Detection | RAGTruth Summarization (Llama-2-13b) | AUROC | 72.9 | 4 |
| Hallucination Detection | RAGTruth Summarization (Mistral-7b) | AUROC | 74.45 | 4 |
