Beyond In-Domain Detection: SpikeScore for Cross-Domain Hallucination Detection

About

Hallucination detection is critical for deploying large language models (LLMs) in real-world applications. Existing hallucination detection methods achieve strong performance when the training and test data come from the same domain, but they generalize poorly across domains. In this paper, we study an important yet overlooked problem, termed generalizable hallucination detection (GHD), which aims to train hallucination detectors on data from a single domain while ensuring robust performance across diverse related domains. In studying GHD, we simulate multi-turn dialogues that follow an LLM's initial response and observe an interesting phenomenon: hallucination-initiated multi-turn dialogues universally exhibit larger uncertainty fluctuations than factual ones across different domains. Based on this phenomenon, we propose a new score, SpikeScore, which quantifies abrupt fluctuations in multi-turn dialogues. Through both theoretical analysis and empirical validation, we demonstrate that SpikeScore achieves strong cross-domain separability between hallucinated and non-hallucinated responses. Experiments across multiple LLMs and benchmarks demonstrate that the SpikeScore-based detection method outperforms representative baselines in cross-domain generalization and surpasses advanced generalization-oriented methods, verifying the effectiveness of our method in cross-domain hallucination detection.
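The abstract does not give the formula behind SpikeScore, so the snippet below is only an illustrative sketch of the general idea of scoring abrupt fluctuations in a sequence of per-turn uncertainty values. It assumes, purely for illustration, that a spike is measured as the largest absolute change between consecutive turns. The function name `spike_score` and the toy traces are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def spike_score(uncertainties: Sequence[float]) -> float:
    """Hypothetical spike measure (an assumption, not the paper's definition):
    the largest absolute change in uncertainty between consecutive turns."""
    if len(uncertainties) < 2:
        # A single turn has no fluctuation to measure.
        return 0.0
    return max(abs(b - a) for a, b in zip(uncertainties, uncertainties[1:]))


# Made-up per-turn uncertainty traces for two simulated dialogues
# (illustrative values only, not data from the paper).
factual_trace = [0.20, 0.22, 0.21, 0.23, 0.22]
hallucinated_trace = [0.25, 0.60, 0.30, 0.75, 0.35]

factual_score = spike_score(factual_trace)
hallucinated_score = spike_score(hallucinated_trace)

# Under this toy measure, the trace with larger swings gets the larger score.
print(factual_score < hallucinated_score)
```

A detector built on such a score would threshold it or feed it to a classifier; how the paper actually does this is not stated in the abstract.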

Yongxin Deng, Zhen Fang, Sharon Li, Ling Chen • 2026

Related benchmarks

Task	Dataset	Result
Hallucination Detection	TriviaQA	AUROC 0.8697	621
Hallucination Detection	TriviaQA (test)	AUC-ROC 86.97	243
Hallucination Detection	CoQA	Mean AUROC 0.8584	107
Hallucination Detection	RAGTruth (test)	AUROC 0.8535	99
Hallucination Detection	MATH	Mean AUROC 81.57	72
Hallucination Detection	CommonsenseQA	Mean AUROC 0.7563	62
Hallucination Detection	RAGTruth	AUROC 0.8535	58
Hallucination Detection	SVAMP	Mean AUROC 78.37	50
Hallucination Detection	Belebele	Mean AUROC 0.7719	48
Hallucination Detection	Average Cross-domain	Mean AUROC 0.7874	48

Showing 10 of 11 rows
