
Beyond In-Domain Detection: SpikeScore for Cross-Domain Hallucination Detection

About

Hallucination detection is critical for deploying large language models (LLMs) in real-world applications. Existing hallucination detection methods achieve strong performance when the training and test data come from the same domain, but they generalize poorly across domains. In this paper, we study an important yet overlooked problem, termed generalizable hallucination detection (GHD), which aims to train hallucination detectors on data from a single domain while ensuring robust performance across diverse related domains. In studying GHD, we simulate multi-turn dialogues following an LLM's initial response and observe an interesting phenomenon: hallucination-initiated multi-turn dialogues universally exhibit larger uncertainty fluctuations than factual ones across different domains. Based on this phenomenon, we propose a new score, SpikeScore, which quantifies abrupt fluctuations in multi-turn dialogues. Through both theoretical analysis and empirical validation, we demonstrate that SpikeScore achieves strong cross-domain separability between hallucinated and non-hallucinated responses. Experiments across multiple LLMs and benchmarks show that the SpikeScore-based detection method outperforms representative baselines in cross-domain generalization and surpasses advanced generalization-oriented methods, verifying its effectiveness in cross-domain hallucination detection.
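To make the idea of "abrupt fluctuations" concrete, here is a minimal Python sketch. It is not the paper's definition of SpikeScore, which the abstract does not give. It assumes each dialogue has already been reduced to one uncertainty value per turn, and it uses the largest absolute second-order difference as a stand-in spike measure. The names `spike_score` and `auroc` are hypothetical, and the AUROC helper is the standard pairwise-ranking formulation of the metric.

```python
from typing import Sequence


def spike_score(uncertainties: Sequence[float]) -> float:
    """Illustrative spike measure (an assumption, not the paper's formula).

    Returns the largest absolute second-order difference of the per-turn
    uncertainty trace, so a sharp up-then-down swing scores high and a
    flat or smoothly drifting trace scores low.
    """
    u = list(uncertainties)
    if len(u) < 3:
        # Fewer than three turns: no second-order difference exists.
        return 0.0
    return max(abs(u[i + 1] - 2 * u[i] + u[i - 1]) for i in range(1, len(u) - 1))


def auroc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up uncertainty traces, for illustration only.
hallucinated = [[0.2, 0.9, 0.1, 0.8], [0.3, 0.8, 0.2, 0.7]]
factual = [[0.2, 0.25, 0.2, 0.22], [0.3, 0.3, 0.35, 0.3]]

separability = auroc(
    [spike_score(t) for t in hallucinated],
    [spike_score(t) for t in factual],
)
```

A threshold on such a score would serve as the detector. The cross-domain question is whether a threshold chosen on one dataset still separates the two classes on another.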

Yongxin Deng, Zhen Fang, Sharon Li, Ling Chen • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Hallucination Detection | TriviaQA | AUROC | 0.8697 | 621 |
| Hallucination Detection | TriviaQA (test) | AUC-ROC | 86.97 | 243 |
| Hallucination Detection | CoQA | Mean AUROC | 0.8584 | 107 |
| Hallucination Detection | RAGTruth (test) | AUROC | 0.8535 | 99 |
| Hallucination Detection | MATH | Mean AUROC | 81.57 | 72 |
| Hallucination Detection | CommonsenseQA | Mean AUROC | 0.7563 | 62 |
| Hallucination Detection | RAGTruth | AUROC | 0.8535 | 58 |
| Hallucination Detection | SVAMP | Mean AUROC | 78.37 | 50 |
| Hallucination Detection | Belebele | Mean AUROC | 0.7719 | 48 |
| Hallucination Detection | Average Cross-domain | Mean AUROC | 0.7874 | 48 |
Showing 10 of 11 rows
