Entropy Distribution as a Fingerprint for Hallucinations in Generative Models

About

Large Language Models (LLMs) often generate factually incorrect outputs, commonly termed hallucinations, that undermine trust and limit deployment in high-stakes settings. Existing hallucination detection methods typically require multiple forward passes or access to model internals. In this work, we provide theoretical background and empirical evidence that the distribution of token-level entropies, beyond the mean captured by perplexity or length-normalized entropy, serves as a fingerprint of hallucination, with distributional shape and tail behavior carrying independent signal. We formalize hallucination detection as a statistical hypothesis test and propose the Calibrated Entropy Score (CES), a lightweight algorithm requiring only a single forward pass and black-box access to token logits. CES combines the mean and the maximum of the generated token entropies through a calibrated reference CDF, producing scores that are directly comparable across models and tasks. We establish finite-sample calibration guarantees via a novel random-length Dvoretzky–Kiefer–Wolfowitz inequality, and prove that CES detects hallucinations with probability converging to one exponentially fast in the generation length. Across eight QA benchmarks and ten generator models spanning open-source and API-access models, CES achieves the highest detection performance among all single-pass black-box methods while providing formal error guarantees that existing heuristics lack. Remarkably, CES is statistically indistinguishable from multi-sample methods that require far greater computational cost, closing the gap between lightweight and expensive detection and making it suitable for real-time, large-scale deployment.
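
To make the ingredients named in the abstract concrete, here is a minimal sketch of per-token entropy from logits, the mean and maximum statistics, and an empirical reference CDF. It is not the authors' implementation. The abstract does not state how CES combines the two calibrated statistics, so the `max` combination below is a placeholder assumption, as are all function names and the choice to build the reference CDFs from a held-out set of reference generations.

```python
import math
from bisect import bisect_right

def token_entropies(logits_per_token):
    """Shannon entropy (in nats) of the softmax distribution at each generated token."""
    entropies = []
    for logits in logits_per_token:
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        probs = [e / z for e in exps]
        entropies.append(-sum(p * math.log(p) for p in probs if p > 0.0))
    return entropies

def empirical_cdf(reference_values):
    """Return F(x) = fraction of reference values less than or equal to x."""
    ordered = sorted(reference_values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def entropy_score(logits_per_token, mean_cdf, max_cdf):
    """Hypothetical score: the larger of the two calibrated statistics.

    ASSUMPTION: the paper's actual combination rule is not given in the
    abstract; taking the max of the two CDF values is only a placeholder.
    """
    h = token_entropies(logits_per_token)
    mean_stat = sum(h) / len(h)
    max_stat = max(h)
    return max(mean_cdf(mean_stat), max_cdf(max_stat))

# Illustrative reference statistics (made-up numbers, not results from the paper).
mean_cdf = empirical_cdf([0.1, 0.2, 0.3])
max_cdf = empirical_cdf([0.2, 0.4, 0.6])

uncertain = [[0.0, 0.0, 0.0]]   # uniform over three tokens: high entropy
confident = [[50.0, 0.0, 0.0]]  # nearly all mass on one token: low entropy
```

Because each statistic is passed through a CDF, the resulting score lies in [0, 1] regardless of vocabulary size or model. This is one plausible reading of why the abstract describes the scores as comparable across models and tasks.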

Mattia J. Villani, Pranav Deshpande, Akshay Seshadri, Romina Yalovetzky, Niraj Kumar • 2026

Related benchmarks

Task                     Dataset                            Result              Rank
Hallucination Detection  TriviaQA                           --                  621
Hallucination Detection  GSM8K                              --                  115
Hallucination Detection  CoQA                               AUROC 0.694         108
Hallucination Detection  BioASQ                             AUROC 0.698         104
Hallucination Detection  SQuAD                              --                  82
Hallucination Detection  NQ-Open                            --                  63
Hallucination Detection  SVAMP                              --                  50
Hallucination Detection  80 Experiments Aggregated (test)   Average Rank 6.29   10
Hallucination Detection  DROP                               --                  2