Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding
About
Large Language Models (LLMs) often hallucinate, generating content inconsistent with the input. Retrieval-Augmented Generation (RAG) and Reinforcement Learning from Human Feedback (RLHF) can mitigate hallucinations, but they require resource-intensive retrieval or large-scale fine-tuning. Decoding-based methods are lighter yet lack explicit hallucination control. To address this, we present Token-Guard, a token-level hallucination control method based on self-checking decoding. Token-Guard performs internal verification at each reasoning step to detect hallucinated tokens before they propagate. Candidate fragments are further evaluated in a latent space with explicit hallucination risk scoring, while iterative pruning and regeneration dynamically correct detected errors. Experiments on HALU datasets show that Token-Guard substantially reduces hallucinations and improves generation accuracy, offering a scalable, modular solution for reliable LLM outputs. Our code is publicly available.
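The decoding loop described above (per-step verification, risk scoring, then pruning and regeneration) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the risk scorer here is a hypothetical stand-in for Token-Guard's latent-space scoring, and "regeneration" is simplified to falling back to the next-best candidate.

```python
def hallucination_risk(source_facts, token):
    """Hypothetical risk scorer standing in for Token-Guard's
    latent-space scoring. Here, a token unsupported by the source
    context is simply treated as high-risk."""
    return 0.0 if token in source_facts else 0.9

def self_checking_decode(source_facts, candidates_per_step,
                         threshold=0.5, max_regen=3):
    """Greedy decoding with token-level self-checking:
    at each step, score the candidate tokens, prune those whose
    hallucination risk exceeds `threshold`, and regenerate (fall
    back to the next-safest candidate) up to `max_regen` times."""
    output = []
    for candidates in candidates_per_step:
        # Rank candidates from safest to riskiest.
        ranked = sorted(candidates,
                        key=lambda t: hallucination_risk(source_facts, t))
        chosen = None
        for token in ranked[:max_regen + 1]:
            if hallucination_risk(source_facts, token) <= threshold:
                chosen = token
                break
        if chosen is None:
            # All candidates failed the check: flag rather than emit.
            chosen = "[UNVERIFIED]"
        output.append(chosen)
    return output

# Toy example: candidates inconsistent with the source are pruned.
source = {"Paris", "is", "the", "capital", "of", "France"}
steps = [["Lyon", "Paris"], ["is"], ["the"],
         ["biggest", "capital"], ["of"], ["Germany", "France"]]
print(self_checking_decode(source, steps))
```

In a real system the candidate lists would come from the model's top-k next-token distribution and the risk score from the model's own hidden states; the control flow (score, prune, regenerate) is the part this sketch is meant to show.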
Benchmark results
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Question Answering | PubMedQA | EM | 0.00 | 18 |
| Question Answering | RAGTruth | F1 | 45.89 | 17 |
| Question Answering | CovidQA | F1 | 47.64 | 17 |
| Question Answering | DROP (nfl) | F1 | 67.69 | 17 |
| Question Answering | FinanceBench | EM | 45 | 12 |
| Question Answering | HaluEval | EM | 68 | 12 |
| Grounded Text Generation | RAGTruth | F1 | 33.14 | 11 |
| Grounded Text Generation | DROP (history) | F1 | 51.17 | 11 |
| Grounded Text Generation | HaluEval | F1 | 72.66 | 11 |
| Question Answering | DROP (history) | F1 | 68.52 | 5 |