LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction

About

Large language models (LLMs) often generate hallucinated content that lacks factual or contextual grounding, limiting their reliability in critical applications. Existing approaches such as supervised fine-tuning and reinforcement learning from human feedback are data-intensive and computationally expensive, while static parameter editing methods struggle with context-dependent errors and catastrophic forgetting. We propose LLM-CAS, a framework that formulates real-time hallucination correction as a hierarchical reinforcement learning problem. LLM-CAS trains an agent to learn a policy that dynamically selects temporary neuron perturbations during inference based on the current context. Unlike prior dynamic approaches that rely on heuristic or predefined adjustments, this policy-driven mechanism enables adaptive and fine-grained correction without permanent parameter modification. Experiments across multiple language models demonstrate that LLM-CAS consistently improves factual accuracy, achieving gains of 10.98 percentage points on StoryCloze, 2.71 points on TriviaQA, and 2.06 points on the MC1 score of TruthfulQA. These results outperform both static editing methods such as ITI and CAA and the dynamic SADI framework. Overall, LLM-CAS provides an efficient and context-aware solution for improving the reliability of LLMs, with promising potential for future multimodal extensions.
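The idea of a temporary, context-dependent neuron perturbation can be illustrated with a small sketch. This is not the LLM-CAS implementation: `TinyLayer`, `temporary_perturbation`, and `toy_policy` are hypothetical names, the "policy" is a hand-written rule standing in for the learned reinforcement learning agent, and the layer is a toy linear map rather than a language model. The sketch only shows the property the abstract emphasizes, namely that the perturbation applies during one forward pass and leaves the parameters unchanged afterwards.

```python
from contextlib import contextmanager


class TinyLayer:
    """Toy linear layer with one weight row per output neuron (hypothetical)."""

    def __init__(self, weights):
        self.weights = [list(row) for row in weights]

    def forward(self, x):
        # Each neuron's output is the dot product of its weight row with x.
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]


@contextmanager
def temporary_perturbation(layer, neuron_scales):
    """Scale the selected neurons' weights inside the block, then restore them."""
    saved = {i: list(layer.weights[i]) for i in neuron_scales}
    try:
        for i, scale in neuron_scales.items():
            layer.weights[i] = [w * scale for w in layer.weights[i]]
        yield layer
    finally:
        # Restore the original rows so nothing is modified permanently.
        for i, row in saved.items():
            layer.weights[i] = row


def toy_policy(context):
    """Assumed stand-in for a learned policy: maps a context to {neuron: scale}."""
    if sum(context) > 1.0:
        return {0: 0.5}  # dampen neuron 0 for this context only
    return {}  # leave the layer untouched


layer = TinyLayer([[1.0, 2.0], [3.0, 4.0]])
context = [1.0, 1.0]

baseline = layer.forward(context)
with temporary_perturbation(layer, toy_policy(context)):
    perturbed = layer.forward(context)
restored = layer.forward(context)

print(baseline, perturbed, restored)
```

In this sketch only the output of neuron 0 changes inside the `with` block, and the forward pass after the block matches the baseline again, which is the contrast with static editing methods that write changes into the parameters.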

Jensen Zhang, Ningyuan Liu, Yijia Fan, Zihao Huang, Qinglin Zeng, Kaitong Cai, Jian Wang, Keze Wang • 2025

Related benchmarks

Task | Dataset | Result | Rank
Question Answering | BoolQ | Accuracy 74.47 | 240
Question Answering | WinoGrande (WG) | Accuracy 52.9 | 98
Story completion | StoryCloze | Accuracy 76.04 | 65
Open-domain Question Answering | TriviaQA | EM 44.31 | 62
Toxicity Detection | Toxigen | Score 47.63 | 25
Open-ended generation | TruthfulQA Open-ended | True Score 75.12 | 16
Multiple-choice Question Answering | TruthfulQA Multiple-choice | MC1 Score 35.47 | 15
Multiple-choice Question Answering | SST-2 | Accuracy 91.3 | 5