
Hallucination-aware intermediate representation edit in large vision-language models

About

Large Vision-Language Models (LVLMs) have demonstrated exceptional performance in multimodal reasoning and complex scene understanding. However, they still suffer from significant hallucination issues, producing outputs that contradict the visual evidence. Recent work on hallucination mitigation has focused on retraining-based methods and Contrastive Decoding (CD). While both are effective, retraining requires substantial training resources and CD introduces dual inference overhead, which limits their practical applicability. To address this, we propose a framework that dynamically detects hallucination-related representations and performs hallucination-eliminating edits on them. With minimal additional computational cost, we achieve state-of-the-art performance on existing benchmarks. Extensive experiments demonstrate the effectiveness of our approach, highlighting its efficient and robust hallucination elimination and its strong controllability over hallucinations. Code is available at https://github.com/ASGO-MM/HIRE
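The abstract describes detecting hallucination-related intermediate representations at inference time and editing them in place. A minimal sketch of one such edit is shown below: scoring each hidden state against a learned "hallucination direction" and projecting that component out of flagged tokens. The function name, the threshold-based detection, and the linear-projection edit are all illustrative assumptions, not the paper's actual algorithm (see the linked repository for that).

```python
import numpy as np

def edit_hidden_states(hidden, direction, threshold=0.5, alpha=1.0):
    """Detect and edit hidden states aligned with a hallucination direction.

    hidden:    (num_tokens, dim) array of intermediate hidden states
    direction: (dim,) vector assumed to encode hallucination-related content
    threshold: alignment score above which a token is flagged (assumption)
    alpha:     edit strength; 1.0 removes the full component along `direction`
    """
    d = direction / np.linalg.norm(direction)
    scores = hidden @ d                        # per-token alignment with d
    mask = np.abs(scores) > threshold          # "detected" hallucination tokens
    edited = hidden.copy()
    # Subtract the component along d only for flagged tokens.
    edited[mask] -= alpha * np.outer(scores[mask], d)
    return edited, mask
```

In a real model this would run inside a forward hook on a chosen transformer layer, so only flagged tokens pay the (cheap) projection cost, matching the abstract's claim of minimal additional computation.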

Wei Suo, Hanzu Zhang, Lijun Zhang, Ji Ma, Peng Wang, Yanning Zhang• 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Hallucination Evaluation | AMBER | CHAIR | 8.6 | 172 |
| Object Hallucination Evaluation | CHAIR | -- | -- | 108 |
| Object Hallucination Evaluation | MSCOCO POPE | Random Accuracy | 90.37 | 47 |
| Object Hallucination Evaluation | POPE GQA (test) | Average Accuracy | 84.72 | 29 |
| Hallucination Evaluation | CHAIR MSCOCO 2014 | CHAIRs Score | 39 | 28 |
| Object Hallucination Evaluation | A-OKVQA POPE | Random Accuracy | 88.9 | 21 |
