
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding

About

Large Vision-Language Models (LVLMs) have obtained impressive performance in visual content understanding and multi-modal reasoning. Unfortunately, these large models suffer from serious hallucination problems and tend to generate fabricated responses. Recently, several Contrastive Decoding (CD) strategies have been proposed to alleviate hallucination by introducing disturbed inputs. Although great progress has been made, these CD strategies mostly apply a one-size-fits-all approach for all input conditions. In this paper, we revisit this process through extensive experiments. Related results show that hallucination causes are hybrid and each generative step faces a unique hallucination challenge. Leveraging these meaningful insights, we introduce a simple yet effective Octopus-like framework that enables the model to adaptively identify hallucination types and create a dynamic CD workflow. Our Octopus framework not only outperforms existing methods across four benchmarks but also demonstrates excellent deployability and expansibility. Code is available at https://github.com/LijunZhang01/Octopus.

Wei Suo, Lijun Zhang, Mengyang Sun, Lin Yuanbo Wu, Peng Wang, Yanning Zhang • 2025
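
To make the abstract's idea concrete, the following is a minimal sketch of a single contrastive decoding step on a toy vocabulary, plus a per-step selector that decides whether to contrast at all and against which disturbed input. It is not the authors' implementation (see the linked repository for that). The scoring rule follows the commonly used form (1 + alpha) * log p_clean - alpha * log p_disturbed with a plausibility cutoff beta; the names `contrastive_step`, `dynamic_step`, `choose_strategy`, and the strategy label "visual" are hypothetical and chosen only for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a plain list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_step(logits_clean, logits_disturbed, alpha=1.0, beta=0.1):
    """Pick the next token id by contrasting clean and disturbed predictions.

    Tokens whose clean probability is below beta * max clean probability are
    masked out, so the contrast cannot promote implausible tokens.
    """
    lp_clean = log_softmax(logits_clean)
    lp_dist = log_softmax(logits_disturbed)
    cutoff = math.log(beta) + max(lp_clean)
    scores = [
        (1 + alpha) * c - alpha * d if c >= cutoff else float("-inf")
        for c, d in zip(lp_clean, lp_dist)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def dynamic_step(logits_clean, disturbed_by_strategy, choose_strategy):
    """Assumed per-step dispatch: choose_strategy returns a key of
    disturbed_by_strategy, or None to fall back to plain greedy decoding."""
    name = choose_strategy(logits_clean)
    if name is None:
        return max(range(len(logits_clean)), key=logits_clean.__getitem__)
    return contrastive_step(logits_clean, disturbed_by_strategy[name])


clean = [2.0, 1.9, -3.0]
disturbed = {"visual": [2.0, 0.0, 5.0]}
greedy_choice = dynamic_step(clean, disturbed, lambda _: None)
contrast_choice = dynamic_step(clean, disturbed, lambda _: "visual")
```

In this toy case greedy decoding picks token 0, while contrasting against the disturbed logits picks token 1, because token 0 stays almost as likely without the clean input. A one-size-fits-all method would hard-code the strategy; the dynamic variant leaves that choice to a selector evaluated at every generation step.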

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Hallucination Evaluation | MMHal-Bench | MMHal Score: 2.61 | 306 |
| Hallucination Evaluation | AMBER | CHAIR: 6.1 | 222 |
| Hallucination Evaluation | POPE | -- | 217 |
| Generative Hallucination | AMBER Generative | Coverage (%): 49.2 | 81 |
| Object Hallucination Evaluation | MSCOCO POPE | Random Accuracy: 87.51 | 71 |
| Discriminative Object Hallucination | POPE MSCOCO Adversarial | Accuracy: 82.83 | 43 |
| Generative Hallucination | Object-HalBench | CHAIR_S Score: 20.8 | 43 |
| Object Hallucination Mitigation on Generative Tasks | AMBER | CHAIR: 4.8 | 38 |
| Discriminative Object Hallucination | POPE MSCOCO (Random) | Accuracy: 87.51 | 29 |
| Discriminative Object Hallucination | POPE MSCOCO Popular | Accuracy: 84.9 | 29 |
Showing 10 of 19 rows
