
SchröMind: Mitigating Hallucinations in Multimodal Large Language Models via Solving the Schrödinger Bridge Problem

About

Recent advancements in Multimodal Large Language Models (MLLMs) have achieved significant success across various domains. However, their use in high-stakes fields like healthcare remains limited due to persistent hallucinations, where generated text contradicts or ignores visual input. We contend that MLLMs can comprehend images but struggle to produce accurate token sequences. Minor perturbations can shift attention from truthful to untruthful states, and the autoregressive nature of text generation often prevents error correction. To address this, we propose SchröMind, a novel framework that reduces hallucinations by solving the Schrödinger bridge problem. It establishes a token-level mapping between hallucinatory and truthful activations with minimal transport cost through lightweight training, while preserving the model's original capabilities. Extensive experiments on the POPE and MME benchmarks demonstrate the superiority of SchröMind, which achieves state-of-the-art performance while introducing only minimal computational overhead.
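To make the idea of a minimal-cost mapping between two sets of activations concrete, here is a minimal sketch. It is not the authors' method. It assumes the static, discrete form of the Schrödinger bridge problem with a quadratic cost, which coincides with entropy-regularized optimal transport and can be solved with Sinkhorn iterations. The activations are random toy vectors, and the names `sinkhorn_plan`, `barycentric_map`, `hallucinatory`, `truthful`, and `steered` are hypothetical, as are the values of `eps` and `n_iters`.

```python
import numpy as np

def sinkhorn_plan(x, y, eps=0.1, n_iters=500):
    """Entropic OT plan between two uniformly weighted point clouds.

    x: (n, d) source points, y: (m, d) target points.
    eps is an assumed regularization strength on the max-normalized cost.
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)          # uniform source weights
    b = np.full(m, 1.0 / m)          # uniform target weights
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / cost.max()         # normalize so eps has a stable scale
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)              # rescale rows toward marginal a
        v = b / (K.T @ u)            # rescale columns to match marginal b
    return u[:, None] * K * v[None, :]

def barycentric_map(plan, y):
    """Send each source point to the plan-weighted average of the targets."""
    return (plan @ y) / plan.sum(axis=1, keepdims=True)

# Toy stand-ins for activations (assumption: real ones would come from an MLLM).
rng = np.random.default_rng(0)
hallucinatory = rng.normal(0.0, 1.0, size=(64, 8))
truthful = rng.normal(1.0, 1.0, size=(64, 8))

plan = sinkhorn_plan(hallucinatory, truthful)
steered = barycentric_map(plan, truthful)
```

In this toy setting each row of `steered` is a convex combination of rows of `truthful`, so the mapped points stay inside the region spanned by the target samples. The framework described in the abstract learns its mapping through lightweight training, which this sketch does not attempt to reproduce.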

Ziqiang Shi, Rujie Liu, Shanshan Yu, Satoshi Munakata, Koichi Shirahata • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Object Hallucination Evaluation | MS-COCO (POPE Adversarial) | Accuracy | 85.43 | 80 |
| Object Hallucination Evaluation | MS-COCO POPE (Popular) | Accuracy | 87.1 | 76 |
| Object Hallucination Evaluation | MS-COCO POPE Random | Accuracy | 90.86 | 55 |
| Object Hallucination Probing | GQA POPE Popular | Accuracy | 84.83 | 33 |
| Object Hallucination Probing | A-OKVQA (Adversarial split) | Accuracy | 79.1 | 27 |
| Object Hallucination Probing | GQA Adversarial | Accuracy | 81.76 | 24 |
| Object presence hallucination evaluation | POPE A-OKVQA Popular 2022 | Accuracy | 85.63 | 15 |
| Object presence hallucination evaluation | POPE GQA 2019 (Random) | Accuracy | 89.23 | 15 |
| Object Hallucination Probing | A-OKVQA (Random split) | Accuracy | 90.83 | 12 |