Multimodal Hallucination Evaluation on HallBench
[Chart: Score over time. Top score: 65.2 (GPT-5-Thinking), Sep 30, 2025.]
Updated 19d ago
Evaluation Results
Method             Model Category   Date      Score
GPT-5-Thinking     Close-s...       2025.09   65.2
Gemini-2.5-Pro     Close-s...       2025.09   64.1
VAPO-Thinker-7B    Our Models       2025.09   57.4
VLAA-Thinker-7B    Open-so...       2025.09   54.7
R1-OneVision-7B    Open-so...       2025.09   52.5
Qwen2.5-VL-7B      Open-so...       2025.09   50
Vision-R1-7B       Open-so...       2025.09   49.5
VAPO-Thinker-3B    Our Models       2025.09   49.5
InternVL2.5-8B     Open-so...       2025.09   49