Image+Text-to-Text Hallucination Evaluation on MMHal-Bench
[Chart: BERT Score; top entry TIGER at 79, May 29, 2026]
Evaluation Results
Method       Backbone          Date      BERT Score
TIGER        LLaVA-1.5-7B      2026.05   79
BoN+VisPRM   LLaVA-1.5-7B      2026.05   78.1
DeGF         LLaVA-1.5-7B      2026.05   78
BoN+CLIP     LLaVA-1.5-7B      2026.05   77.9
BoN+CycRew   LLaVA-1.5-7B      2026.05   77.7
VCD          LLaVA-1.5-7B      2026.05   77
Volcano      LLaVA-1.5-7B      2026.05   76.7
Frozen       LLaVA-1.5-7B      2026.05   75.7
TIGER        Qwen2.5-Omni-7B   2026.05   75.6
Volcano      Qwen2.5-Omni-7B   2026.05   74.2
VCD          Qwen2.5-Omni-7B   2026.05   72
BoN+CycRew   Qwen2.5-Omni-7B   2026.05   71.9
BoN+CLIP     Qwen2.5-Omni-7B   2026.05   71.7
BoN+VisPRM   Qwen2.5-Omni-7B   2026.05   71.4
Woodpecker   Qwen2.5-Omni-7B   2026.05   70.6
DeGF         Qwen2.5-Omni-7B   2026.05   70.6
Frozen       Qwen2.5-Omni-7B   2026.05   70.3
Woodpecker   LLaVA-1.5-7B      2026.05   67.2