General Multimodal Performance on POPE, HallusionBench, MMStar Average
Overall Score chart: PStar leads with 69.3 (May 19, 2026).
Evaluation Results

Method                 Configuration               Date      Overall Score
PStar                  Backbone=Qwen2.5-VL-7B      2026.05   69.3
PStar                  Backbone=Qwen2-VL-7B        2026.05   69
Qwen2.5-VL-7B          Model=Qwen2.5-VL-7B*        2026.05   67.3
Qwen2-VL-7B            Model=Qwen2-VL-7B*          2026.05   66.5
InternVL2-8B           Model=InternVL2-8B*         2026.05   63.6
PStar                  Backbone=Qwen2-VL-2B        2026.05   60.8
GPT-4V (0409)          Model=GPT-4V (0409)*        2026.05   60.6
Qwen2-VL-2B            Model=Qwen2-VL-2B*          2026.05   59.5
LLaMA-3.2-11B-Vision   Model=LLaMA-3.2-11B-Vi...   2026.05   59.4
InternVL2-2B           Model=InternVL2-2B*         2026.05   57.7
LLaVA-v1.5-7B          Model=LLaVA-v1.5-7B*        2026.05   48.9