Fine-grained visual reasoning on MME Realworld Lite
[Chart: Avg@1 by date. Best result: MAPO, 55.8 Avg@1 (Apr 8, 2026).]
Evaluation Results
Method                        Tags                          Date      Avg@1
MAPO                          backbone=Ovis2.5-9B           2026.04   55.8
Ovis2.5-9B + GRPO             training=GRPO                 2026.04   55.5
GPT-5                         -                             2026.04   55.3
Thyme                         results_source=origina...     2026.04   55.2
Ovis2.5-9B + GSPO             training=GSPO                 2026.04   54.9
Ovis2.5-9B + DAPO             training=DAPO                 2026.04   50.3
Mini-o3                       reproduced_by_authors=...     2026.04   49.4
Gemini 2.5 Pro                -                             2026.04   49.2
DeepEyes                      reproduced_by_authors=...     2026.04   48.4
Ovis2.5-9B + Coldstart SFT    training=Supervised Fi...     2026.04   47.9
Ovis2.5-9B                    -                             2026.04   46.1
Ovis2.5-9B + PPO              training=PPO                  2026.04   43