Visual Mathematical Reasoning on MathVista (BoN@8)
Top result: 83.5 BoN@8 Accuracy (InternVL2.5-38B + EVPV-PRM)
[Chart: BoN@8 Accuracy and Delta (BoN@8 - Pass@1) over time, Mar 17, 2026 to Mar 24, 2026]
Updated 24d ago
Evaluation Results
Method                        Settings                    Date     BoN@8 Accuracy  Delta (BoN@8 - Pass@1)
InternVL2.5-38B + EVPV-PRM    Policy Model=InternVL2...   2026.03  83.5            11.6
InternVL2.5-26B + EVPV-PRM    Policy Model=InternVL2...   2026.03  79.6            11.4
InternVL2.5-8B + EVPV-PRM     Policy Model=InternVL2...   2026.03  76.3            11.8
InternVL2.5-38B + VisualPRM   Policy Model=InternVL2...   2026.03  73.9            2
InternVL2.5-26B + VisualPRM   Policy Model=InternVL2...   2026.03  73.1            4.9
InternVL2.5-38B               Policy Model=InternVL2...   2026.03  71.9            -
Gemini-2.0-Flash              Reranking Strategy=Bes...   2026.03  70.4            -
InternVL2.5-8B + VisualPRM    Policy Model=InternVL2...   2026.03  68.5            4
InternVL2.5-26B               Policy Model=InternVL2...   2026.03  68.2            -
Claude-3.5-Sonnet             Reranking Strategy=Bes...   2026.03  65.3            -
InternVL2.5-8B                Policy Model=InternVL2...   2026.03  64.5            -
PEPO_G                        Backbone=Qwen2.5-VL-3B...   2026.03  63.48           -
PAPO_G                        Backbone=Qwen2.5-VL-3B...   2026.03  61.38           -
GPT-4o                        Reranking Strategy=Bes...   2026.03  60              -
GRPO                          Backbone=Qwen2.5-VL-3B...   2026.03  59.34           -
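
The Delta column is defined as BoN@8 minus Pass@1, so the Pass@1 score implied by each row can be recovered by subtraction. The sketch below does this for the six rows that report a delta. The function and variable names are illustrative only and do not come from the leaderboard, and the values are copied from the results listed above.

```python
# Rows that report a delta, copied from the leaderboard:
# (method, BoN@8 accuracy, delta = BoN@8 - Pass@1).
ROWS = [
    ("InternVL2.5-38B + EVPV-PRM", 83.5, 11.6),
    ("InternVL2.5-26B + EVPV-PRM", 79.6, 11.4),
    ("InternVL2.5-8B + EVPV-PRM", 76.3, 11.8),
    ("InternVL2.5-38B + VisualPRM", 73.9, 2.0),
    ("InternVL2.5-26B + VisualPRM", 73.1, 4.9),
    ("InternVL2.5-8B + VisualPRM", 68.5, 4.0),
]


def implied_pass_at_1(bon8: float, delta: float) -> float:
    """Invert Delta = BoN@8 - Pass@1 to get Pass@1 = BoN@8 - Delta.

    Rounded to one decimal place, the precision used in the table.
    """
    return round(bon8 - delta, 1)


def implied_baselines(rows):
    """Map each method name to the Pass@1 score implied by its row."""
    return {method: implied_pass_at_1(bon8, delta) for method, bon8, delta in rows}


if __name__ == "__main__":
    for method, pass_at_1 in implied_baselines(ROWS).items():
        print(f"{method}: implied Pass@1 = {pass_at_1}")
```

For these rows the implied Pass@1 values are 71.9, 68.2, and 64.5 for the 38B, 26B, and 8B policy models respectively, which coincide with the scores listed for the same InternVL2.5 models without a PRM. This suggests, but does not confirm, that those rows serve as the Pass@1 baselines.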