Visual Mathematical Reasoning on MathVerse VO
[Chart: BoN@8 Accuracy and Delta_8 (BoN@8 - Pass@1), Mar 17, 2026. Top result: Gemini-2.0-Flash, 47.8 BoN@8 Accuracy. Updated 1mo ago.]
Evaluation Results
| Method | Configuration | Date | BoN@8 Accuracy | Delta_8 (BoN@8 - Pass@1) |
| --- | --- | --- | --- | --- |
| Gemini-2.0-Flash | Reranking Strategy=Bes... | 2026.03 | 47.8 | - |
| InternVL2.5-38B + EVPV-PRM | Policy Model=InternVL2... | 2026.03 | 47.67 | 10.77 |
| InternVL2.5-38B + VisualPRM | Policy Model=InternVL2... | 2026.03 | 46.7 | 9.8 |
| Claude-3.5-Sonnet | Reranking Strategy=Bes... | 2026.03 | 46.3 | - |
| GPT-4o | Reranking Strategy=Bes... | 2026.03 | 40.6 | - |
| InternVL2.5-26B + VisualPRM | Policy Model=InternVL2... | 2026.03 | 39.1 | 15.1 |
| InternVL2.5-38B | Policy Model=InternVL2... | 2026.03 | 36.9 | - |
| InternVL2.5-8B + VisualPRM | Policy Model=InternVL2... | 2026.03 | 35.8 | 13 |
| InternVL2.5-26B + EVPV-PRM | Policy Model=InternVL2... | 2026.03 | 32.47 | 8.47 |
| InternVL2.5-8B + EVPV-PRM | Policy Model=InternVL2... | 2026.03 | 29.47 | 6.67 |
| InternVL2.5-26B | Policy Model=InternVL2... | 2026.03 | 24 | - |
| InternVL2.5-8B | Policy Model=InternVL2... | 2026.03 | 22.8 | - |
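Since Delta_8 is defined as BoN@8 minus Pass@1, the Pass@1 accuracy implied by each reranked row can be recovered by subtraction. The following is a minimal sketch of that arithmetic using the values listed above; the names `ROWS`, `implied_pass_at_1`, and `PASS_AT_1` are illustrative and not part of the leaderboard.

```python
# (method, BoN@8 accuracy, Delta_8) copied from the rows that report a Delta_8.
ROWS = [
    ("InternVL2.5-38B + EVPV-PRM", 47.67, 10.77),
    ("InternVL2.5-38B + VisualPRM", 46.7, 9.8),
    ("InternVL2.5-26B + VisualPRM", 39.1, 15.1),
    ("InternVL2.5-8B + VisualPRM", 35.8, 13.0),
    ("InternVL2.5-26B + EVPV-PRM", 32.47, 8.47),
    ("InternVL2.5-8B + EVPV-PRM", 29.47, 6.67),
]


def implied_pass_at_1(bon_at_8: float, delta_8: float) -> float:
    """Rearrange Delta_8 = BoN@8 - Pass@1 to get Pass@1 = BoN@8 - Delta_8."""
    # Round to two decimals to hide binary floating-point noise.
    return round(bon_at_8 - delta_8, 2)


# Hypothetical helper mapping: method name -> implied Pass@1 accuracy.
PASS_AT_1 = {name: implied_pass_at_1(bon, delta) for name, bon, delta in ROWS}

if __name__ == "__main__":
    for name, value in PASS_AT_1.items():
        print(f"{name}: implied Pass@1 = {value}")
```

The implied values coincide with the scores listed for the plain InternVL2.5-38B, InternVL2.5-26B, and InternVL2.5-8B rows, which suggests those rows report the policy model without reranking, although the page does not state this explicitly.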