Step-wise Verification on MathVision
Best result: 61.7 Macro F1 (Qwen3-VL)
[Chart: Macro F1 by date; Nov 28, 2025]
Evaluation Results
Method             Model Type        Date      Macro F1
Qwen3-VL           Open-source...    2025.11   61.7
GPT-4o             Proprietary       2025.11   60.2
Gemini-2.0-Flash   Proprietary       2025.11   60.1
Qwen2.5-VL         Open-source...    2025.11   59
GPT-4o-Mini        Proprietary       2025.11   58.9
TIM-PRM            Open-source...    2025.11   58.3
TIM-PRM            Open-source...    2025.11   57.6
VisualPRM          Open-source...    2025.11   56.1
MM-PRM             Open-source...    2025.11   55.4
Qwen2.5-VL         Open-source...    2025.11   51.8
InternVL2.5        Open-source...    2025.11   51.7
MiniCPM-V2.6       Open-source...    2025.11   50.9
Qwen3-VL           Open-source...    2025.11   50.9
LLaVA-OV           Open-source...    2025.11   48.4
InternVL2.5        Open-source...    2025.11   48.4
InternVL2.5        Open-source...    2025.11   47.4
InternVL2.5        Open-source...    2025.11   45.5
LLaVA-OV           Open-source...    2025.11   43
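
For readers unfamiliar with the metric, here is a minimal sketch of how a Macro F1 score can be computed: the per-class F1 is calculated separately and the results are averaged with equal weight. The binary step labels (1 = step correct, 0 = step incorrect) and the function names are assumptions for illustration only; the benchmark's actual label scheme and evaluation script are not given on this page.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], cls: Hashable) -> float:
    """F1 for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Convention: F1 is 0 when the class never appears in gold or predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    classes = set(y_true) | set(y_pred)
    if not classes:
        raise ValueError("no labels given")
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical step-level labels (not benchmark data):
# 1 = reasoning step is correct, 0 = reasoning step is incorrect.
gold = [1, 1, 0, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
print(round(100 * macro_f1(gold, pred), 1))
```

Because every class counts equally in the average, a verifier that labels nearly all steps as correct scores poorly on the minority "incorrect" class, which pulls the Macro F1 down even when plain accuracy looks high.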