Multimodal Math Reasoning on V-Math
[Chart: Accuracy by date. Best result: AutoTool (Qwen3-8B), Accuracy 53, Dec 15, 2025.]
Evaluation Results
| Method | Details | Date | Accuracy |
| AutoTool (Qwen3-8B) | Size=8B, Backbone=Qwen3 | 2025.12 | 53 |
| AutoTool (Qwen2.5-VL-7B) | Size=7B, Backbone=Qwen... | 2025.12 | 44.3 |
| GPT4o | Size=- | 2025.12 | 41.4 |
| Qwen2.5-VL-72B-Instruct | Size=72B | 2025.12 | 24.5 |
| v-ToolRL | Size=2B | 2025.12 | 19.5 |