Visual Mathematical Reasoning on MathVista 1.0 (testmini)
[Chart: Accuracy over time. Top result: 67.6 (Qwen2.5-VL-7B), Mar 3, 2026.]
Updated 1mo ago
Evaluation Results
Method                          Method details             Date     Accuracy
Qwen2.5-VL-7B                   Backbone=Qwen2.5-VL-7B     2026.03  67.6
Qwen2.5-VL-7B + SFT+GRPO        Backbone=Qwen2.5-VL-7B...  2026.03  66.5
Qwen2.5-VL-7B + SFT             Backbone=Qwen2.5-VL-7B...  2026.03  66.3
Qwen2.5-VL-7B + SFT+DPO         Backbone=Qwen2.5-VL-7B...  2026.03  65.5
Qwen2.5-VL-7B + SFT+DPO+GRPO    Backbone=Qwen2.5-VL-7B...  2026.03  65.2
Qwen2.5-VL-7B + SPA-VL          Backbone=Qwen2.5-VL-7B...  2026.03  64
Qwen2.5-VL-7B + SaFeR-VLM       Backbone=Qwen2.5-VL-7B...  2026.03  63.9
Qwen2.5-VL-3B + SFT             Backbone=Qwen2.5-VL-3B...  2026.03  55.3
Qwen2.5-VL-3B                   Backbone=Qwen2.5-VL-3B     2026.03  54.3
Qwen2.5-VL-3B + SFT+DPO         Backbone=Qwen2.5-VL-3B...  2026.03  54
Qwen2.5-VL-3B + VLGuard         Backbone=Qwen2.5-VL-3B...  2026.03  53.8
Qwen2.5-VL-3B + SPA-VL          Backbone=Qwen2.5-VL-3B...  2026.03  53.3
Qwen2.5-VL-7B + TIS             Backbone=Qwen2.5-VL-7B...  2026.03  53.2
Qwen2.5-VL-7B + VLGuard         Backbone=Qwen2.5-VL-7B...  2026.03  52.7
Qwen2.5-VL-3B + SaFeR-VLM       Backbone=Qwen2.5-VL-3B...  2026.03  52.6
Qwen2.5-VL-3B + SFT+GRPO        Backbone=Qwen2.5-VL-3B...  2026.03  52
Qwen2.5-VL-3B + SFT+DPO+GRPO    Backbone=Qwen2.5-VL-3B...  2026.03  51.9
Qwen2.5-VL-3B + TIS             Backbone=Qwen2.5-VL-3B...  2026.03  45.6
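
The results above are easier to compare as differences from each untuned backbone. The following is a minimal sketch of that arithmetic, using only the accuracies listed on this page. The names `RESULTS`, `deltas_vs_base`, and the `"base"` key are illustrative assumptions and not part of the leaderboard.

```python
# Accuracy values copied from the results listed above (MathVista testmini).
# "base" marks the untuned backbone; this key name is an assumption made
# for the example only.
RESULTS = {
    "Qwen2.5-VL-7B": {
        "base": 67.6,
        "SFT+GRPO": 66.5,
        "SFT": 66.3,
        "SFT+DPO": 65.5,
        "SFT+DPO+GRPO": 65.2,
        "SPA-VL": 64.0,
        "SaFeR-VLM": 63.9,
        "TIS": 53.2,
        "VLGuard": 52.7,
    },
    "Qwen2.5-VL-3B": {
        "base": 54.3,
        "SFT": 55.3,
        "SFT+DPO": 54.0,
        "VLGuard": 53.8,
        "SPA-VL": 53.3,
        "SaFeR-VLM": 52.6,
        "SFT+GRPO": 52.0,
        "SFT+DPO+GRPO": 51.9,
        "TIS": 45.6,
    },
}


def deltas_vs_base(results):
    """Return, per backbone, each method's accuracy minus the backbone's own."""
    out = {}
    for backbone, scores in results.items():
        base = scores["base"]
        out[backbone] = {
            method: round(score - base, 1)  # one decimal, matching the page
            for method, score in scores.items()
            if method != "base"
        }
    return out


DELTAS = deltas_vs_base(RESULTS)

for backbone, methods in DELTAS.items():
    # Sort from the smallest drop (or largest gain) to the largest drop.
    for method, delta in sorted(methods.items(), key=lambda kv: -kv[1]):
        print(f"{backbone} + {method}: {delta:+.1f}")
```

On these numbers, the only variant that scores above its backbone is Qwen2.5-VL-3B + SFT; every other listed method scores below the untuned model of the same size.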