
Multimodal Reasoning on MathVista (Accuracy, AvgLen, Ratio)

85.9 Accuracy

Qwen3-VL-32B-Thinking

[Chart: Accuracy over time, Dec 20, 2025 to Feb 10, 2026]
Updated 3d ago

Evaluation Results

Date      Accuracy   AvgLen   Ratio
2026.01   85.9       -        -
2026.01   81.4       -        -
2026.01   80.4       -        -
2026.01   75.8       -        -
2025.12   71.9       -        -
2025.12   71.2       -        -
2025.12   71.1       -        -
2026.01   70.9       -        -
2025.12   70         -        -
2025.12   70         -        -
2026.01   69.4       -        -
2025.12   69.2       -        -
2025.12   68.5       -        -
2025.12   68         -        -
2025.12   67.2       -        -
2025.12   63         -        -
2025.12   62.9       -        -
2025.12   61.7       -        -
2025.12   59.7       -        -
2025.12   59.5       -        -
2025.12   47.7       -        -
2026.02   45.9       299.5    6.5
2026.02   45.7       511.6    11.2
2026.02   45.3       91       2
2026.02   43.9       251.3    5.7
2026.02   37.6       89.7     2.4
2026.02   37.3       324.2    8.7
2026.02   36.8       232.2    6.3
2026.02   35.8       524.1    14.6
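
The page does not define the Ratio column. For the 2026.02 entries that report all three metrics, the listed values look consistent with Ratio being AvgLen divided by Accuracy, rounded to one decimal. The sketch below checks that reading against the listed numbers. Treat it as an observation about these rows, not the leaderboard's stated formula. The names `ROWS`, `implied_ratio`, and `max_deviation` are hypothetical and exist only for this illustration.

```python
# Rows copied from the 2026.02 entries: (accuracy, avg_len, ratio).
# Assumption: Ratio = AvgLen / Accuracy. The page itself does not say this.
ROWS = [
    (45.9, 299.5, 6.5),
    (45.7, 511.6, 11.2),
    (45.3, 91.0, 2.0),
    (43.9, 251.3, 5.7),
    (37.6, 89.7, 2.4),
    (37.3, 324.2, 8.7),
    (36.8, 232.2, 6.3),
    (35.8, 524.1, 14.6),
]


def implied_ratio(accuracy, avg_len):
    """Average response length per accuracy point (assumed definition)."""
    return avg_len / accuracy


def max_deviation(rows):
    """Largest absolute gap between the implied ratio and the listed one."""
    return max(abs(implied_ratio(acc, length) - ratio)
               for acc, length, ratio in rows)


if __name__ == "__main__":
    for acc, length, ratio in ROWS:
        print(f"acc={acc:5.1f}  len={length:6.1f}  "
              f"listed={ratio:5.1f}  implied={implied_ratio(acc, length):6.3f}")
    # A gap below 0.05 is what one-decimal rounding of the ratio would allow.
    print("max deviation:", round(max_deviation(ROWS), 3))
```

Under this reading, a lower Ratio means fewer generated tokens per accuracy point. The inputs are themselves rounded, so the check can only show consistency with the assumed formula, not confirm it.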