
Multimodal Reasoning on MathVista (Accuracy, AvgLen, Ratio)

85.9 Accuracy

Qwen3-VL-32B-Thinking

[Chart: accuracy over time, Sep 30, 2025 to Apr 3, 2026]
Updated 4d ago

Evaluation Results

Date     Accuracy  AvgLen  Ratio
2026.01  85.9      -       -
2026.01  81.4      -       -
2026.01  80.4      -       -
2026.04  78.1      -       -
2026.04  76.2      -       -
2026.04  75.9      -       -
2026.01  75.8      -       -
2026.04  75.1      -       -
2026.03  75        -       -
2026.03  74.5      -       -
2026.03  74.2      -       -
2026.04  74        -       -
2026.04  73.8      -       -
2025.11  73.7      -       -
2026.03  73.2      -       -
2026.03  72.7      -       -
2025.11  72.6      -       -
2025.12  71.9      -       -
2025.11  71.9      -       -
2025.11  71.7      -       -
2025.11  71.6      -       -
2025.11  71.6      -       -
2026.03  71.5      -       -
2026.03  71.5      -       -
2026.03  71.4      -       -
2026.03  71.3      -       -
2025.12  71.2      -       -
2025.12  71.1      -       -
2026.01  70.9      -       -
2026.03  70.9      -       -
2026.03  70.5      -       -
2025.09  70.4      -       -
2025.09  70.4      -       -
2026.03  70.3      -       -
2025.11  70.1      -       -
2025.09  70.1      -       -
2025.12  70        -       -
2025.12  70        -       -
2025.11  70        -       -
2025.09  70        -       -
2026.01  69.4      -       -
2025.12  69.2      -       -
2025.09  69.1      -       -
2026.03  68.8      -       -
2025.12  68.5      -       -
2025.11  68.3      -       -
2025.09  68.2      -       -
2025.12  68        -       -
2026.03  67.5      -       -
2025.12  67.2      -       -
2025.09  66.8      -       -
2025.09  65.1      -       -
2025.09  64.7      -       -
2026.03  63.9      -       -
2025.09  63.7      -       -
2025.12  63        -       -
2025.12  62.9      -       -
2025.12  61.7      -       -
2025.12  59.7      -       -
2025.12  59.5      -       -
2025.11  58.6      -       -
2025.09  57.6      -       -
2025.09  55.1      -       -
2025.12  47.7      -       -
2026.02  45.9      299.5   6.5
2026.02  45.7      511.6   11.2
2026.02  45.3      91      2
2026.02  43.9      251.3   5.7
2026.02  37.6      89.7    2.4
2026.02  37.3      324.2   8.7
2026.02  36.8      232.2   6.3
2026.02  35.8      524.1   14.6