Multimodal mathematical reasoning on MathVista (test)

Accuracy: 74.2

Qwen-VL-Max

[Chart: accuracy over time, Jan 7, 2026 to Mar 28, 2026]
Updated 18d ago

Evaluation Results

Date      Accuracy
2026.01   74.2      -
2026.01   74.2      115.3
2026.01   73.8      120.7
2026.01   73.5      -
2026.03   72.1      55.54
2026.03   71.8      51.98
2026.03   71.21     53.06
2026.03   71.07     51.17
2026.01   71        185.6
2026.03   70.5      54.14
2026.03   70.43     50.53
2026.01   70.2      305.7
2026.03   69.42     51.4
2026.03   69.4      51.8
2026.03   69.37     52.94
2026.03   69.3      51
2026.03   69.2      50.79
2026.03   69.08     51.85
2026.03   68.42     50.29
2026.03   68.37     50.18
2026.01   68.2      189.1
2026.01   67.7      -
2026.03   67.68     49.37
2026.03   67.47     47.68
2026.03   66.87     47.82
2026.01   66.8      145.2
2026.01   66.2      158.7
2026.01   64.1      402.5
2026.01   63.8      -
2026.01   63.2      245
2026.01   62.3      212.9
2026.01   62.1      275
2026.01   61.9      95.9
2026.01   58.2      265.9