
General Visual Reasoning on MMStar (Accuracy)

77.5 Accuracy

Gemini 2.5 Pro

[Chart: accuracy over time, Sep 26, 2025 to Feb 9, 2026]
Updated 19d ago

Evaluation Results

Date      Accuracy   (unlabeled)
2026.02   77.5       -
2025.09   66.3       -
2026.02   65.27      -
2026.02   64.7       -
2026.01   64.7       -
2025.09   64.7       53.2
2025.09   64         52.52
2026.02   63.67      -
2026.02   63.53      -
2026.02   63.47      -
2026.01   63.2       -
2026.02   63.07      -
2026.02   63.07      -
2025.09   63         55.82
2026.02   62.73      -
2025.09   62.7       55.76
2026.01   62.67      -
2026.02   62.6       -
2025.09   62.6       54.2
2026.02   61.53      -
2025.09   61.3       53.72
2026.02   60.8       -
2026.01   60.33      -
2025.09   60.3       50.9
2026.02   60.27      -
2026.01   60.27      -
2026.01   59.7       -
2025.09   59.6       52.04
2025.09   59.3       50.84
2025.09   59.3       50.94
2026.01   59.13      -
2026.01   58.73      -
2026.02   58.33      -
2026.01   57.93      -
2026.02   57         -
2026.02   55.93      -
2025.09   55         48.8
2025.09   54.8       46.7
2025.09   54.3       50.53
2026.02   54.2       -
2026.02   53.73      -
2026.01   53.33      -
2025.09   53.1       45.74
2025.09   52.7       45.8
2025.09   51.4       45.7
2026.01   45.8       -