Multimodal Mathematical Reasoning on LogicVista

Accuracy: 75.2

Gemini-2.5-Pro-Thinking

[Chart: accuracy over time, Dec 19, 2025 to May 19, 2026]
Updated 14d ago

Evaluation Results

Date       Accuracy
2025.12    75.2
2025.12    71.8
2025.12    64.4
2025.12    63.8
2026.05    61.16
2026.05    56.92
2026.05    55.58
2026.05    54.98
2026.05    54.91
2025.12    52.3
2026.05    52.01
2025.12    50.9
2025.12    50.8
2025.12    50.2
2025.12    49.9
2026.03    49.7
2026.03    49.6
2026.03    49.4
2026.03    49.2
2026.03    49.1
2026.03    49
2026.03    48.8
2026.03    48.6
2025.12    48.5
2026.03    48.1
2026.03    47.9
2026.03    47.9
2026.03    47.8
2026.03    47.3
2026.03    47.2
2026.03    46.9
2026.03    46.9
2025.12    46.8
2026.03    46.5
2026.03    46.3
2026.03    46.3
2025.12    45
2025.12    44.5
2025.12    41.4
2025.12    40.7
2026.05    37.72
2026.05    37.5
2026.05    36.38
2026.05    32.81
2026.05    29.46
2026.05    28.79