
Mathematical Reasoning on WeMath

80.6 Accuracy

Gemini 2.5 Pro

[Chart: accuracy over time, Sep 30, 2025 to Apr 1, 2026]
Updated 16d ago

Evaluation Results

Date     Accuracy  (unlabeled)
2026.02  80.6      -
2025.09  78        -
2026.04  78        82.7
2026.03  75.51     -
2026.01  73.79     -
-        73.05     -
2026.03  72.6      -
2026.02  71.49     -
2026.03  71.3      -
2025.09  71.1      -
2026.04  71.1      78.4
2026.01  70.86     -
2026.02  70.57     -
2026.03  70.57     -
2026.03  70.06     -
2026.03  69.77     -
2026.02  69.4      -
2026.02  69.1      -
2026.02  69        -
2026.03  69        -
2026.02  68.9      -
2026.02  68.8      -
2026.03  68.7      -
2026.02  68.61     -
2026.02  68.22     -
2026.03  67.4      -
2026.01  67.01     -
2026.02  66.63     -
2026.02  66.61     -
2026.01  66.03     -
2026.02  65.57     -
2026.02  65.52     -
2026.02  65.34     -
2026.01  65        -
2026.02  64.98     -
2026.02  64.83     -
2026.03  64.77     -
2026.03  64.6      -
2026.02  64.48     -
2026.03  64.48     -
2026.02  63.97     -
2026.03  63.62     -
2026.02  62.9      -
2026.02  62.5      -
2026.02  62.1      -
2026.03  62.1      -
2026.02  62.01     -
2026.02  61.88     -
2026.02  61.8      -
2026.02  61.6      -
2026.01  61.21     -
2026.02  60.75     -
2025.10  60.7      -
2026.03  60.29     -
2026.03  60.1      -
2025.10  59.8      -
2026.02  59.32     -
2026.02  59.2      -
2026.03  58.7      -
2026.02  58.52     -
2026.02  58.3      -
2026.03  57.81     -
2026.02  57.59     -
2025.10  57        -
2026.02  56.8      -
2026.03  56.48     -
2026.02  55.63     -
2026.01  55.63     -
2025.10  55        -
2026.02  54.7      -
2026.02  54.66     -
2026.02  54.65     -
2026.03  54.57     -
2026.03  54.19     -
2026.01  53.97     -
2026.02  53.62     -
2026.02  53.5      -
2026.03  53.41     -
2026.03  53.05     -
2026.02  52.64     -
2026.03  52.29     -
2026.02  51.78     -
2026.02  51.7      -
2026.01  50.34     -
2026.03  50        -
2026.02  49.2      -
2026.03  49.14     -
2026.03  49.14     -
2026.03  48.86     -
2025.12  48.5      -
2026.03  48        -
2026.03  47.81     -
2026.01  47.4      -
2026.03  46.1      -
2026.03  46.1      -
2026.01  45.8      -
2026.03  45.43     -
2026.01  45.4      -
2026.02  44.89     -
2026.03  44.86     -
Showing 100 of 161 rows