
Mathematical Reasoning on HMMT25 (Accuracy, Token count (x10^7))

[Chart: Accuracy by date, Jan 28, 2026 to Feb 13, 2026. Top value shown: 86.7. Label: Majority.]
Updated 4d ago

Evaluation Results

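Date | Accuracy | other metrics (4 columns, labels not shown; "-" = no value)
2026.02 | 86.7 | - | - | - | -
2026.02 | 86.7 | - | - | - | -
2026.02 | 85.3 | - | 0.74 | - | -
2026.02 | 83.3 | - | - | - | -
2026.02 | 82.7 | 0.76 | - | - | -
2026.02 | 82.7 | - | - | - | -
2026.02 | 82.6 | 34.3 | - | - | -
2026.02 | 82.6 | - | - | - | -
2026.02 | 82 | - | - | - | -
2026.02 | 81.3 | 0.73 | - | - | -
2026.02 | 81.3 | - | - | - | -
2026.02 | 81.3 | - | - | - | -
2026.02 | 80.7 | - | - | - | -
2026.02 | 80.7 | - | - | - | -
2026.02 | 80.7 | - | - | - | -
2026.02 | 80.7 | - | 0.76 | - | -
2026.02 | 80 | - | - | 355.6 | -
2026.02 | 80 | - | - | 181.9 | -49
2026.02 | 78.7 | - | 0.77 | - | -
2026.02 | 78 | - | 1.76 | - | -
2026.02 | 78 | - | - | 177 | -50
2026.02 | 77.3 | - | - | 290.4 | -18
2026.02 | 76.7 | - | 1 | - | -
2026.02 | 74 | - | 0.84 | - | -
2026.02 | 73.3 | - | - | 355.6 | -
2026.02 | 72.6 | - | - | - | -
2026.02 | 72 | - | - | - | -
2026.02 | 71.3 | - | 1.78 | - | -
2026.02 | 70 | - | 1 | - | -
2026.02 | 69.4 | - | 1 | - | -
2026.02 | 69.3 | 44.9 | - | - | -
2026.02 | 69.3 | - | - | - | -
2026.02 | 68.7 | - | 1.03 | - | -
2026.02 | 66.7 | - | - | 641.4 | -50
2026.02 | 66 | 0.6 | - | - | -
2026.02 | 66 | - | - | - | -
2026.02 | 64.7 | 0.4 | - | - | -
2026.02 | 64.7 | - | - | - | -
2026.02 | 63.3 | 27.6 | - | - | -
2026.02 | 63.3 | - | - | - | -
2026.02 | 63.3 | - | - | 1,275.7 | -
2026.02 | 63.3 | - | - | 1,165.7 | -9
2026.02 | 63.3 | - | - | 641.4 | -50
2026.02 | 62.6 | 2.24 | - | - | -
2026.02 | 62.6 | - | - | - | -
2026.02 | 62.6 | - | - | 355.6 | -
2026.02 | 62 | - | - | - | -
2026.02 | 61.3 | - | - | - | -
2026.02 | 60.7 | - | - | - | -
2026.02 | 60 | - | - | - | -
2026.02 | 58 | - | - | - | -
2026.02 | 55.7 | - | - | 1,275.7 | -
2026.02 | 52.7 | - | - | 1,275.7 | -
2026.02 | 52 | - | - | - | -
2026.02 | 48.9 | 0.1267 | - | 34,800 | -
2026.02 | 48.9 | 0.1062 | - | 184,500 | -
2026.02 | 48.8 | 0.0565 | - | 565,100 | -
2026.02 | 47.1 | 0.0897 | - | 22,400 | -
2026.02 | 44.7 | 0.0872 | - | 21,500 | -
2026.02 | 43.6 | 0.1168 | - | 33,900 | -
2026.02 | 43.6 | 0.0487 | - | 487,300 | -
2026.02 | 43.6 | 0.099 | - | 174,000 | -
2026.02 | 42.7 | 0.0897 | - | 28,500 | -
2026.02 | 41.9 | 0.0863 | - | 27,100 | -
2026.02 | 24.2 | 0.1133 | - | 32,400 | -
2026.02 | 24.2 | 0.0586 | - | 586,300 | -
2026.02 | 24.2 | 0.1014 | - | 174,900 | -
2026.02 | 22.6 | 0.086 | - | 22,800 | -
2026.02 | 21.4 | 0.0889 | - | 26,900 | -
2026.02 | 18.5 | 0.0735 | - | 20,500 | -
2026.02 | 18.1 | 0.0938 | - | 31,000 | -
2026.02 | 18.1 | 0.0581 | - | 580,800 | -
2026.02 | 18.1 | 0.0924 | - | 179,500 | -
2026.02 | 17.4 | 0.0808 | - | 26,300 | -
2026.01 | 16.2 | - | - | - | -
2026.01 | 15 | - | - | - | -
2026.01 | 14.2 | - | - | - | -
2026.01 | 13.5 | - | - | - | -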