Reasoning on AIME24

81.5 Accuracy

Parallel-Probe

[Chart: AIME24 accuracy over time, Dec 4, 2025 to Feb 3, 2026]
Updated 4d ago

Evaluation Results

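| Method | Date | Accuracy | | | Links |
|---|---|---|---|---|---|
| | 2026.02 | 81.5 | 20,300 | 730,800 | |
| | 2026.02 | 80.4 | 30,100 | 910,800 | |
| | 2026.02 | 80.4 | 226,000 | 226,000 | |
| | 2026.02 | 80.4 | 84,700 | 459,400 | |
| | 2026.02 | 80 | 29,300 | 886,800 | |
| | 2026.02 | 80 | 214,200 | 214,200 | |
| | 2026.02 | 80 | 98,900 | 528,900 | |
| | 2026.02 | 80 | 24,800 | 782,200 | |
| | 2026.02 | 79.7 | 19,200 | 688,900 | |
| | 2026.02 | 76.7 | 25,600 | 773,400 | |
| | 2026.02 | 72.5 | 31,400 | 1,025,800 | |
| | 2026.02 | 72.5 | 170,400 | 909,200 | |
| | 2026.02 | 72.3 | 482,600 | 482,600 | |
| | 2026.02 | 68.1 | 20,500 | 748,500 | |
| | 2026.02 | 64.5 | 27,300 | 868,200 | |
| | 2025.12 | 60 | - | - | |
| | 2025.12 | 56.67 | - | - | |
| | 2026.02 | 21.8 | 20,800 | 773,800 | |
| | 2026.02 | 21.4 | 32,700 | 1,008,600 | |
| | 2026.02 | 21.4 | 805,500 | 805,500 | |
| | 2026.02 | 21.4 | 192,900 | 986,700 | |
| | 2026.02 | 19.5 | 26,800 | 820,700 | |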