
Mathematical Reasoning on BRUMO25

92.6 Accuracy

DeepConf

[Chart: results over time, Feb 9, 2026 to Feb 11, 2026]
Updated 4d ago

Evaluation Results

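| Date | Accuracy | (unlabeled) | (unlabeled) |
|---|---|---|---|
| 2026.02 | 92.6 | 21.7 | - |
| 2026.02 | 92.6 | 0.44 | - |
| 2026.02 | 92.6 | 21.7 | - |
| 2026.02 | 92.6 | - | - |
| 2026.02 | 92.6 | - | - |
| 2026.02 | 92.6 | - | - |
| 2026.02 | 92.6 | - | - |
| 2026.02 | 92.6 | - | - |
| 2026.02 | 92 | 35.6 | - |
| 2026.02 | 92 | 1.37 | - |
| 2026.02 | 92 | - | - |
| 2026.02 | 92 | - | - |
| 2026.02 | 92 | - | - |
| 2026.02 | 91.3 | - | - |
| 2026.02 | 90.7 | - | - |
| 2026.02 | 90.7 | - | - |
| 2026.02 | 90.7 | - | - |
| 2026.02 | 90.7 | - | - |
| 2026.02 | 90.7 | - | - |
| 2026.02 | 90.7 | - | - |
| 2026.02 | 90.7 | - | - |
| 2026.02 | 90.7 | - | - |
| 2026.02 | 90.6 | 0.28 | - |
| 2026.02 | 90.6 | - | - |
| 2026.02 | 86.7 | 0.4 | - |
| 2026.02 | 86.7 | 0.18 | - |
| 2026.02 | 86.7 | - | - |
| 2026.02 | 86.7 | - | - |
| 2026.02 | 86.7 | - | - |
| 2026.02 | 80.7 | - | - |
| 2026.02 | 80 | - | - |
| 2026.02 | 78.7 | - | - |
| 2026.02 | 43.33 | - | - |
| 2026.02 | 36.67 | - | - |
| 2026.02 | 33.33 | - | - |
| 2026.02 | 30 | - | - |
| 2026.02 | 28.67 | - | - |
| 2026.02 | - | - | 61.67 |
| 2026.02 | - | - | 67.08 |
| 2026.02 | - | - | 65 |
| 2026.02 | - | - | 69.58 |
| 2026.02 | - | - | 70 |
| 2026.02 | - | - | 72.5 |
| 2026.02 | - | - | 72.84 |
| 2026.02 | - | - | 73.64 |
| 2026.02 | - | - | 75 |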