Reasoning on Average (MATH500, AIME24, AIME25, GPQA_diamond)

58.71 Accuracy

InftyThink+

Feb 6, 2026
Updated 4d ago

Evaluation Results

Method  Links
2026.02  58.71  15.69265.17
2026.02  57.13  13.63534.57
2026.02  56.66  10.32154.62
2026.02  53.96  20.02100.21
2026.02  53.83  10.64320.73
2026.02  53.67  11.35186.3
2026.02  50.58  10.6648.37
2026.02  47.31  14.45149.44
2026.02  44.06  14.2477.57
2026.02  41.69  12.1110.96