Natural Language Reasoning on Big-GSM

Accuracy: 54.4

TCR

Jan 29, 2026
Updated 4d ago

Evaluation Results

Date      Accuracy
2026.01   54.4
2026.01   53.9
2026.01   52.7
2026.01   52.5