Mathematical Reasoning on MATH Levels 3, 4, 5 (test)
[Chart: Accuracy (Level 3); highlighted point: CPO, 96, Feb 2, 2026. Selectable metrics: Accuracy (Level 3), Accuracy (Level 4), Accuracy (Level 5), Overall Accuracy.]
Updated 4d ago
Evaluation Results

| Method        | Type                 | Date    | Accuracy (Level 3) | Accuracy (Level 4) | Accuracy (Level 5) | Overall Accuracy |
|---------------|----------------------|---------|--------------------|--------------------|--------------------|------------------|
| CPO           | Ours                 | 2026.02 | 96                 | 92                 | 82                 | 90               |
| Human         | Human Prompts        | 2026.02 | 95                 | 91                 | 79                 | 88.33            |
| PromptAgent   | Automated Prompting  | 2026.02 | 95                 | 93                 | 77                 | 88.33            |
| APE           | Automated Prompting  | 2026.02 | 94                 | 92                 | 82                 | 89.33            |
| OPRO          | Automated Prompting  | 2026.02 | 94                 | 92                 | 79                 | 88.33            |
| CoT (1-shot)  | Conventional Prom... | 2026.02 | 93                 | 92                 | 74                 | 86.33            |
| PromptBreeder | Automated Prompting  | 2026.02 | 92                 | 94                 | 80                 | 88.67            |
| TextGrad      | Automated Prompting  | 2026.02 | 92                 | 87                 | 79                 | 86               |
| Reflexion     | Conventional Prom... | 2026.02 | 90                 | 92                 | 77                 | 86.33            |
| DSPy          | Automated Prompting  | 2026.02 | 78                 | 75                 | 62                 | 71.67            |
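
The page does not say how Overall Accuracy is computed. The values listed above are consistent with an unweighted mean of the Level 3, Level 4, and Level 5 accuracies, and the sketch below checks that assumption against every row. The helper name `mean_of_levels` and the 0.005 rounding tolerance are illustrative choices, not part of the benchmark.

```python
# Scores copied from the results above: (Level 3, Level 4, Level 5, Overall).
results = {
    "CPO": (96, 92, 82, 90),
    "Human": (95, 91, 79, 88.33),
    "PromptAgent": (95, 93, 77, 88.33),
    "APE": (94, 92, 82, 89.33),
    "OPRO": (94, 92, 79, 88.33),
    "CoT (1-shot)": (93, 92, 74, 86.33),
    "PromptBreeder": (92, 94, 80, 88.67),
    "TextGrad": (92, 87, 79, 86),
    "Reflexion": (90, 92, 77, 86.33),
    "DSPy": (78, 75, 62, 71.67),
}


def mean_of_levels(level3, level4, level5):
    """Unweighted mean of the three per-level accuracies (assumed formula)."""
    return (level3 + level4 + level5) / 3


# Collect rows whose reported overall score differs from the assumed
# formula by more than two-decimal rounding can explain.
mismatches = {
    name: (overall, mean_of_levels(l3, l4, l5))
    for name, (l3, l4, l5, overall) in results.items()
    if abs(mean_of_levels(l3, l4, l5) - overall) >= 0.005
}

print(mismatches)  # prints {}
```

An empty result means every reported Overall Accuracy matches the unweighted mean to two decimal places. This does not rule out other formulas, for example a mean weighted by per-level problem counts if those counts happen to be equal.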