Mathematical Reasoning on MATH Levels 3, 4, 5 (test)
[Chart: Accuracy (Level 3). Highlighted point: CPO, 96 (Feb 2, 2026). Selectable metrics: Accuracy (Level 3), Accuracy (Level 4), Accuracy (Level 5), Overall Accuracy.]
Updated 1mo ago
Evaluation Results
Method         Type                  Date     Accuracy (Level 3)  Accuracy (Level 4)  Accuracy (Level 5)  Overall Accuracy
CPO            Ours                  2026.02  96                  92                  82                  90
Human          Human Prompts         2026.02  95                  91                  79                  88.33
PromptAgent    Automated Prompting   2026.02  95                  93                  77                  88.33
APE            Automated Prompting   2026.02  94                  92                  82                  89.33
OPRO           Automated Prompting   2026.02  94                  92                  79                  88.33
CoT (1-shot)   Conventional Prom...  2026.02  93                  92                  74                  86.33
PromptBreeder  Automated Prompting   2026.02  92                  94                  80                  88.67
TextGrad       Automated Prompting   2026.02  92                  87                  79                  86
Reflexion      Conventional Prom...  2026.02  90                  92                  77                  86.33
DSPy           Automated Prompting   2026.02  78                  75                  62                  71.67
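
The page does not say how Overall Accuracy is computed. The reported values are consistent with an unweighted mean of the Level 3, Level 4, and Level 5 accuracies, rounded to two decimals. The sketch below checks that assumption against the figures listed above. The names `LEVEL_SCORES`, `REPORTED_OVERALL`, `mean_accuracy`, and `mismatches` are illustrative and not part of the benchmark.

```python
# Per-level accuracies copied from the results above: (Level 3, Level 4, Level 5).
LEVEL_SCORES = {
    "CPO": (96, 92, 82),
    "Human": (95, 91, 79),
    "PromptAgent": (95, 93, 77),
    "APE": (94, 92, 82),
    "OPRO": (94, 92, 79),
    "CoT (1-shot)": (93, 92, 74),
    "PromptBreeder": (92, 94, 80),
    "TextGrad": (92, 87, 79),
    "Reflexion": (90, 92, 77),
    "DSPy": (78, 75, 62),
}

# Overall Accuracy values as reported on the page.
REPORTED_OVERALL = {
    "CPO": 90,
    "Human": 88.33,
    "PromptAgent": 88.33,
    "APE": 89.33,
    "OPRO": 88.33,
    "CoT (1-shot)": 86.33,
    "PromptBreeder": 88.67,
    "TextGrad": 86,
    "Reflexion": 86.33,
    "DSPy": 71.67,
}


def mean_accuracy(levels):
    """Unweighted mean of the per-level accuracies, rounded to two decimals.

    Assumption: each level counts equally; the page does not document this.
    """
    return round(sum(levels) / len(levels), 2)


def mismatches(level_scores, reported, tolerance=0.005):
    """Return the methods whose reported overall differs from the computed mean."""
    return [
        method
        for method, levels in level_scores.items()
        if abs(mean_accuracy(levels) - reported[method]) > tolerance
    ]


if __name__ == "__main__":
    for method, levels in LEVEL_SCORES.items():
        print(f"{method}: computed {mean_accuracy(levels)}, reported {REPORTED_OVERALL[method]}")
    print("Mismatches:", mismatches(LEVEL_SCORES, REPORTED_OVERALL))
```

If the assumption holds, the mismatch list is empty for all ten methods. An equal-weight mean also means the ranking by Overall Accuracy can differ from the ranking by any single level: PromptBreeder, for example, has the highest Level 4 accuracy but sits behind CPO and APE overall.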