Mathematical Reasoning on MATH (pass@1/pass@5)
[Chart: pass@1 and pass@5 on MATH over time. Highlighted point: ToT, 40.4 pass@1, Oct 4, 2025. Updated 3d ago.]
Evaluation Results
Method      Setting                     Date     pass@1  pass@5
ToT         Backbone=Llama-3.2-3B-...   2025.10  40.4    56.63
CAA         Backbone=Llama-3.2-3B-...   2025.10  38      60.06
RS          Backbone=Llama-3.2-3B-...   2025.10  37.62   44.78
STaR        Backbone=Llama-3.2-3B-...   2025.10  36.6    46.23
FA          Backbone=Llama-3.2-3B-...   2025.10  29.88   47.98
Base Model  Backbone=Llama-3.2-3B-...   2025.10  24      33.2
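
The page does not say how the pass@1 and pass@5 columns were computed. As a rough sketch only, one widely used choice is the unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them are correct. The function names and the sample counts below are hypothetical and are not taken from this leaderboard.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Assumption: the common unbiased estimator 1 - C(n-c, k) / C(n, k).
    The leaderboard itself does not state which estimator it uses.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: any draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(correct_counts: list[int], n: int, k: int) -> float:
    """Average pass@k over problems, reported on a 0-100 scale."""
    per_problem = [pass_at_k(n, c, k) for c in correct_counts]
    return 100.0 * sum(per_problem) / len(per_problem)


# Hypothetical example: 4 problems, 5 samples each, with these correct counts.
counts = [0, 1, 5, 2]
score_at_1 = benchmark_score(counts, n=5, k=1)
score_at_5 = benchmark_score(counts, n=5, k=5)
```

With k = 1 the estimator reduces to the fraction of correct samples per problem, so pass@5 can never be lower than pass@1 for the same set of samples.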