Math Reasoning on Math Reasoning 1.5B model (val)
Best result: 69.4 Validation Accuracy (Execution-Guided Search), Jan 20, 2026
[Chart: Validation Accuracy by date]
Updated 4d ago
Evaluation Results
Method | Settings | Date | Validation Accuracy
Execution-Guided Search | model_size=1.5B, searc... | 2026.01 | 69.4
Best Human Expert | model_size=1.5B | 2026.01 | 68.8
Baseline | model_size=1.5B, metho... | 2026.01 | 48