Mathematical Reasoning on MATH 500 (Pass@1, FLOPS)
[Chart: Pass@1 by date. Highlighted result: ϕ-Decoding, 38.2 Pass@1, Jan 21, 2026.]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Pass@1 | FLOPS |
|---|---|---|---|---|
| ϕ-Decoding | Backbone=LLaMA3.1-8B-I... | 2026.01 | 38.2 | - |
| MFS (Ours) | Backbone=LLaMA3.1-8B-I... | 2026.01 | 38.2 | - |
| MCTS | Backbone=LLaMA3.1-8B-I... | 2026.01 | 34.4 | - |
| Predictive Decoding | Backbone=LLaMA3.1-8B-I... | 2026.01 | 34 | - |
| Tree-of-Thoughts | Backbone=LLaMA3.1-8B-I... | 2026.01 | 31.6 | - |
| Guided Decoding | Backbone=LLaMA3.1-8B-I... | 2026.01 | 31.2 | - |
| Auto-Regressive (CoT) | Backbone=LLaMA3.1-8B-I... | 2026.01 | 31 | - |
| MFS (Ours) | Backbone=Mistral-v0.3-... | 2026.01 | 17.6 | - |
| ϕ-Decoding | Backbone=Mistral-v0.3-... | 2026.01 | 16.4 | - |
| Auto-Regressive (CoT) | Backbone=Mistral-v0.3-... | 2026.01 | 12.2 | - |
| Predictive Decoding | Backbone=Mistral-v0.3-... | 2026.01 | 11 | - |
| Tree-of-Thoughts | Backbone=Mistral-v0.3-... | 2026.01 | 10.8 | - |
| MCTS | Backbone=Mistral-v0.3-... | 2026.01 | 10.8 | - |
| Guided Decoding | Backbone=Mistral-v0.3-... | 2026.01 | 10.8 | - |