Mathematical Reasoning on AMC (Pass@16, Mean@16, Token Usage)
[Chart: Mean@16 on AMC. Highlighted point: OptPO-SFT, 39.7, Dec 2, 2025. Selectable metrics: Mean@16, Pass@16, Pass@1, Total Tokens (M), Token Saving.]
Updated 4d ago
Evaluation Results
| Method | Backbone | Date | Mean@16 | Pass@16 | Pass@1 | Total Tokens (M) | Token Saving |
|---|---|---|---|---|---|---|---|
| OptPO-SFT | Qwen2.5-Math-... | 2025.12 | 39.7 | 81.9 | 43.4 | - | 12.08 |
| TTSFT | Qwen2.5-Math-... | 2025.12 | 37.9 | 80.7 | 42.2 | - | - |
| TTSFT | Llama-3.1-8B-... | 2025.12 | 20.1 | 63.9 | 28.9 | - | - |
| OptPO-SFT | Llama-3.1-8B-... | 2025.12 | 18.7 | 60.2 | 19.3 | - | 17.16 |