Mathematical Reasoning on AIME 24 (Acc@8, Pass@8)
[Chart: Accuracy@8 and Pass Rate@8 over time, Mar 10, 2026 to Mar 11, 2026. Highlighted point: PACED Forward KL, Accuracy@8 = 41.6.]
Updated 1mo ago
Evaluation Results
| Method | Settings | Date | Accuracy@8 | Pass Rate@8 |
|---|---|---|---|---|
| PACED Forward KL | Distillation track=Qwe... | 2026.03 | 41.6 | - |
| AKL | Distillation track=Qwe... | 2026.03 | 39.8 | - |
| Hard Filter Forward KL | Distillation track=Qwe... | 2026.03 | 39.5 | - |
| Forward KL (unweighted) | Distillation track=Qwe... | 2026.03 | 35.9 | - |
| Base | Distillation track=Qwe... | 2026.03 | 28.7 | - |
| TTC-Net | Backbone=Llama-3-Instr... | 2026.03 | 3.33 | 20 |
| + RetNet | Backbone=Llama-3-Instr... | 2026.03 | 2.5 | 13.33 |
| Full Finetuning | Backbone=Llama-3-Instr... | 2026.03 | 1.67 | 6.67 |
| + MesaNet | Backbone=Llama-3-Instr... | 2026.03 | 1.25 | 10 |
| + Mamba | Backbone=Llama-3-Instr... | 2026.03 | 0.83 | 3.33 |
| + Attention | Backbone=Llama-3-Instr... | 2026.03 | 0.42 | 3.33 |
| + GDN | Backbone=Llama-3-Instr... | 2026.03 | 0.42 | 3.33 |
| Base model | Backbone=Llama-3-Instr... | 2026.03 | 0 | 0 |
| + MLP | Backbone=Llama-3-Instr... | 2026.03 | 0 | 0 |
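The page does not define its two metrics. A common reading, assumed here, is that Accuracy@8 is the percentage of correct answers over all 8 samples of every problem, and Pass Rate@8 is the percentage of problems with at least one correct sample among the 8. The Llama-3 rows fit that reading for a 30-problem set: 0.42 is about 1/240 and 3.33 is about 1/30. The sketch below uses hypothetical function names and made-up sample data, not the benchmark's actual evaluation code.

```python
def accuracy_at_k(results):
    """Percentage of correct samples over all problems and samples (assumed definition)."""
    total = sum(len(samples) for samples in results)
    correct = sum(sum(samples) for samples in results)
    return 100.0 * correct / total


def pass_rate_at_k(results):
    """Percentage of problems with at least one correct sample (assumed definition)."""
    solved = sum(any(samples) for samples in results)
    return 100.0 * solved / len(results)


# Hypothetical data: 30 problems, 8 samples each, one correct sample in total.
results = [[False] * 8 for _ in range(30)]
results[0][0] = True

print(round(accuracy_at_k(results), 2))   # 0.42
print(round(pass_rate_at_k(results), 2))  # 3.33
```

Under these assumed definitions, a single correct sample out of 240 gives the 0.42 / 3.33 pair reported for the "+ Attention" and "+ GDN" rows.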