Mathematical Reasoning on GSM8K (maj@k)
[Chart: Accuracy (maj@2), Accuracy (maj@4), and Accuracy (maj@8) over time. Highlighted point: DLE, Accuracy (maj@2) 33.21, Apr 22, 2026.]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | Accuracy (maj@2) | Accuracy (maj@4) | Accuracy (maj@8) |
|---|---|---|---|---|---|
| DLE | Model=Llama3.2-1B-Inst... | 2026.04 | 33.21 | 36.01 | 38.44 |
| DLE | Model=Llama3.2-1B-Inst... | 2026.04 | 33.21 | 36.01 | 38.06 |
| DLE | Model=Llama3.2-1B-Inst... | 2026.04 | 33.13 | 34.8 | 38.44 |
| DLE | Model=Llama3.2-1B-Inst... | 2026.04 | 33.06 | 36.01 | 38.06 |
| DLE | Model=Llama3.2-1B-Inst... | 2026.04 | 31.39 | 34.65 | 36.85 |
| Self-consistency | Model=Llama3.2-1B-Inst... | 2026.04 | 30.1 | 35.25 | 40.25 |
| Self-consistency | Model=Llama3.2-1B-Inst... | 2026.04 | 27.75 | 33.89 | 39.95 |
| Self-consistency | Model=Llama3.2-1B-Inst... | 2026.04 | 27.6 | 33.21 | 39.65 |
| Self-consistency | Model=Llama3.2-1B-Inst... | 2026.04 | 18.5 | 24.64 | 32.6 |
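For readers unfamiliar with the metric, maj@k is commonly computed by sampling k answers per problem, taking the most frequent one, and scoring it against the reference answer. The sketch below illustrates that usual definition. It is not the evaluation code behind this leaderboard: the function names, the toy data, and the tie-breaking rule (first-seen answer wins) are all assumptions made for illustration.

```python
from collections import Counter


def majority_answer(samples):
    """Return the most frequent answer in a non-empty list.

    Assumption: ties are broken in favor of the answer that appears first.
    Real evaluation harnesses may break ties differently.
    """
    counts = Counter(samples)
    best = max(counts.values())
    for answer in samples:
        if counts[answer] == best:
            return answer


def maj_at_k(sampled_answers, references, k):
    """Percentage of problems whose majority answer over the first k
    samples equals the reference answer."""
    correct = 0
    for samples, ref in zip(sampled_answers, references):
        if majority_answer(samples[:k]) == ref:
            correct += 1
    return 100.0 * correct / len(references)


# Hypothetical toy data: three problems, four sampled answers each.
sampled = [
    ["18", "18", "20", "18"],
    ["5", "7", "7", "7"],
    ["3", "4", "4", "4"],
]
refs = ["18", "7", "4"]

acc_at_2 = maj_at_k(sampled, refs, 2)
acc_at_4 = maj_at_k(sampled, refs, 4)
```

With only two samples, a disagreement is always a tie, so the tie-breaking rule has a large effect at k=2 and matters less as k grows.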