Mathematical Reasoning on GSM8K (Acc, Cost)
[Chart: Accuracy and Cost. Highlighted result: 93.21 Accuracy (CoT), Mar 14, 2026]
Updated 26d ago
Evaluation Results
Method  Model      Date     Accuracy  Cost
CoT     Gemma3     2026.03  93.21     850
CoT     Qwen3-8B   2026.03  89.09     799.7
CoT     Llama3.1   2026.03  87.52     817
ToT     Qwen3-8B   2026.03  3.69      3,175
ToT     Llama3.1   2026.03  3.36      3,033
DST     Qwen3-8B   2026.03  3.35      192.3
ToT     Gemma3     2026.03  2.94      3,650
DST     Llama3.1   2026.03  2.65      241
DST     Gemma3     2026.03  2.63      400
DPTS    Llama3.1   2026.03  1.72      2,506
DPTS    Qwen3-8B   2026.03  1.53      2,670
DPTS    Gemma3     2026.03  1.33      3,050