Mathematical Reasoning on GSM8K Zero (test) (ACC, Output Tokens)
[Chart: Accuracy by date. Best result: TALE-PT-SFT, 78.43 (Dec 24, 2024). Metrics: Accuracy, Avg Output Tokens.]
Evaluation Results
Method              | Setting                    | Date    | Accuracy | Avg Output Tokens
TALE-PT-SFT         | LLM=Llama-3.1-8B-Instruct  | 2024.12 | 78.43    | 77.85
TALE-PT-DPO         | LLM=Llama-3.1-8B-Instruct  | 2024.12 | 78.41    | 113.41
Directly Answering  | LLM=Llama-3.1-8B-Instruct  | 2024.12 | 70.32    | 13.49
Vanilla CoT         | LLM=Llama-3.1-8B-Instruct  | 2024.12 | 65.04    | 251.08
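
The trade-off between accuracy and output length in the results above can be made explicit with a little arithmetic. The sketch below is illustrative only: the values are copied from the Evaluation Results, while the choice of Vanilla CoT as the baseline and the names `results` and `compare_to_baseline` are assumptions made for this example, not part of the benchmark.

```python
# Values copied from the Evaluation Results (all with Llama-3.1-8B-Instruct).
# Each entry is (accuracy, average output tokens).
results = {
    "TALE-PT-SFT": (78.43, 77.85),
    "TALE-PT-DPO": (78.41, 113.41),
    "Directly Answering": (70.32, 13.49),
    "Vanilla CoT": (65.04, 251.08),
}


def compare_to_baseline(results, baseline="Vanilla CoT"):
    """Return accuracy gain and token reduction of each method vs. a baseline.

    The baseline choice is an assumption of this sketch; any row can be used.
    """
    base_acc, base_tokens = results[baseline]
    summary = {}
    for method, (acc, tokens) in results.items():
        summary[method] = {
            # Difference in accuracy, in the same units as the table.
            "accuracy_gain": round(acc - base_acc, 2),
            # Percentage of output tokens saved relative to the baseline.
            "token_reduction_pct": round(100 * (1 - tokens / base_tokens), 1),
        }
    return summary


summary = compare_to_baseline(results)
for method, row in summary.items():
    print(f"{method:20s} gain={row['accuracy_gain']:+.2f} "
          f"tokens saved={row['token_reduction_pct']:.1f}%")
```

Passing a different `baseline` (for example `"Directly Answering"`) shows the same comparison from the shortest-output method's point of view.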