Short-answer reasoning: 15 problems (Math, Logic, M-QA, Plan, BBH)
[Chart: Avg Tokens by method (Direct: 81), Mar 25, 2026. Selectable metrics: Avg Tokens, Avg Time (s), Cost ($).]
Evaluation Results
Method            Date     Avg Tokens  Avg Time (s)  Cost ($)
Direct            2026.03          81           1.8      0.01
CoT               2026.03         414           6.4      0.06
Self-Consistency  2026.03       1,236          26.9      0.17
EMoT              2026.03      12,136         183.4      1.64
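
The results listed above are easier to compare when each method is expressed as a multiple of the Direct baseline. The following is a minimal sketch of that arithmetic. The dictionary keys and the function name `relative_to_baseline` are hypothetical names chosen for illustration, and the page does not say whether Cost ($) is per problem or for the whole run, so the ratios should be read as relative overhead only.

```python
# Values copied from the Evaluation Results listing.
# Key names (avg_tokens, avg_time_s, cost_usd) are illustrative assumptions.
RESULTS = {
    "Direct": {"avg_tokens": 81, "avg_time_s": 1.8, "cost_usd": 0.01},
    "CoT": {"avg_tokens": 414, "avg_time_s": 6.4, "cost_usd": 0.06},
    "Self-Consistency": {"avg_tokens": 1236, "avg_time_s": 26.9, "cost_usd": 0.17},
    "EMoT": {"avg_tokens": 12136, "avg_time_s": 183.4, "cost_usd": 1.64},
}


def relative_to_baseline(results, baseline="Direct"):
    """Return each metric of each method divided by the baseline's value."""
    base = results[baseline]
    return {
        method: {metric: value / base[metric] for metric, value in metrics.items()}
        for method, metrics in results.items()
    }


ratios = relative_to_baseline(RESULTS)
for method, metrics in ratios.items():
    # One line per method, each metric shown as a multiple of Direct.
    print(
        f"{method:<17}"
        f"tokens x{metrics['avg_tokens']:.1f}  "
        f"time x{metrics['avg_time_s']:.1f}  "
        f"cost x{metrics['cost_usd']:.1f}"
    )
```

Because the page reports no accuracy figures, these ratios describe overhead only and say nothing about whether the extra tokens, time, or cost buy better answers.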