General Reasoning on GPQA (Acc, Cost)
[Chart: Accuracy and Cost over time. Top result: 48.13 Accuracy (CoT), Mar 14, 2026]
Updated 26d ago
Evaluation Results
Method  Model     Date     Accuracy  Cost
CoT     Gemma3    2026.03  48.13     4,926
CoT     Qwen3     2026.03  44.8      4,089
CoT     Llama3.1  2026.03  44.06     4,156
DST     Gemma3    2026.03  5.37      4,940
DST     Qwen3     2026.03  4.9       4,141
DST     Llama3.1  2026.03  4.48      3,994
ToT     Gemma3    2026.03  4.1       10,602
ToT     Qwen3     2026.03  3.76      8,875
ToT     Llama3.1  2026.03  3.75      8,678
DPTS    Gemma3    2026.03  3.44      8,336
DPTS    Qwen3     2026.03  3.27      7,001
DPTS    Llama3.1  2026.03  3.14      6,892