
General Reasoning on GPQA (Acc, Cost)

48.13 Accuracy

CoT

Mar 14, 2026
Updated 26d ago

Evaluation Results

Date      Accuracy   Cost
2026.03   48.13      4,926
2026.03   44.8       4,089
2026.03   44.06      4,156
2026.03   5.37       4,940
2026.03   4.9        4,141
2026.03   4.48       3,994
2026.03   4.1        10,602
2026.03   3.76       8,875
2026.03   3.75       8,678
2026.03   3.44       8,336
2026.03   3.27       7,001
2026.03   3.14       6,892