SuperGPQA

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Scientific Reasoning | SuperGPQA | Mean@1 | 50.1 | 34 |
| Knowledge-intensive reasoning | SuperGPQA | Overall Score | 63.4 | 31 |
| Scientific Reasoning | SuperGPQA (test) | Pass@1 | 77.7 | 25 |
| Reasoning | SuperGPQA | Length | 7,262 | 24 |
| Reasoning | superGPQA | Accuracy (superGPQA) | 39.3 | 24 |
| Code Reasoning | SuperGPQA Code SGPQA-1k | Accuracy | 47.4 | 24 |
| Math Reasoning | SuperGPQA SGPQA-1k Math | Accuracy | 46.5 | 24 |
| Multiple Choice Question Answering | SuperGPQA MCQA | Accuracy | 63.83 | 21 |
| General Knowledge | SuperGPQA | pass@1 | 48.2 | 19 |
| General Knowledge QA | superGPQA | Average Accuracy | 33.6 | 18 |
| Reasoning | SuperGPQA | Pass@1 | 38.4 | 17 |
| General Reasoning | SuperGPQA | Accuracy (General Reasoning) | 35.7 | 15 |
| General Reasoning | SuperGPQA | Avg@k | 28.46 | 15 |
| Question Answering | SuperGPQA (test) | Accuracy | 40.2 | 15 |
| Scientific Reasoning | SuperGPQA | Accuracy | 44.7 | 15 |
| Graduate-level Reasoning | SuperGPQA | Pass@1 | 36 | 14 |
| LLM Routing | SUPERGPQA (val) | Top-1 Acc | 0.776 | 14 |
| LLM Routing | SUPERGPQA | Top-1 Acc | 77.6 | 14 |
| Scientific Question Answering | SuperGPQA* | Accuracy | 62.4 | 12 |
| Science Question Answering | SuperGPQA | Avg@k | 28.46 | 12 |
| Medical Question Answering | SuperGPQA Clinical | Accuracy | 35.71 | 12 |
| Question Answering | SuperGPQA Law | Accuracy | 43.8 | 10 |
| Question Answering | SuperGPQA | Pass@1 | 34.51 | 8 |
| Science Question Answering | SuperGPQA | Score | 35.59 | 8 |
| Medical Question Answering | SuperGPQA | Accuracy | 27.66 | 8 |
Showing 25 of 37 rows