
STEM Reasoning on GPQA (Pass@1)

77.06 Pass@1 Accuracy

GPT-OSS-120B

[Chart: Pass@1 accuracy by date; x-axis label: Apr 10, 2026]
Updated 5d ago

Evaluation Results

Date       Pass@1
2026.04    77.06
2026.04    75.46
2026.04    75.09
2026.04    74.86
2026.04    73.31
2026.04    70.51
2026.04    65.38
2026.04    61.11