
Mathematical and Scientific Reasoning on Combined Suite (AIME, HMMT, GPQA, MMLU-Pro, MMLU-Redux 2.0)

89.5 Pass@1

GPT-OSS-120B

[Chart: y-axis ticks 79.1104, 81.8077, 84.505, 87.2023; x-axis label Apr 10, 2026]
Updated 5d ago

Evaluation Results

Method  Links
89.5    3,661.45  24.44
2026.04
89.42   4,299.45  20.8
89.13   4,137.27  21.54
2026.04
88.85   3,616.77  24.57
87.64   2,214.35  39.58
84.95   4,859.79  17.48
83.48   6,132.9   13.61
2026.04
79.51   5,222.71  15.22