
Scientific Multimodal Reasoning on ScienceOlympiad

41.38 Accuracy

GPT-5

[Chart: Accuracy by date; x-axis tick: Apr 23, 2026]
Updated 1mo ago

Evaluation Results

Date      Accuracy
2026.04   41.38
2026.04   36.47
2026.04   36.47
2026.04   33.5
2026.04   33
-         28.57
-         24.13
2026.04   22.35
2026.04   11.82
2026.04   1.97