Olympiad-level Science Problem Solving on OlympBench

Accuracy: 69.6

Gemini-2.5-Pro

[Chart: accuracy axis ticks 38.296, 46.423, 54.55, 62.677; date Aug 26, 2025]
Updated 5d ago

Evaluation Results

Method    Links
Date      Accuracy   (unlabeled)
2025.08   69.6       2.1
2025.08   67.5       -
2025.08   64.9       4.8
2025.08   60         -
2025.08   59.8       4.4
2025.08   58         4.5
2025.08   55.4       -
2025.08   53.5       -
2025.08   51.1       11.6
2025.08   49.6       9.2
2025.08   40.4       -
2025.08   39.5       -