
Biology Laboratory Reasoning on LabBench

74.4 Accuracy

GPT-5

[Chart: accuracy axis ticks 52.56, 58.23, 63.9, 69.57; date Aug 26, 2025]
Updated 5d ago

Evaluation Results

Date      Accuracy   (unlabeled)
2025.08   74.4       7.8
2025.08   74.2       3.7
2025.08   70.5       -
2025.08   66.6       -
2025.08   64.4       2.5
2025.08   63.7       4
2025.08   61.9       -
2025.08   59.7       -
2025.08   59.2       2.3
2025.08   57.2       3.8
2025.08   56.9       -
2025.08   53.4       -