Biology Laboratory Reasoning on LabBench
[Chart: Accuracy and Accuracy Delta. Top result: GPT-5, 74.4 accuracy, Aug 26, 2025.]
Updated 5d ago
Evaluation Results
| Method | Configuration | Date | Accuracy | Accuracy Delta |
|---|---|---|---|---|
| GPT-5 | Reasoning Effort=High | 2025.08 | 74.4 | 7.8 |
| o3 | Reasoning Effort=High | 2025.08 | 74.2 | 3.7 |
| o3 | Reasoning Effort=Low | 2025.08 | 70.5 | - |
| GPT-5 | Reasoning Effort=Low | 2025.08 | 66.6 | - |
| Gemini-2.5-Pro | Reasoning Effort=High | 2025.08 | 64.4 | 2.5 |
| o4-mini | Reasoning Effort=High | 2025.08 | 63.7 | 4.0 |
| Gemini-2.5-Pro | Reasoning Effort=Low | 2025.08 | 61.9 | - |
| o4-mini | Reasoning Effort=Low | 2025.08 | 59.7 | - |
| o3-mini | Reasoning Effort=High | 2025.08 | 59.2 | 2.3 |
| Claude-Sonnet-4 | Reasoning Effort=High | 2025.08 | 57.2 | 3.8 |
| o3-mini | Reasoning Effort=Low | 2025.08 | 56.9 | - |
| Claude-Sonnet-4 | Reasoning Effort=Low | 2025.08 | 53.4 | - |
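The page does not define Accuracy Delta. The reported values are consistent with the difference between a model's High and Low reasoning-effort accuracy, so the sketch below recomputes the column under that assumption. The `scores` dictionary and the `accuracy_delta` helper are illustrative names, not part of the benchmark.

```python
# Accuracy values (percent) copied from the results listed above.
scores = {
    "GPT-5": {"High": 74.4, "Low": 66.6},
    "o3": {"High": 74.2, "Low": 70.5},
    "Gemini-2.5-Pro": {"High": 64.4, "Low": 61.9},
    "o4-mini": {"High": 63.7, "Low": 59.7},
    "o3-mini": {"High": 59.2, "Low": 56.9},
    "Claude-Sonnet-4": {"High": 57.2, "Low": 53.4},
}


def accuracy_delta(high: float, low: float) -> float:
    """Assumed definition: High-effort accuracy minus Low-effort accuracy."""
    # Round to one decimal place to match the precision shown on the page.
    return round(high - low, 1)


deltas = {
    model: accuracy_delta(s["High"], s["Low"]) for model, s in scores.items()
}

for model, delta in deltas.items():
    print(f"{model}: {delta:+.1f}")
```

Under this reading, GPT-5 gains the most from higher reasoning effort and o3-mini the least, which matches the Accuracy Delta values reported for the High-effort rows.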