Physics Problem Solving on UGPhysics
[Chart: accuracy over time. Current best: GPT-5, 67.6 accuracy, Aug 26, 2025. Updated 5d ago.]
Evaluation Results

| Method | Setting | Date | Accuracy | Delta |
|---|---|---|---|---|
| GPT-5 | Reasoning Effort=High | 2025.08 | 67.6 | 4 |
| o3 | Reasoning Effort=High | 2025.08 | 65.2 | 2.1 |
| GPT-5 | Reasoning Effort=Low | 2025.08 | 63.6 | - |
| o3 | Reasoning Effort=Low | 2025.08 | 63.1 | - |
| o4-mini | Reasoning Effort=High | 2025.08 | 62.2 | 4.5 |
| o3-mini | Reasoning Effort=High | 2025.08 | 60.7 | 4 |
| o4-mini | Reasoning Effort=Low | 2025.08 | 57.7 | - |
| o3-mini | Reasoning Effort=Low | 2025.08 | 56.7 | - |
| Gemini-2.5-Pro | Reasoning Effort=Low | 2025.08 | 56 | - |
| Gemini-2.5-Pro | Reasoning Effort=High | 2025.08 | 55.4 | -0.6 |
| Claude-Sonnet-4 | Reasoning Effort=High | 2025.08 | 53.2 | 0.8 |
| Claude-Sonnet-4 | Reasoning Effort=Low | 2025.08 | 52.4 | - |
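
The page does not define Delta. The listed values are consistent with the High-effort accuracy minus the Low-effort accuracy for the same model, so the sketch below recomputes them under that assumption. The `accuracy` dictionary and the variable names are illustrative; only the numbers come from the results above.

```python
# Accuracy values copied from the results above, keyed by model and reasoning effort.
accuracy = {
    "GPT-5": {"High": 67.6, "Low": 63.6},
    "o3": {"High": 65.2, "Low": 63.1},
    "o4-mini": {"High": 62.2, "Low": 57.7},
    "o3-mini": {"High": 60.7, "Low": 56.7},
    "Gemini-2.5-Pro": {"High": 55.4, "Low": 56.0},
    "Claude-Sonnet-4": {"High": 53.2, "Low": 52.4},
}

# Assumption (not stated on the page): Delta = High-effort accuracy - Low-effort accuracy.
# Rounding to one decimal removes floating-point noise from the subtraction.
deltas = {
    model: round(scores["High"] - scores["Low"], 1)
    for model, scores in accuracy.items()
}

for model, delta in deltas.items():
    print(f"{model}: {delta:+.1f}")
```

Under this reading, a negative Delta (as for Gemini-2.5-Pro) means the Low-effort run scored higher than the High-effort run.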