Geometric Problem Solving on GeoTrust Tier1 (test)
[Chart: Count and Accuracy by date. Highlighted point: OpenAI-o3, Count 37, Apr 22, 2025.]
Evaluation Results
Method             #Params               Date     Count  Accuracy
OpenAI-o3          -                     2025.04     37     61.67
Intern-S1          235B+6B               2025.04     35     58.33
Gemini-2.5-pro     -                     2025.04     34     56.67
Claude-3.7-sonnet  -                     2025.04     33     55.00
Qwen2.5-VL-72B     72B                   2025.04     32     53.33
GPT-4o             -                     2025.04     31     51.67
DeepSeek-R1        671B, input_fo...     2025.04     30     50.00
Qwen2-VL-7B        7B                    2025.04      5      8.33
LLaVA-1.5-7B       7B                    2025.04      3      5.00