Mathematical Reasoning on Geometry3k

79.9 Accuracy

GPT-5-Thinking*

[Chart: accuracy over time, Sep 30, 2025 to Apr 11, 2026]
Updated 5d ago

Evaluation Results

Date      Accuracy  (unlabeled)
2025.09   79.9      -
2026.04   79.9      84.3
2025.09   77.2      -
2026.04   77.2      80.1
2026.04   69.4      -
2026.04   68.9      -
2026.04   68.7      -
2026.04   67.7      -
2026.04   67        -
2026.04   60.6      -
2026.04   59.2      -
2026.04   57.9      -
2026.04   52.1      59.1
2025.09   51.3      -
2025.09   50.6      -
2026.04   50.6      55.2
2025.09   49        -
2026.04   49        54.1
2025.09   46.1      -
2026.04   46.1      50.4
2026.04   45.1      52.9
2025.09   44.8      -
2026.04   44.8      48
2025.09   44.2      -
2025.09   38.5      -
2026.04   38.5      54.4