Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

About

Automated theorem proving in Euclidean geometry, particularly for International Mathematical Olympiad (IMO) level problems, remains a major challenge and an important research focus in Artificial Intelligence. In this paper, we present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal level human performance on IMO. Building on this, we propose HAGeo, a Heuristic-based method for adding Auxiliary constructions in Geometric deduction that solves 28 of 30 problems on the IMO-30 benchmark, achieving gold-medal level performance and surpassing AlphaGeometry, a competitive neural network-based approach, by a notable margin. To evaluate our method and existing approaches more comprehensively, we further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels. Compared with the widely used IMO-30, our benchmark poses greater challenges and provides a more precise evaluation, setting a higher bar for geometry theorem proving.

Boyan Duan, Xiao Liang, Shuai Lu, Yaoxiang Wang, Yelong Shen, Kai-Wei Chang, Ying Nian Wu, Mao Yang, Weizhu Chen, Yeyun Gong• 2025

Related benchmarks

TaskDatasetResultRank
Euclidean geometry problem solvingIMO-30
Solved Problems28
19
Euclidean geometry problem solvingHAGeo-409
Solved Problems287
16
Geometry Theorem ProvingHAGeo-409 (Level 1-3)
Solved Count149
5
Geometry Theorem ProvingHAGeo-409 (Level 3-4)
Solved Count93
5
Geometry Theorem ProvingHAGeo-409 Level 4-5
Solved Count36
5
Geometry Theorem ProvingHAGeo-409 Level 5-6
Solved Count7
5
Geometry Theorem ProvingHAGeo-409 Level 6-7
Solved Count2
5
Geometry Theorem ProvingHAGeo-409 (Total)
Solved Count287
5
Geometry Theorem ProvingIMO-30 (test)
Solved Problems28
4
Showing 9 of 9 rows

Other info

GitHub

Follow for update