
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression

About

Geometry problem solving is a well-recognized testbed for evaluating the high-level multi-modal reasoning capability of deep models. In most existing works, the two main types of geometry problems, calculation and proving, are treated as two separate tasks, which prevents a deep model from unifying its reasoning capability across multiple math tasks. In essence, however, these two tasks have similar problem representations and overlapping math knowledge, which can improve the understanding and reasoning ability of a deep model on both tasks. Therefore, we construct a large-scale Unified Geometry problem benchmark, UniGeo, which contains 4,998 calculation problems and 9,543 proving problems. Each proving problem is annotated with a multi-step proof with reasons and mathematical expressions. The proof can be easily reformulated as a proving sequence that shares the same format as the annotated program sequence for calculation problems. We also present a unified multi-task Geometric Transformer framework, Geoformer, to tackle calculation and proving problems simultaneously in the form of sequence generation, which shows that reasoning ability on both tasks can be improved by a unified formulation. Furthermore, we propose a Mathematical Expression Pretraining (MEP) method that aims to predict the mathematical expressions in the problem solution, thus improving the Geoformer model. Experiments on UniGeo demonstrate that our proposed Geoformer obtains state-of-the-art performance, outperforming the task-specific model NGS by over 5.6% and 3.2% in accuracy on calculation and proving problems, respectively.
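As a rough illustration of the unified formulation described above, the sketch below flattens a multi-step proof (a reason plus a mathematical expression per step) into a single token sequence, so that it has the same shape as a program sequence for a calculation problem. Every name here is an assumption made for illustration: `ProofStep`, `proof_to_sequence`, `with_task_tag`, the task tags, and all token strings are hypothetical and are not taken from the UniGeo annotations or the Geoformer code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProofStep:
    """One annotated proof step (hypothetical structure)."""
    reason: str            # name of the theorem or definition justifying the step
    expression: List[str]  # tokens of the mathematical expression the step derives


def proof_to_sequence(steps: List[ProofStep]) -> List[str]:
    """Flatten a proof into one token sequence: reason, then expression tokens, per step."""
    tokens: List[str] = []
    for step in steps:
        tokens.append(step.reason)
        tokens.extend(step.expression)
    return tokens


def with_task_tag(task: str, target: List[str]) -> List[str]:
    """Prefix a target sequence with a task tag so one decoder can serve both tasks."""
    return [f"<{task}>"] + target


# Hypothetical calculation program: an operator followed by its operands.
calc_program = ["minus", "const_180", "num_0"]

# Hypothetical two-step proof.
proof = [
    ProofStep("vertical_angles", ["angle_AOB", "=", "angle_COD"]),
    ProofStep("SAS", ["triangle_AOB", "cong", "triangle_COD"]),
]

calc_target = with_task_tag("calculation", calc_program)
proof_target = with_task_tag("proving", proof_to_sequence(proof))
```

Under these assumptions both targets are plain token lists, so a single sequence-generation model could be trained on either task without task-specific output heads.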

Jiaqi Chen, Tong Li, Jinghui Qin, Pan Lu, Liang Lin, Chongyu Chen, Xiaodan Liang • 2022

Related benchmarks

Task                      Dataset                   Metric            Result  Rank
Geometry Problem Solving  Geometry3K (test)         Choice Accuracy   59.3    32
Geometry Calculation      UniGeo 1.0 (Calculation)  Overall Accuracy  62.5    22
Geometry Problem Solving  PGPS9K (test)             Completion        35.6    18
Geometry Proving          UniGeo 1.0 (Proving)      Overall Score     56.4    15
Geometry Problem Solving  UniGeo CAL (test)         Accuracy          62.5    6
Geometry Problem Solving  UniGeo Prv (test)         Accuracy          56.4    6
Geometric Proving         UniGeo                    Top-1 Acc         51.3    4
