Theorem-Validated Reverse Chain-of-Thought Problem Generation for Geometric Reasoning

About

Large Multimodal Models (LMMs) face limitations in geometric reasoning due to insufficient Chain-of-Thought (CoT) image-text training data. While existing approaches leverage template-based or LLM-assisted methods for geometric CoT data creation, they often struggle to achieve both diversity and precision. To bridge this gap, we introduce a two-stage Theorem-Validated Reverse Chain-of-Thought Reasoning Synthesis (TR-CoT) framework. The first stage, TR-Engine, synthesizes theorem-grounded geometric diagrams with structured descriptions and properties. The second stage, TR-Reasoner, employs reverse reasoning to iteratively refine question-answer pairs by cross-validating geometric properties and description fragments. Our approach expands theorem-type coverage, corrects long-standing misunderstandings, and enhances geometric reasoning. Fine-grained CoT improves theorem understanding and increases logical consistency by 24.5%. Our best models surpass the baselines on MathVista and GeoQA by 10.1% and 4.7%, respectively, outperforming advanced closed-source models like GPT-4o.
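The two-stage idea can be illustrated with a toy sketch. This is not the authors' implementation: the names `Diagram`, `tr_engine_sketch`, and `tr_reasoner_sketch` are hypothetical, the only theorem used is the Pythagorean theorem, and no image is rendered. It assumes stage one emits description fragments plus theorem-derived properties, and stage two starts from a known property (the answer), works back to a question, and keeps the pair only if the answer can be recomputed from the given side lengths.

```python
import math
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Diagram:
    description: list   # structured description fragments (strings)
    properties: dict    # theorem-derived facts about the figure


def tr_engine_sketch(rng: random.Random) -> Diagram:
    """Toy stand-in for stage one: a right triangle built from a theorem."""
    a, b = rng.choice([(3, 4), (5, 12), (8, 15)])
    c = math.hypot(a, b)  # Pythagorean theorem gives the hypotenuse
    return Diagram(
        description=[
            "Triangle ABC has a right angle at C.",
            f"AC = {a}.",
            f"BC = {b}.",
        ],
        properties={"AC": a, "BC": b, "AB": c, "area": a * b / 2},
    )


def tr_reasoner_sketch(d: Diagram, target: str) -> Optional[dict]:
    """Toy stand-in for stage two: go from a known answer back to a question."""
    answer = d.properties[target]
    a, b = d.properties["AC"], d.properties["BC"]
    # Recompute the target from the given sides; targets that are
    # themselves givens (AC, BC) are rejected as trivial questions.
    recomputed = {"AB": math.hypot(a, b), "area": a * b / 2}.get(target)
    if recomputed is None or not math.isclose(recomputed, answer):
        return None
    question = " ".join(d.description) + f" Find {target}."
    return {"question": question, "answer": answer}


if __name__ == "__main__":
    diagram = tr_engine_sketch(random.Random(0))
    print(tr_reasoner_sketch(diagram, "AB"))
```

Because the answer is fixed before the question is written, every retained pair is consistent with the diagram's properties by construction, which is the intuition behind reverse generation in this sketch.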

Linger Deng, Linghao Zhu, Yuliang Liu, Yu Wang, Qunyi Xie, Jingjing Wu, Gang Zhang, Yingying Zhu, Xiang Bai • 2024

Related benchmarks

Task	Dataset	Result
Multimodal Reasoning	WeMath	Accuracy 57.59	199
Multimodal Reasoning	MathVision	--	162
Multimodal Reasoning	MathVerse	--	138
Multimodal Reasoning	MathVista	Pass@1 62.6	36
Geometric Question Answering	GeoQA (test)	Total Accuracy 79.2	34
Math Reasoning	MathVista GPS (test)	Accuracy 73.1	31
Multimodal Mathematical Reasoning	MathVista 14 (1000)	Macro Score 61.8	22
Geometric Question Answering	GeoQA	Success Rate 79.2	14
Multimodal Math Question Answering	MathVista GPS	Accuracy 73.1	13
Multimodal Reasoning	GeoQA	Mean@1 46.49	11

