SpatialMath: Spatial Comprehension-Infused Symbolic Reasoning for Mathematical Problem-Solving

About

Multimodal Small-to-Medium-Sized Language Models (MSLMs) have demonstrated strong capabilities in integrating visual and textual information but still face significant limitations in visual comprehension and mathematical reasoning, particularly in geometric problems with diverse levels of visual infusion. Current models struggle to accurately decompose intricate visual inputs and connect perception with structured reasoning, leading to suboptimal performance. To address these challenges, we propose SpatialMath, a novel Spatial Comprehension-Infused Symbolic Reasoning Framework designed to integrate spatial representations into structured symbolic reasoning chains. SpatialMath employs a specialized perception module to extract spatially grounded representations from visual diagrams, capturing critical geometric structures and spatial relationships. These representations are then methodically infused into symbolic reasoning chains, facilitating visual comprehension-aware structured reasoning. To this end, we introduce MATHVERSE-PLUS, a novel dataset containing structured visual interpretations and step-by-step reasoning paths for vision-intensive mathematical problems. SpatialMath significantly outperforms strong multimodal baselines, improving by up to 10 percentage points over supervised fine-tuning with data augmentation in vision-intensive settings. Robustness analysis reveals that enhanced spatial representations directly improve reasoning accuracy, reinforcing the need for structured perception-to-reasoning pipelines in MSLMs.

Ashutosh Bajpai, Akshat Bhandari, Akshay Nambi, Tanmoy Chakraborty • 2026

Related benchmarks

Task                    Dataset                                      Metric          Result  Rank
Mathematical Reasoning  MathVista                                    Accuracy        6.16    97
Mathematical Reasoning  MathVision                                   Accuracy        19.08   38
Mathematical Reasoning  MathVerse-Plus All 1.0 (test)                Accuracy        43.6    12
Mathematical Reasoning  MathVerse-Plus Text Dominant 1.0 (test)      Accuracy        53      12
Mathematical Reasoning  MathVerse-Plus Text Lite 1.0 (test)          Accuracy        48      12
Mathematical Reasoning  MathVerse-Plus Vision Intensive 1.0 (test)   Accuracy        45      12
Mathematical Reasoning  MathVerse-Plus Vision Dominant 1.0 (test)    Accuracy        41      12
Mathematical Reasoning  MathVerse-Plus Vision Only 1.0 (test)        Accuracy        31      12
Geometry reasoning      Geometry3K                                   Geometry Score  14.81   7
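
The MathVerse-Plus rows above can be cross-checked with a few lines of arithmetic. The sketch below copies the five subset accuracies from the table and compares their unweighted mean with the reported "All" figure. The page does not state how "All" is aggregated, so treating it as an unweighted mean is an assumption; the variable names are illustrative only.

```python
# Accuracy values copied from the MathVerse-Plus 1.0 (test) rows above.
subset_accuracy = {
    "Text Dominant": 53.0,
    "Text Lite": 48.0,
    "Vision Intensive": 45.0,
    "Vision Dominant": 41.0,
    "Vision Only": 31.0,
}
reported_all = 43.6  # the "MathVerse-Plus All" row

# Assumption: "All" is an unweighted mean over the five subsets.
macro_average = sum(subset_accuracy.values()) / len(subset_accuracy)

# Gap between the most text-heavy and the most vision-heavy setting,
# in percentage points.
text_to_vision_gap = subset_accuracy["Text Dominant"] - subset_accuracy["Vision Only"]

print(f"unweighted mean of subsets: {macro_average:.1f}")
print(f"reported All accuracy:      {reported_all:.1f}")
print(f"Text Dominant minus Vision Only: {text_to_vision_gap:.0f} points")
```

The unweighted mean agrees with the reported "All" accuracy, which is consistent with equal weighting of the subsets but does not prove it. The gap between Text Dominant and Vision Only shows how much accuracy falls as more of the problem moves into the diagram.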
