Figure It Out: Improve the Frontier of Reasoning with Executable Visual States

About

Complex reasoning problems often involve implicit spatial and geometric relationships that are not explicitly encoded in text. While recent reasoning models perform well across many domains, purely text-based reasoning struggles to capture structural constraints in complex settings. In this paper, we introduce FIGR, which integrates executable visual construction into multi-turn reasoning via end-to-end reinforcement learning. Rather than relying solely on textual chains of thought, FIGR externalizes intermediate hypotheses by generating executable code that constructs diagrams within the reasoning loop. An adaptive reward mechanism selectively regulates when visual construction is invoked, enabling more consistent reasoning over latent global properties that are difficult to infer from text alone. Experiments on eight challenging mathematical benchmarks demonstrate that FIGR outperforms strong text-only chain-of-thought baselines, improving the base model by 13.12% on AIME 2025 and 11.00% on BeyondAIME. These results highlight the effectiveness of FIGR's precise, controllable figure construction in enhancing complex reasoning ability.
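To make "executable code that constructs diagrams" concrete, here is a minimal sketch of the kind of snippet a model might emit during a reasoning turn. It is an illustration only, not FIGR's actual code or tooling: the function names `circumcircle` and `render_svg`, the choice of SVG as the output format, and the example triangle are all assumptions made for this sketch. The point it illustrates is that a diagram built from exact coordinates exposes global properties (here, that the circumcenter of a right triangle lies on the hypotenuse) that are easy to miss in text.

```python
import math


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius) of the circle through three points."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("points are collinear; no circumcircle exists")
    sa, sb, sc = ax**2 + ay**2, bx**2 + by**2, cx**2 + cy**2
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return ux, uy, math.hypot(ax - ux, ay - uy)


def render_svg(points, circle, scale=40, margin=20):
    """Render a polygon and a circle as an SVG string (hypothetical helper)."""
    cx, cy, r = circle
    # Bounding box that covers both the polygon and the circle.
    xs = [p[0] for p in points] + [cx - r, cx + r]
    ys = [p[1] for p in points] + [cy - r, cy + r]
    min_x, max_y = min(xs), max(ys)
    width = (max(xs) - min_x) * scale + 2 * margin
    height = (max_y - min(ys)) * scale + 2 * margin

    def to_px(x, y):
        # SVG's y axis points down, so flip it.
        return (x - min_x) * scale + margin, (max_y - y) * scale + margin

    poly = " ".join(f"{px:.1f},{py:.1f}" for px, py in (to_px(*p) for p in points))
    ccx, ccy = to_px(cx, cy)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width:.0f}" height="{height:.0f}">'
        f'<polygon points="{poly}" fill="none" stroke="black"/>'
        f'<circle cx="{ccx:.1f}" cy="{ccy:.1f}" r="{r * scale:.1f}" fill="none" stroke="blue"/>'
        "</svg>"
    )


# Example: a 3-4-5 right triangle (assumed for illustration).
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
circle = circumcircle(*triangle)
svg = render_svg(triangle, circle)
```

In a multi-turn setting, the rendered figure (or values read off it, such as `circle`) would be fed back to the model as an observation for the next reasoning step; how FIGR does this in practice is not specified in the abstract.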

Meiqi Chen, Fandong Meng, Jie Zhou • 2025

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Mathematical Reasoning	AIME 2024	Accuracy	79.58	370
Mathematical Reasoning	Minerva Math	Accuracy	44.49	228
Mathematical Reasoning	AIME 2025	Accuracy	79.32	227
Mathematical Reasoning	AMC	Accuracy	93.98	221
Mathematical Reasoning	AMC	Pass@1	93.98	112
Mathematical Reasoning	Minerva Math	pass@1 Accuracy	44.49	104
Mathematical Reasoning	AIME 2025	Pass@1	79.32	96
Mathematical Reasoning	AIME 2024	Pass@1	79.58	86
Mathematical Reasoning	Beyond AIME	Accuracy	54	45
Mathematical Reasoning	OlympBench	Pass@1	72.4	29

Showing 10 of 13 rows
