
Figure It Out: Improve the Frontier of Reasoning with Executable Visual States

About

Complex reasoning problems often involve implicit spatial and geometric relationships that are not explicitly encoded in text. While recent reasoning models perform well across many domains, purely text-based reasoning struggles to capture structural constraints in complex settings. In this paper, we introduce FIGR, which integrates executable visual construction into multi-turn reasoning via end-to-end reinforcement learning. Rather than relying solely on textual chains of thought, FIGR externalizes intermediate hypotheses by generating executable code that constructs diagrams within the reasoning loop. An adaptive reward mechanism selectively regulates when visual construction is invoked, enabling more consistent reasoning over latent global properties that are difficult to infer from text alone. Experiments on eight challenging mathematical benchmarks demonstrate that FIGR outperforms strong text-only chain-of-thought baselines, improving the base model by 13.12% on AIME 2025 and 11.00% on BeyondAIME. These results highlight the effectiveness of FIGR's precise, controllable figure construction in enhancing complex reasoning ability.
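To make "executable code that constructs diagrams" concrete, here is a minimal sketch of the kind of snippet a model could emit mid-reasoning for a geometry problem. It is not taken from the paper: the function names `circumcircle` and `render_svg`, the choice of SVG as the output format, and the triangle example are all assumptions made for illustration. The point is only that the figure is derived from exact coordinates rather than described loosely in text.

```python
import math


def circumcircle(a, b, c):
    """Return (center, radius) of the circle through points a, b, c.

    Hypothetical helper for illustration; not part of FIGR.
    """
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; no circumcircle exists")
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)


def render_svg(a, b, c, scale=40, size=400):
    """Draw the triangle and its circumcircle as an SVG string."""
    (ux, uy), r = circumcircle(a, b, c)

    def px(p):
        # Map math coordinates to SVG pixels (the SVG y axis points down).
        return (size / 2 + p[0] * scale, size / 2 - p[1] * scale)

    pts = " ".join(f"{x:.1f},{y:.1f}" for x, y in map(px, (a, b, c)))
    ox, oy = px((ux, uy))
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<polygon points="{pts}" fill="none" stroke="black"/>'
        f'<circle cx="{ox:.1f}" cy="{oy:.1f}" r="{r * scale:.1f}" '
        'fill="none" stroke="red"/>'
        "</svg>"
    )


if __name__ == "__main__":
    # A 3-4-5 right triangle: the circumcenter is the hypotenuse midpoint.
    print(circumcircle((0, 0), (4, 0), (0, 3)))
    print(render_svg((0, 0), (4, 0), (0, 3)))
```

In a multi-turn setup like the one the abstract describes, the rendered figure would presumably be fed back to the model as an observation; how FIGR does this, and how its adaptive reward decides when to draw, is not specified on this page.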

Meiqi Chen, Fandong Meng, Jie Zhou • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | AIME 2024 | Accuracy | 79.58 | 251 |
| Mathematical Reasoning | AIME 2025 | Accuracy | 79.32 | 227 |
| Mathematical Reasoning | AMC | Accuracy | 93.98 | 151 |
| Mathematical Reasoning | AMC | Pass@1 | 93.98 | 112 |
| Mathematical Reasoning | Minerva Math | Accuracy | 44.49 | 100 |
| Mathematical Reasoning | AIME 2025 | Pass@1 | 79.32 | 96 |
| Mathematical Reasoning | AIME 2024 | Pass@1 | 79.58 | 86 |
| Mathematical Reasoning | Minerva Math | pass@1 Accuracy | 44.49 | 82 |
| Mathematical Reasoning | Beyond AIME | Accuracy | 54 | 32 |
| Mathematical Reasoning | Beyond AIME | Pass@1 | 0.54 | 21 |
Showing 10 of 13 rows
