
ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought

About

While Chain-of-Thought (CoT) significantly enhances the performance of Large Language Models (LLMs), explicit reasoning chains introduce substantial computational redundancy. Recent latent reasoning methods attempt to mitigate this by compressing the reasoning process into a latent space, but they often suffer severe performance degradation due to the lack of appropriate compression guidance. In this study, we propose Rendered CoT-Guided variational Latent Reasoning (ReGuLaR), a simple yet novel latent learning paradigm that resolves this issue. Fundamentally, we formulate latent reasoning within the Variational Auto-Encoding (VAE) framework, sampling the current latent reasoning state from a posterior distribution conditioned on the previous ones. Specifically, when learning this variational latent reasoning model, we render explicit reasoning chains as images, from which we extract dense visual-semantic representations to regularize the posterior distribution, thereby achieving efficient compression with minimal information loss. Extensive experiments demonstrate that ReGuLaR significantly outperforms existing latent reasoning methods in both computational efficiency and reasoning effectiveness, and even surpasses CoT through multi-modal reasoning, providing a new and insightful solution to latent reasoning. Code: https://github.com/FanmengWang/ReGuLaR.
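The core idea above, regularizing a per-step Gaussian posterior over the latent reasoning state toward a dense representation extracted from the rendered CoT, can be sketched as a standard VAE-style KL term. This is a minimal illustrative sketch, not the paper's implementation: the encoder outputs and the visual-semantic target are random stand-ins, and the choice of a unit-variance prior centered at the visual target is an assumption.

```python
import numpy as np

def diag_gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians."""
    var_q, var_p = np.exp(logvar_q), np.exp(logvar_p)
    return 0.5 * np.sum(logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

rng = np.random.default_rng(0)
d = 8  # latent dimension (illustrative)

# Posterior over the current latent reasoning state z_t, conditioned on
# previous states (here: random stand-ins for an encoder's outputs).
mu_q = rng.normal(size=d)
logvar_q = rng.normal(scale=0.1, size=d)

# Dense visual-semantic representation of the rendered CoT image
# (hypothetical stand-in for features from a vision encoder).
visual_target = rng.normal(size=d)

# Regularize the posterior toward the rendered-CoT representation,
# treating it as the mean of a unit-variance prior (an assumption).
kl_term = diag_gaussian_kl(mu_q, logvar_q, visual_target, np.zeros(d))

# Reparameterized sample of the latent reasoning state for the next step.
z_t = mu_q + np.exp(0.5 * logvar_q) * rng.normal(size=d)
```

In a full model, `kl_term` would be added to the task loss so that the compressed latent states stay close to the information carried by the explicit reasoning chain.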

Fanmeng Wang, Haotian Liu, Guojiang Zhao, Hongteng Xu, Zhifeng Gao • 2026

Related benchmarks

| Task                   | Dataset                                          | Metric   | Result | Rank |
|------------------------|--------------------------------------------------|----------|--------|------|
| Mathematical Reasoning | GSM8k-Aug                                        | Accuracy | 34.9   | 35   |
| Math Reasoning         | GSM-Hard                                         | Accuracy | 8.27   | 31   |
| Mathematical Reasoning | Average (GSM8k-Aug, GSM-Hard, SVAMP, MultiArith) | Accuracy | 45.6   | 26   |
| Molecular Captioning   | MolReasoner original (test)                      | BLEU-2   | 0.461  | 18   |
| Math Reasoning         | MultiArith                                       | Accuracy | 89.2   | 14   |
| Math Reasoning         | SVAMP                                            | Accuracy | 50.1   | 10   |

Other info

GitHub
