Stop-Think-AutoRegress: Language Modeling with Latent Diffusion Planning

About

The Stop-Think-AutoRegress Language Diffusion Model (STAR-LDM) integrates latent diffusion planning with autoregressive generation. Unlike conventional autoregressive language models, which are limited to token-by-token decisions, STAR-LDM incorporates a "thinking" phase that pauses generation to refine a semantic plan through diffusion before continuing. This enables global planning in continuous space before the model commits to discrete tokens. Evaluations show STAR-LDM significantly outperforms similar-sized models on language understanding benchmarks and achieves win rates above 70% in LLM-as-judge comparisons for narrative coherence and commonsense reasoning. The architecture also allows straightforward control through lightweight classifiers, enabling fine-grained steering of attributes without model retraining while maintaining better fluency-control trade-offs than specialized approaches.

Justin Lovelace, Christian Belardi, Sofian Zalouk, Adhitya Polavaram, Srivatsa Kundurthy, Kilian Q. Weinberger • 2026
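
The control flow described in the abstract (stop, refine a latent plan by diffusion, then resume autoregressive decoding) can be illustrated with a toy sketch. This is not the authors' architecture or code: the names `ddim_refine` and `stop_think_autoregress` are hypothetical, the denoiser is an oracle that already knows the clean plan, the decoder is a one-line rule, and the cosine noise schedule and deterministic DDIM-style update are assumptions chosen only to keep the example small and runnable.

```python
import math
import random


def ddim_refine(denoiser, dim, num_steps=20, seed=0):
    """Deterministic DDIM-style sampling of a latent plan, starting from noise.

    denoiser(x, alpha_bar) must return an estimate of the clean latent.
    (Hypothetical interface, assumed for illustration.)
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Assumed cosine schedule: alpha_bar rises from ~0 (pure noise) to 1 (clean).
    alpha_bar = [
        math.cos(0.5 * math.pi * (1 - k / num_steps)) ** 2
        for k in range(num_steps + 1)
    ]
    for k in range(num_steps):
        a_t, a_next = alpha_bar[k], alpha_bar[k + 1]
        x0_hat = denoiser(x, a_t)
        # Noise implied by the current sample and the clean-latent estimate.
        eps_hat = [
            (xi - math.sqrt(a_t) * x0i) / math.sqrt(1.0 - a_t)
            for xi, x0i in zip(x, x0_hat)
        ]
        # Step to the next (less noisy) level along the deterministic path.
        x = [
            math.sqrt(a_next) * x0i + math.sqrt(max(0.0, 1.0 - a_next)) * ei
            for x0i, ei in zip(x0_hat, eps_hat)
        ]
    return x


def stop_think_autoregress(prompt, denoiser, decoder, dim, n_new):
    """Pause after the prompt, refine a plan, then decode token by token."""
    # "Stop-Think": refine a latent plan conditioned on the prompt.
    plan = ddim_refine(lambda x, a: denoiser(prompt, x, a), dim)
    # "AutoRegress": each new token sees the tokens so far and the plan.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(decoder(tokens, plan))
    return tokens


# Toy components (assumptions): an oracle denoiser and a rule-based decoder.
target = [1.0, -0.5]
oracle = lambda prompt, x, a: target
decoder = lambda tokens, plan: "sunny" if plan[0] > 0 else "rainy"

out = stop_think_autoregress(["the", "weather", "is"], oracle, decoder, dim=2, n_new=1)
print(out)
```

In a real system the oracle would be replaced by a learned denoising network and the rule by a language model conditioned on the plan; the sketch only shows where the pause sits relative to token generation.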

Related benchmarks

Task	Dataset	Metric	Result	Rank
Language Generation	C4 (val)	OLMo Perplexity	29.8	15
Natural Language Understanding	NLU suite Zero-Shot (CSQA, SIQA, HS, WG, PIQA, OBQA, ARC:E, ARC:C)	CSQA Accuracy	49.8	8

