Stop-Think-AutoRegress: Language Modeling with Latent Diffusion Planning
About
The Stop-Think-AutoRegress Language Diffusion Model (STAR-LDM) integrates latent diffusion planning with autoregressive generation. Unlike conventional autoregressive language models limited to token-by-token decisions, STAR-LDM incorporates a "thinking" phase that pauses generation to refine a semantic plan through diffusion before continuing. This enables global planning in continuous space prior to committing to discrete tokens. Evaluations show STAR-LDM significantly outperforms similar-sized models on language understanding benchmarks and achieves $>70\%$ win rates in LLM-as-judge comparisons for narrative coherence and commonsense reasoning. The architecture also allows straightforward control through lightweight classifiers, enabling fine-grained steering of attributes without model retraining while maintaining better fluency-control trade-offs than specialized approaches.
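The stop, think, autoregress loop described above can be illustrated with a toy sketch. This is not the authors' implementation: `denoise_step`, `classifier_grad`, `think`, `generate`, and the `embed` and `decode_next` callables are all hypothetical names, the denoiser is a stand-in that simply pulls a noisy latent toward a prompt embedding, and the linear classifier is only an assumption about what a "lightweight classifier" might look like. It shows the control flow only: pause, refine a continuous plan over several steps (optionally nudged by a classifier gradient), then decode tokens conditioned on that plan.

```python
import math
import random

def denoise_step(z, prompt_embedding, t, num_steps):
    """Hypothetical denoiser: pulls the noisy latent toward a prompt-dependent target.
    A real model would use a learned network here."""
    alpha = 1.0 / (num_steps - t)  # step size grows to 1.0 at the final step
    return [zi + alpha * (pi - zi) for zi, pi in zip(z, prompt_embedding)]

def classifier_grad(z, w):
    """Gradient of log sigmoid(w . z) with respect to z for an assumed linear classifier."""
    logit = sum(wi * zi for wi, zi in zip(w, z))
    scale = 1.0 - 1.0 / (1.0 + math.exp(-logit))  # 1 - sigmoid(logit)
    return [scale * wi for wi in w]

def think(prompt_embedding, num_steps=10, guidance_weights=None,
          guidance_scale=0.0, seed=0):
    """'Think' phase: start from Gaussian noise and iteratively refine a latent plan."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in prompt_embedding]
    for t in range(num_steps):
        z = denoise_step(z, prompt_embedding, t, num_steps)
        if guidance_weights is not None:
            # Steer the plan toward the attribute scored by the classifier.
            g = classifier_grad(z, guidance_weights)
            z = [zi + guidance_scale * gi for zi, gi in zip(z, g)]
    return z

def generate(prompt_tokens, embed, decode_next, max_new_tokens=5, **think_kwargs):
    """Stop after the prompt, think to obtain a plan, then decode autoregressively."""
    plan = think(embed(prompt_tokens), **think_kwargs)
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # Each new token sees the tokens so far and the fixed latent plan.
        tokens.append(decode_next(tokens, plan))
    return tokens
```

In this sketch the base generator is never retrained when steering: only `guidance_weights` and `guidance_scale` change, which mirrors the abstract's claim about control without model retraining, under the assumptions stated above.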
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Language Generation | C4 (val) | OLMo Perplexity | 29.8 | 15 |
| Natural Language Understanding | NLU suite, zero-shot (CSQA, SIQA, HS, WG, PIQA, OBQA, ARC:E, ARC:C) | CSQA Accuracy | 49.8 | 8 |