Dream 7B: Diffusion Large Language Models

About

We introduce Dream 7B, the most powerful open diffusion large language model to date. Unlike autoregressive (AR) models that generate tokens sequentially, Dream 7B employs discrete diffusion modeling to refine sequences in parallel through iterative denoising. Our model consistently outperforms existing diffusion language models on general, mathematical, and coding tasks. Dream 7B demonstrates superior planning abilities and inference flexibility, including arbitrary-order generation, infilling capabilities, and tunable quality-speed trade-offs. These results are achieved through simple yet effective training techniques, including AR-based LLM initialization and context-adaptive token-level noise rescheduling. We release both Dream-Base and Dream-Instruct to facilitate further research in diffusion-based language modeling.
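To make the contrast with left-to-right decoding concrete, the following is a toy sketch of mask-based iterative denoising. It is not Dream 7B's implementation: `toy_denoiser`, its neighbour-based confidence rule, and the per-step commit budget are all hypothetical stand-ins for a trained network and its sampling schedule. The sketch only illustrates the general idea that every masked position is predicted in parallel and the most confident predictions are committed first, in whatever order that happens to be.

```python
MASK = "<mask>"

def toy_denoiser(tokens, target):
    """Hypothetical stand-in for a trained denoising network.

    Returns {position: (predicted_token, confidence)} for every masked
    position. Confidence grows with the number of unmasked neighbours,
    mimicking a model that is surer near known context. A real model
    would predict tokens; this toy simply reads them from `target`.
    """
    proposals = {}
    for i, tok in enumerate(tokens):
        if tok != MASK:
            continue
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(tokens)]
        known = sum(tokens[j] != MASK for j in neighbours)
        proposals[i] = (target[i], known / 2 + 0.1)
    return proposals

def denoise(tokens, target, steps):
    """Iteratively unmask: each step commits the most confident proposals."""
    tokens = list(tokens)
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Spread the remaining masked positions over the remaining steps.
        budget = -(-len(masked) // (steps - step))  # ceiling division
        proposals = toy_denoiser(tokens, target)
        ranked = sorted(proposals, key=lambda i: -proposals[i][1])
        for i in ranked[:budget]:
            tokens[i] = proposals[i][0]
    return tokens

target = "the cat sat on the mat".split()
# Infilling: known tokens sit at arbitrary positions, not only as a prefix.
prompt = ["the", MASK, MASK, "on", MASK, MASK]
result = denoise(prompt, target, steps=2)
print(" ".join(result))
```

In this toy, `steps` only changes how many positions are committed per pass, since the stand-in denoiser is always right. With a real model, fewer steps mean more tokens are fixed from less context, which is where a quality-speed trade-off would come from.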

Jiacheng Ye, Zhihui Xie, Lin Zheng, Jiahui Gao, Zirui Wu, Xin Jiang, Zhenguo Li, Lingpeng Kong • 2025

Related benchmarks

Task                    Dataset           Result                 Rank
Mathematical Reasoning  GSM8K             Accuracy 60.6          1362
Code Generation         HumanEval         Pass@1 57.24           1036
Mathematical Reasoning  MATH              Accuracy 39.2          882
Language Understanding  MMLU              Accuracy 69.5          825
Commonsense Reasoning   PIQA              Accuracy 55.8          751
Instruction Following   IFEval            IFEval Accuracy 62.5   625
Mathematical Reasoning  MATH              Accuracy 39.6          535
Mathematical Reasoning  MATH500 (test)    Accuracy 45            514
Code Generation         HumanEval (test)  --                     506
Mathematical Reasoning  GSM8K             --                     499
Showing 10 of 153 rows