Improving Sampling for Masked Diffusion Models via Information Gain

About

Masked Diffusion Models (MDMs) enable flexible decoding orders, yet existing samplers remain largely greedy, selecting locally certain tokens without accounting for their downstream effects. We show that this myopia can increase cumulative uncertainty and lead to suboptimal generation. To address this, we propose the **Info-Gain Sampler**, a training-free decoding method that uses the bidirectional structure of MDMs to balance immediate uncertainty with the information gained over remaining masked positions. Across reasoning, coding, creative writing, and image generation tasks, Info-Gain Sampler consistently outperforms existing MDM samplers, improving average reasoning accuracy by 2.9–11.6 percentage points and achieving a 62.8% average win rate in creative writing. The code is available at https://github.com/yks23/Information-Gain-Sampler.
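
To make the contrast between greedy selection and information-gain-aware selection concrete, here is a minimal toy sketch. It is not the authors' implementation: the abstract does not state the scoring formula, so the score used below (a candidate's own entropy minus the entropy reduction it causes at the other masked positions) is an assumption, and `toy_predict`, `greedy_pick`, and `info_gain_pick` are hypothetical names invented for illustration.

```python
import math

MASK = None  # hypothetical marker for a masked position


def entropy(dist):
    """Shannon entropy (in nats) of a dict mapping token -> probability."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def toy_predict(seq):
    """Hypothetical stand-in for an MDM forward pass.

    Returns {position: distribution} for every masked position.
    Positions 1 and 2 become certain once position 0 is set to "a".
    """
    dists = {}
    if seq[0] is MASK:
        dists[0] = {"a": 0.6, "b": 0.4}
    if seq[1] is MASK:
        dists[1] = {"x": 1.0} if seq[0] == "a" else {"x": 0.7, "y": 0.3}
    if seq[2] is MASK:
        dists[2] = {"u": 1.0} if seq[0] == "a" else {"u": 0.5, "v": 0.5}
    return dists


def greedy_pick(predict, seq):
    """Greedy baseline: unmask the position with the lowest entropy."""
    dists = predict(seq)
    pos = min(dists, key=lambda i: entropy(dists[i]))
    tok = max(dists[pos], key=dists[pos].get)
    return pos, tok


def info_gain_pick(predict, seq):
    """Assumed scoring rule: own entropy minus downstream entropy reduction."""
    dists = predict(seq)
    best, best_score = None, float("inf")
    for pos, dist in dists.items():
        tok = max(dist, key=dist.get)
        # Uncertainty at the other masked positions before committing.
        others_before = sum(entropy(d) for i, d in dists.items() if i != pos)
        # Re-run the model with the candidate token filled in.
        trial = list(seq)
        trial[pos] = tok
        others_after = sum(entropy(d) for d in predict(trial).values())
        gain = others_before - others_after
        score = entropy(dist) - gain
        if score < best_score:
            best, best_score = (pos, tok), score
    return best


start = [MASK, MASK, MASK]
print(greedy_pick(toy_predict, start))     # (1, 'x')
print(info_gain_pick(toy_predict, start))  # (0, 'a')
```

In this toy setting the greedy rule unmasks position 1 because it is locally the most certain, while the lookahead rule unmasks position 0 because committing it removes all uncertainty at the other two positions. The lookahead costs one extra model call per candidate, which is a property of this sketch and not a claim about the paper's method.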

Kaisen Yang, Jayden Teoh, Kaicheng Yang, Yitong Zhang, Alex Lamb • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Mathematical Reasoning | GSM8K | Accuracy 83.3 | 1398 |
| Mathematical Reasoning | GSM8K (test) | Accuracy 88.9 | 954 |
| Text-to-Image Generation | GenEval | -- | 218 |
| Visual Question Answering | InfoVQA | Accuracy 33.37 | 195 |
| Code Generation | MBPP | Accuracy 48.4 | 165 |
| Planning | Sudoku | Accuracy 84.4 | 129 |
| Information Visual Question Answering | InfoVQA | Accuracy 33.26 | 110 |
| Multi-modal Reasoning | M3CoT | Accuracy 39.23 | 90 |
| Planning | Countdown | Accuracy 45.2 | 89 |
| Multi-modal Question Answering | MMBench | Accuracy 67.73 | 84 |
Showing 10 of 20 rows
