
Guided Star-Shaped Masked Diffusion

About

The performance of pre-trained masked diffusion models is often constrained by their sampling procedure, which makes decisions irreversible and struggles in low-step generation regimes. We introduce a novel sampling algorithm that works with pre-trained models and, after a lightweight fine-tuning of a single layer, significantly improves sample quality and efficiency. Our method reformulates the generation process using a star-shaped paradigm, which inherently allows for error correction. To make this process effective, we augment it with a learnable re-masking scheduler that intelligently identifies and revises likely errors. This approach yields a substantial quality boost, particularly when using a small number of sampling steps. We extensively ablate key components of our approach and show its usability in different scenarios. In comprehensive experiments on text and code generation, our sampling algorithm outperforms or matches existing methods.
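The predict-then-re-mask idea in the abstract can be illustrated with a small toy loop. This is only a sketch under stated assumptions, not the authors' algorithm: `denoise` and `error_score` are hypothetical stand-ins for the pre-trained model and the learned re-masking scheduler, the linear masking schedule is assumed, and selecting the highest-scoring positions for re-masking is one plausible reading of "identifies and revises likely errors".

```python
import random

MASK = -1  # hypothetical id for the mask token


def star_shaped_sample(denoise, error_score, length, steps, seed=0):
    """Toy sampler: guess a full clean sequence, then re-mask part of it.

    denoise(tokens)     -> list of predicted token ids for every position
    error_score(tokens) -> list of floats, higher means "more likely wrong"
    Both callables are hypothetical placeholders, not a real library API.
    """
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps, 0, -1):
        # Predict a complete clean sequence from the current noisy one.
        x0_hat = denoise(tokens)
        # Assumed linear schedule: how many positions stay masked next.
        n_mask = round(length * (step - 1) / steps)
        # Rank positions by estimated error; break ties at random.
        scores = error_score(x0_hat)
        order = sorted(
            range(length),
            key=lambda i: (scores[i], rng.random()),
            reverse=True,
        )
        remask = set(order[:n_mask])
        # Any position may be re-masked, including ones revealed earlier,
        # which is what makes earlier decisions reversible.
        tokens = [MASK if i in remask else x0_hat[i] for i in range(length)]
    return tokens


# Usage with trivial stand-ins: a "model" that always predicts token 7
# and a "scheduler" that considers every position equally reliable.
sample = star_shaped_sample(
    denoise=lambda toks: [7] * len(toks),
    error_score=lambda toks: [0.0] * len(toks),
    length=8,
    steps=4,
)
```

Because the set of masked positions is recomputed from scratch at every step, a token accepted early can still be revised later, in contrast to a sampler that only ever unmasks.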

Viacheslav Meshchaninov, Egor Shibaev, Artem Makoian, Ivan Klimov, Nikita Balagansky, Daniil Gavrilov, Aibek Alanov, Dmitry Vetrov • 2025

Related benchmarks

Task                               Dataset    Result                       Rank
Code Generation                    HumanEval  --                           1036
Instruction Following              IFEval     --                           625
Multi-task Language Understanding  MMLU       MMLU Score 71.2              112
Multi-task Language Understanding  MMLU-Pro   Accuracy 47.9                55
Conditional Code Generation        Conala     Conditional Perplexity 16.4  15
Science Question Answering         GPQA       Score 32.8                   11
