Timestep-Aware Block Masking for Efficient Diffusion Model Inference
About
Diffusion Probabilistic Models (DPMs) have achieved great success in image generation but suffer from high inference latency due to their iterative denoising nature. Motivated by the observation that feature dynamics evolve across the denoising trajectory, we propose a novel framework that optimizes the computational graph of pre-trained DPMs on a per-timestep basis. By learning timestep-specific masks, our method dynamically determines which blocks to execute and which to bypass through feature reuse at each inference stage. Unlike global optimization methods, which incur prohibitive memory costs through full-chain backpropagation, ours optimizes the masks for each timestep independently, keeping the training process memory-efficient. To guide this process, we introduce a timestep-aware loss scaling mechanism that prioritizes feature fidelity during sensitive denoising phases, complemented by a knowledge-guided mask rectification strategy that prunes redundant spatio-temporal dependencies. Our approach is architecture-agnostic and demonstrates significant efficiency gains across a broad spectrum of models, including DDPM, LDM, DiT, and PixArt. Experimental results show that, by treating the denoising process as a sequence of optimized computational paths, our method achieves a superior balance between sampling speed and generative quality. Our code will be released.
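The core mechanism described above — a per-timestep binary mask deciding, for each block, whether to execute it or bypass it by reusing a cached feature — can be sketched as follows. This is a minimal illustration, not the paper's released code; the function name `run_masked_denoiser`, the toy lambda "blocks", and the dict-based cache are all assumptions made for clarity.

```python
import numpy as np

def run_masked_denoiser(blocks, mask, x, cache):
    """Execute a stack of blocks under a per-timestep binary mask.

    blocks : list of callables (stand-ins for the model's residual blocks)
    mask   : boolean array; mask[i] == True  -> execute block i,
             mask[i] == False -> bypass it and reuse the cached feature
    cache  : dict mapping block index -> that block's output from an
             earlier timestep (the "feature reuse" store)
    """
    h = x
    for i, block in enumerate(blocks):
        if mask[i]:
            h = block(h)
            cache[i] = h   # refresh the cache with the fresh feature
        else:
            h = cache[i]   # bypass: reuse the feature from a prior step
    return h, cache

# Toy blocks standing in for the network's layers (illustrative only).
blocks = [lambda h: h + 1.0, lambda h: h * 2.0, lambda h: h - 3.0]
x = np.zeros(4)

# Warm-up timestep: execute every block and populate the cache.
cache = {}
out_full, cache = run_masked_denoiser(blocks, np.array([True, True, True]), x, cache)

# Later timestep: bypass block 1, reusing its cached feature instead.
out_masked, cache = run_masked_denoiser(blocks, np.array([True, False, True]), x, cache)
```

In the full method, a mask like this is learned independently for each denoising timestep, so the saved computation compounds over the whole sampling trajectory.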
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Class-conditional Image Generation | ImageNet 256×256 | Inception Score (IS) | 240.2 | 815 |
| Class-conditional Image Generation | ImageNet 512×512 | FID | 3.64 | 111 |
| Unconditional Image Generation | LSUN Bedroom 256×256 | FID | 6.67 | 68 |
| Unconditional Image Generation | CIFAR-10 32×32 | FID | 4.66 | 53 |
| Unconditional Image Generation | LSUN Churches 256×256 | FID | 10.39 | 18 |
| Prompt-conditional Generation | MS-COCO 1024×1024 | IS | 55.52 | 4 |