Beyond Fixed Formulas: Data-Driven Linear Predictor for Efficient Diffusion Models

About

To address the high sampling cost of Diffusion Transformers (DiTs), feature caching offers a training-free acceleration method. However, existing methods rely on hand-crafted forecasting formulas that fail under aggressive skipping. We propose L2P (Learnable Linear Predictor), a simple data-driven caching framework that replaces fixed coefficients with learnable per-timestep weights. Rapidly trained in ~20 seconds on a single GPU, L2P accurately reconstructs current features from past trajectories. L2P significantly outperforms existing baselines: it achieves a 4.55x FLOPs reduction and a 4.15x latency speedup on FLUX.1-dev, and maintains high visual fidelity under up to 7.18x acceleration on Qwen-Image models, where prior methods show noticeable quality degradation. Our results show that learning linear predictors is highly effective for efficient DiT inference. Code is available at https://github.com/Aredstone/L2P-Cache.
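
The abstract does not spell out the predictor's exact form, so the following is only a minimal sketch of the general idea: predict the current feature as a weighted sum of the last K cached features, with one weight vector per timestep fitted from data rather than fixed by a formula. Everything here is an assumption made for illustration: the function names, the synthetic quadratic feature trajectories, the choice K = 3, and the use of closed-form least squares in place of whatever training procedure the authors use. It is not the code from the L2P-Cache repository.

```python
import numpy as np

def fit_weights(past, target):
    """Least-squares fit of K mixing weights (hypothetical helper).

    past:   (K, M) array, K cached feature vectors, most recent first
    target: (M,) array, the true feature at the current timestep
    """
    w, *_ = np.linalg.lstsq(past.T, target, rcond=None)
    return w

def make_trajectories(n_samples, dim, steps, rng):
    """Synthetic stand-in for DiT features: each element follows a + b*t + c*t^2."""
    a, b, c = rng.normal(size=(3, n_samples, dim))
    t = np.arange(steps, dtype=np.float64)[:, None, None]
    return a + b * t + c * t**2  # shape (steps, n_samples, dim)

rng = np.random.default_rng(0)
K, steps = 3, 8
calib = make_trajectories(16, 64, steps, rng)     # calibration set
held_out = make_trajectories(4, 64, steps, rng)   # unseen samples

# Fit one weight vector per timestep from the calibration trajectories.
weights = {}
for t in range(K, steps):
    past = calib[t - K:t][::-1].reshape(K, -1)    # most recent step first
    weights[t] = fit_weights(past, calib[t].reshape(-1))

# Compare against a fixed formula (linear extrapolation: 2*f[t-1] - f[t-2]).
t = steps - 1
past = held_out[t - K:t][::-1].reshape(K, -1)
truth = held_out[t].reshape(-1)
learned_err = np.linalg.norm(weights[t] @ past - truth)
fixed_err = np.linalg.norm(np.array([2.0, -1.0, 0.0]) @ past - truth)
print(f"learned: {learned_err:.2e}  fixed: {fixed_err:.2e}")
```

On this toy data the fitted weights converge to the quadratic-extrapolation coefficients, which the fixed linear formula cannot represent; real DiT features are not polynomial, so the sketch only illustrates why data-fitted coefficients can beat a hand-picked formula, not how large the gain is in practice.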

Zhirong Shen, Rui Huang, Jiacheng Liu, Chang Zou, Peiliang Cai, Shikang Zheng, Zhengyi Shi, Liang Feng, Linfeng Zhang • 2026

Related benchmarks

Task	Dataset	Result
Text-to-Image Generation	Qwen-Image	--	96
Text-to-Video Generation	HunyuanVideo	LPIPS 0.28	44
Text-to-Image Generation	Qwen-Image-Lightning	Latency (s) 10.91	3
