MagCache: Fast Video Generation with Magnitude-Aware Cache

About

Existing acceleration techniques for video diffusion models often rely on uniform heuristics or time-embedding variants to skip timesteps and reuse cached features. These approaches typically require extensive calibration with curated prompts and risk inconsistent outputs due to prompt-specific overfitting. In this paper, we introduce a novel and robust discovery: a unified magnitude law observed across different models and prompts. Specifically, the magnitude ratio of successive residual outputs decreases monotonically: steadily over most timesteps, then rapidly in the last several steps. Leveraging this insight, we introduce a Magnitude-aware Cache (MagCache) that adaptively skips unimportant timesteps using an error modeling mechanism and an adaptive caching strategy. Unlike existing methods, which require dozens of curated samples for calibration, MagCache requires only a single sample. Experimental results show that MagCache achieves 2.10x-2.68x speedups on Open-Sora, CogVideoX, Wan 2.1, and HunyuanVideo while preserving superior visual fidelity. It significantly outperforms existing methods in LPIPS, SSIM, and PSNR under similar computational budgets.
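The idea of skipping timesteps based on magnitude ratios can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the magnitude is taken as the mean absolute value of each residual, the skip rule accumulates an estimated error until it exceeds a hypothetical `error_threshold`, and `max_consecutive_skips` is a hypothetical cap on how many steps in a row may reuse the cache. The function names are likewise illustrative and are not taken from the authors' code.

```python
import numpy as np


def magnitude_ratios(residuals):
    """Ratio of each residual's mean absolute value to the previous one's.

    The first step has no predecessor, so its ratio is defined as 1.0.
    """
    norms = [float(np.mean(np.abs(r))) for r in residuals]
    return [1.0] + [norms[i] / norms[i - 1] for i in range(1, len(norms))]


def plan_skips(ratios, error_threshold=0.1, max_consecutive_skips=2):
    """Decide, per timestep, whether to reuse the cached residual (True).

    Assumed rule (not from the paper text): while skipping, multiply the
    ratios together to estimate how far the true residual magnitude has
    drifted from the cached one, and sum those drifts as an error estimate.
    A step is skipped only if the summed error stays within the threshold
    and the cap on consecutive skips has not been reached.
    """
    skip = []
    acc_ratio, acc_err, run = 1.0, 0.0, 0
    for i, ratio in enumerate(ratios):
        if i == 0:
            skip.append(False)  # the first step must always be computed
            continue
        cand_ratio = acc_ratio * ratio
        cand_err = acc_err + abs(1.0 - cand_ratio)
        if cand_err <= error_threshold and run < max_consecutive_skips:
            skip.append(True)
            acc_ratio, acc_err, run = cand_ratio, cand_err, run + 1
        else:
            skip.append(False)  # recompute and refresh the cache
            acc_ratio, acc_err, run = 1.0, 0.0, 0
    return skip
```

In this sketch the ratios would be measured once on a single calibration sample and the resulting plan reused for other prompts, which mirrors the single-sample calibration claim above. A sharp drop in the ratio near the final steps pushes the estimated error past the threshold, so those steps are computed rather than skipped.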

Zehong Ma, Longhui Wei, Feng Wang, Shiliang Zhang, Qi Tian • 2025

Related benchmarks

Task	Dataset	Result
Video Generation	Wan 1.3B (81 frames, 832×480) 2.1	VBench Score 80.26	21
Video Generation	HunyuanVideo 129 frames 544P	VBench Score 81.99	15
Image Generation	Flux (test)	ImgReward 0.993	14
Video Generation	Open-Sora 51 frames 848 x 480 1.2	Latency (s) 26.07	11
Image Generation	DrawBench	Latency (s) 2.52	10
Text-to-Image Generation	FLUX.1 (dev)	PSNR 29.96	8
Talking Head Generation	HunyuanVideo Avatar	FID 26.43	8
Speech-to-Video Generation	Wan-S2V	FID 32.93	7
Text-to-Image Generation	FLUX.1 T=50 (dev)	GenEval 64.61	7
Text-to-Image Generation	BAGEL T=50	GenEval 87.29	7

Showing 10 of 17 rows
