VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation

About

Although rectified flow models have achieved remarkable performance in image, video, and 3D generation, their practical deployment is hindered by slow inference. Prior acceleration methods reuse cached features from previous steps, which neglects the growing mismatch between the static cache and the evolving input and reduces output fidelity. This work proposes Velocity Decomposition and Estimation (VDE), a training-free acceleration method that shifts the paradigm from caching-and-reusing to decomposing-and-estimating. Specifically, VDE decomposes the model's velocity into components parallel and orthogonal to the input, and exploits the temporal predictability and directional stability of these components for precise, input-adaptive estimation. To prevent error accumulation, it periodically anchors the model's state via full forward passes. Extensive experiments on image and video generation tasks demonstrate that VDE achieves substantial acceleration with minimal loss in visual quality. Notably, VDE accelerates Flux by 3.22 times and achieves an LPIPS of 0.069 on Qwen-Image, a 52.2% reduction relative to the best baseline.
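The decomposition described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not specify the estimators, so the linear extrapolation of the parallel coefficient, the direct reuse of the cached orthogonal component, and the names `decompose_velocity`, `sample`, and `anchor_every` are all assumptions made for illustration.

```python
import numpy as np


def decompose_velocity(x, v, eps=1e-12):
    """Split v into a part parallel to x and a part orthogonal to x.

    Returns (coeff, v_par, v_orth) with v_par = coeff * x and v = v_par + v_orth.
    """
    x_flat = x.reshape(-1)
    v_flat = v.reshape(-1)
    # Scalar projection coefficient of v onto x.
    coeff = float(np.dot(v_flat, x_flat)) / (float(np.dot(x_flat, x_flat)) + eps)
    v_par = coeff * x
    v_orth = v - v_par
    return coeff, v_par, v_orth


def sample(velocity_fn, x, timesteps, anchor_every=3):
    """Toy Euler sampler that skips some model calls.

    Assumed scheme (not from the source): on skipped steps the parallel
    coefficient is linearly extrapolated from its last two values and the
    cached orthogonal component is reused unchanged. Every `anchor_every`
    steps a full forward pass re-anchors both quantities.
    """
    coeff_hist = []
    v_orth = None
    full_calls = 0
    for i in range(len(timesteps) - 1):
        t = timesteps[i]
        dt = timesteps[i + 1] - timesteps[i]
        if i % anchor_every == 0 or len(coeff_hist) < 2:
            # Full forward pass: recompute and cache the decomposition.
            v = velocity_fn(x, t)
            full_calls += 1
            coeff, _, v_orth = decompose_velocity(x, v)
        else:
            # Estimated step: the parallel part follows the current input x,
            # so the estimate adapts to the input instead of being static.
            # The cached v_orth is only approximately orthogonal to the new x.
            coeff = 2.0 * coeff_hist[-1] - coeff_hist[-2]
            v = coeff * x + v_orth
        coeff_hist.append(coeff)
        x = x + dt * v
    return x, full_calls
```

With `anchor_every=3` and ten timesteps, this toy loop calls the model on steps 0, 1, 3, and 6 and estimates the remaining five steps; a real schedule would be tuned per model.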

Junwen Tan, Jinglin Liang, Hongyuan Chen, Shuangping Huang • 2026

Related benchmarks

Task	Dataset	Result
Text-to-Image Generation	Qwen-Image	Image Reward 1.295	96
Video Generation	Wan 1.3B (81 frames, 832×480) 2.1	VBench Score 80.43	21
Image Generation	FLUX.1 (dev)	Image Reward 0.978	20

