Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

STORK: Faster Diffusion And Flow Matching Sampling By Resolving Both Stiffness And Structure-Dependence

About

Diffusion models (DMs) and flow-matching models have demonstrated remarkable performance in image and video generation. However, such models require a significant number of function evaluations (NFEs) during sampling, leading to costly inference. Consequently, quality-preserving fast sampling methods that require fewer NFEs have been an active area of research. However, prior training-free sampling methods fail to simultaneously address two key challenges: the stiffness of the ODE (i.e., the non-straightness of the velocity field) and dependence on the semi-linear structure of the DM ODE (which limits their direct applicability to flow-matching models). In this work, we introduce the Stabilized Taylor Orthogonal Runge--Kutta (STORK) method, addressing both design concerns. We demonstrate that STORK consistently improves the quality of diffusion and flow-matching sampling for image and video generation. Code is available at https://github.com/ZT220501/STORK.

Zheng Tan, Weizhen Wang, Andrea L. Bertozzi, Ernest K. Ryu• 2025

Related benchmarks

TaskDatasetResultRank
Class-conditional Image GenerationImageNet 256x256--
967
Text-to-Image GenerationQwen-Image
Image Reward1.2934
96
Text-to-Video GenerationEvalCrafter
Text-Video Alignment42.06
34
Showing 3 of 3 rows

Other info

Follow for update