Momentum Guidance: Plug-and-Play Guidance for Flow Models

About

Flow-based generative models have become a strong framework for high-quality generative modeling, yet pretrained models are rarely used in their vanilla conditional form: conditional samples without guidance often appear diffuse and lack fine-grained detail due to the smoothing effects of neural networks. Existing guidance techniques such as classifier-free guidance (CFG) improve fidelity but double the inference cost and typically reduce sample diversity. We introduce Momentum Guidance (MG), a new dimension of guidance that leverages the ODE trajectory itself. MG extrapolates the current velocity using an exponential moving average of past velocities and preserves the standard one-evaluation-per-step cost. It matches the effect of standard guidance without extra computation and can further improve quality when combined with CFG. Experiments demonstrate MG's effectiveness across benchmarks. Specifically, on ImageNet-256, MG achieves average improvements in FID of 36.68% without CFG and 25.52% with CFG across various sampling settings, attaining an FID of 1.597 at 64 sampling steps. Evaluations on large flow-based models like Stable Diffusion 3 and FLUX.1-dev further confirm consistent quality enhancements across standard metrics.
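The abstract describes MG only in words: extrapolate the current velocity using an exponential moving average (EMA) of past velocities, with one model evaluation per step. The sketch below is one possible reading of that sentence, not the paper's formulation. The update rule `v + weight * (v - ema)`, the EMA initialization, the function name `euler_sample_with_momentum`, and the parameters `weight` and `beta` are all assumptions made for illustration.

```python
import numpy as np

def euler_sample_with_momentum(velocity_fn, x0, num_steps=8, weight=0.5, beta=0.8):
    """Euler ODE sampler with an assumed momentum-style extrapolation.

    Assumed update (hypothetical, not taken from the paper):
        v_guided = v + weight * (v - ema)
        ema      = beta * ema + (1 - beta) * v
    Returns the final sample and the number of velocity evaluations.
    """
    x = np.asarray(x0, dtype=float)
    dt = 1.0 / num_steps
    ema = None      # no velocity history exists before the first step
    n_evals = 0
    for i in range(num_steps):
        t = i * dt
        # The only model evaluation in this step.
        v = np.asarray(velocity_fn(x, t), dtype=float)
        n_evals += 1
        if ema is None:
            # First step: nothing to extrapolate from, so use v as is.
            v_guided = v
            ema = v.copy()
        else:
            # Push the current velocity away from the running average.
            v_guided = v + weight * (v - ema)
            ema = beta * ema + (1.0 - beta) * v
        x = x + dt * v_guided
    return x, n_evals

# Toy usage: a constant velocity field stands in for a pretrained model.
target_shift = np.array([1.0, -2.0, 0.5])
sample, evals = euler_sample_with_momentum(lambda x, t: target_shift, np.zeros(3))
```

In this sketch, setting `weight=0` recovers a plain Euler sampler, and a constant velocity field leaves the trajectory unchanged because the EMA then equals the current velocity. The evaluation count equals the step count, which is the cost property the abstract emphasizes.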

Runlong Liao, Jian Yu, Baiyu Su, Chi Zhang, Lizhang Chen, Qiang Liu • 2026

Related benchmarks

Task	Dataset	Result
Class-conditional image synthesis	ImageNet 256x256 (val)	FID 1.6	61
Image Generation	ImageNet-256 (FID-50K)	FID 1.37	36
Text-to-Image Generation	Flux (dev)	HPSv2.1 Score 31.47	14
Text-to-Image Synthesis	SD3 (test)	HPSv2.1 30.62	14

