Rectified-CFG++ for Flow Based Models
About
Classifier-free guidance (CFG) is the workhorse for steering large diffusion models toward text-conditioned targets, yet its naive application to rectified flow (RF) based models provokes severe off-manifold drift, yielding visual artifacts, text misalignment, and brittle behaviour. We present Rectified-CFG++, an adaptive predictor-corrector guidance that couples the deterministic efficiency of rectified flows with a geometry-aware conditioning rule. Each inference step first executes a conditional RF update that anchors the sample near the learned transport path, then applies a weighted conditional correction that interpolates between conditional and unconditional velocity fields. We prove that the resulting velocity field is marginally consistent and that its trajectories remain within a bounded tubular neighbourhood of the data manifold, ensuring stability across a wide range of guidance strengths. Extensive experiments on large-scale text-to-image models (Flux, Stable Diffusion 3/3.5, Lumina) show that Rectified-CFG++ consistently outperforms standard CFG on benchmark datasets such as MS-COCO, LAION-Aesthetic, and T2I-CompBench. Project page: https://rectified-cfgpp.github.io/
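The predictor-corrector step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rectified_cfgpp_step`, the half-step predictor, the guidance weight `alpha`, the time convention (t increasing with positive `dt`), and the toy linear velocity fields are all assumptions made for illustration. In a real model, `v_cond` and `v_uncond` would be network evaluations with and without the text prompt.

```python
import numpy as np

def rectified_cfgpp_step(x, t, dt, v_cond, v_uncond, alpha):
    """One hypothetical predictor-corrector guidance step (illustrative only)."""
    # Predictor: move half a step along the conditional velocity, which
    # keeps the intermediate sample close to the conditional transport path.
    v_c = v_cond(x, t)
    x_mid = x + 0.5 * dt * v_c
    t_mid = t + 0.5 * dt

    # Corrector: measure the conditional/unconditional gap at the
    # intermediate point and add it to the conditional velocity with weight alpha.
    gap = v_cond(x_mid, t_mid) - v_uncond(x_mid, t_mid)
    v_hat = v_c + alpha * gap

    # Full Euler step with the corrected velocity.
    return x + dt * v_hat

# Toy velocity fields (assumptions): the conditional field pulls toward a
# target point c, the unconditional field pulls toward the origin.
c = np.array([1.0, 2.0])
v_cond = lambda x, t: c - x
v_uncond = lambda x, t: -x

x0 = np.zeros(2)
x_guided = rectified_cfgpp_step(x0, t=0.0, dt=0.1, v_cond=v_cond, v_uncond=v_uncond, alpha=0.5)
x_plain = rectified_cfgpp_step(x0, t=0.0, dt=0.1, v_cond=v_cond, v_uncond=v_uncond, alpha=0.0)
```

With `alpha = 0` the sketch reduces to a plain conditional Euler step, so the correction term is the only place where guidance enters. Evaluating the gap at the predicted point rather than at the current sample is the design choice this sketch is meant to highlight.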
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-Image Generation | GenEval | Overall Score | 59.57 | 506 |
| Text-to-Image Generation | Pick-a-Pic | ImageReward | 1.08 | 107 |
| Text-to-Image Generation | DrawBench | Pick Score | 23.15 | 40 |
| Text-to-Image Generation | LAION 5B 1K | HPSv2.1 | 28.306 | 18 |
| Text-to-Image Generation | MS COCO 1K | HPSv2.1 | 28.932 | 18 |
| Text to Image | MS-COCO 5k image-text pairs | FID | 20.55 | 15 |