Flow-Direct: Feedback-Efficient and Reusable Guidance for Flow Models via Non-Parametric Guidance Field
About
Training-free guidance enables pre-trained diffusion and flow models to optimize application-specific objectives using feedback from external black-box reward functions. However, existing methods are feedback-inefficient because reward feedback is used only transiently to inform a localized gradient approximation or a discrete search decision, and is subsequently discarded. To address this limitation, we propose Flow-Direct, a framework that guides the generation process via a persistent guidance field. Theoretically, this guidance field is analytically derived from the log-density ratio between the base and reward-weighted target distributions; it transports the pre-trained distribution to the target distribution. In practice, the field is implemented as a non-parametric estimator constructed from all accumulated reward-evaluated samples. As more samples are collected during optimization, this empirical guidance field becomes increasingly accurate. This persistent formulation yields two major advantages. First, Flow-Direct is highly feedback-efficient: because every evaluated sample is used to refine the global guidance field, no reward information is wasted. Second, the framework is naturally reusable: once optimization is complete, the collected dataset defines a reusable guidance field for generating novel target samples without additional reward evaluations, and distinct guidance fields can be combined to generate samples that simultaneously satisfy multiple objectives.
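To make the idea of a non-parametric guidance field concrete, the following is a minimal sketch and not the paper's actual estimator. It assumes the target has the form p_target(x) ∝ p_base(x)·exp(r(x)/τ) and that both densities are approximated with Gaussian kernels centred on the evaluated samples. Under those assumptions the gradient of the log-density ratio reduces to a difference of two weighted sample means. The function name `guidance_field` and the parameters `bandwidth` and `temperature` are hypothetical and chosen only for illustration.

```python
import math


def _softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def guidance_field(x, samples, rewards, bandwidth=1.0, temperature=1.0):
    """Kernel estimate of grad_x log(p_target(x) / p_base(x)) at point x.

    Assumptions (illustrative, not taken from the paper):
      * p_base is a Gaussian KDE over `samples`;
      * p_target reweights each kernel by exp(reward / temperature).
    """
    h2 = bandwidth ** 2
    # Log of the (unnormalised) Gaussian kernel between x and each stored sample.
    log_k = [
        -sum((a - b) ** 2 for a, b in zip(x, s)) / (2.0 * h2) for s in samples
    ]
    base_w = _softmax(log_k)
    target_w = _softmax([lk + r / temperature for lk, r in zip(log_k, rewards)])
    # grad log KDE = (weighted mean of samples - x) / h^2; the "- x" terms cancel
    # in the difference, leaving a difference of weighted sample means.
    return [
        sum((tw - bw) * s[d] for tw, bw, s in zip(target_w, base_w, samples)) / h2
        for d in range(len(x))
    ]


# Every reward-evaluated sample stays in the dataset, so the field can be
# re-evaluated later without calling the reward function again.
samples = [[-1.0], [1.0]]
rewards = [0.0, 1.0]
g = guidance_field([0.0], samples, rewards)
```

In this toy setting the field at the origin points toward the higher-reward sample, and it vanishes when all rewards are equal. Under the same assumptions, combining objectives would amount to summing the fields built from separate reward datasets.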
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic Attribute Alignment | Gemma animal-attribute prompts | Happy Score: 26.11 | 9 |
| 3D Aerodynamic optimization | 3D Vehicle models | Aerodynamic Score: 0.16 | 5 |
| Aesthetic Reward Optimization | Animal prompts 2D image generation | Aesthetic Score: 7.18 | 5 |
| Compressibility optimization | Animal prompts 2D image generation | Compressibility: 14.63 | 5 |
| HPSv3 Reward Optimization | Animal prompts 2D image generation | HPSv3 Score: 10.94 | 5 |
| Incompressibility optimization | Animal prompts 2D image generation | Incompressibility Score: 284.9 | 5 |
| Aerodynamic Optimization | 3D Vehicle dataset | Efficiency Gain Factor: 77.78 | 4 |
| Aesthetic Reward Optimization | 6 animal prompts | Efficiency Gain Factor: 61.7 | 4 |
| Compressibility Reward Optimization | 6 animal prompts | Efficiency Gain Factor: 54.17 | 4 |
| HPSv3 Reward Optimization | 6 animal prompts | Efficiency Gain Factor: 100 | 4 |