
SuperFlow: Training Flow Matching Models with RL on the Fly

About

Recent progress in flow-based generative models and reinforcement learning (RL) has improved text-image alignment and visual quality. However, current RL training for flow models still has two main problems: (i) GRPO-style fixed per-prompt group sizes ignore variation in sampling importance across prompts, which leads to inefficient sampling and slower training; and (ii) trajectory-level advantages are reused as per-step estimates, which biases credit assignment along the flow. We propose SuperFlow, an RL training framework for flow-based models that adjusts group sizes with variance-aware sampling and computes step-level advantages in a way that is consistent with continuous-time flow dynamics. Empirically, SuperFlow reaches promising performance while using only 5.4% to 56.3% of the original training steps and reduces training time by 5.2% to 16.7% without any architectural changes. On standard text-to-image (T2I) tasks, including text rendering, compositional image generation, and human preference alignment, SuperFlow improves over SD3.5-M by 4.6% to 47.2%, and over Flow-GRPO by 1.7% to 16.0%.
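The two ideas named in the abstract can be illustrated with a small sketch. The first function is the standard GRPO-style group normalization that the abstract takes as its baseline. The second, `allocate_group_sizes`, is a hypothetical allocation rule, assumed here only for illustration: it splits a fixed sampling budget across prompts in proportion to the standard deviation of their rewards. The abstract does not specify SuperFlow's actual allocation rule or its step-level advantage estimator, so neither is reproduced here; all function and parameter names are assumptions.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize rewards within one prompt's group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def allocate_group_sizes(reward_groups, total_budget, min_size=2):
    """Hypothetical variance-aware allocation (not the paper's rule).

    Every prompt gets at least `min_size` samples; the remaining budget is
    split in proportion to each prompt's reward standard deviation, rounded
    down, so the returned sizes sum to at most `total_budget`.
    """
    n = len(reward_groups)
    spare = total_budget - min_size * n
    if spare < 0:
        raise ValueError("total_budget is too small for min_size per prompt")

    stds = [pstdev(group) for group in reward_groups]
    total_std = sum(stds)
    if total_std == 0:
        # No prompt shows reward variation: fall back to a uniform split.
        return [min_size + spare // n] * n
    return [min_size + int(spare * s / total_std) for s in stds]


if __name__ == "__main__":
    # Prompt 0 has mixed rewards; prompt 1 always gets the same reward.
    groups = [[0.0, 1.0, 0.0, 1.0], [0.5, 0.5, 0.5, 0.5]]
    print(allocate_group_sizes(groups, total_budget=12))
    print(group_advantages(groups[0]))
```

Under this assumed rule, a prompt whose samples all receive the same reward keeps only the minimum group size, since its standardized advantages would all be zero and contribute no learning signal, while the spare budget goes to prompts with more reward variation.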

Kaijie Chen, Zhiyang Xu, Ying Shen, Zihao Lin, Yuguang Yao, Lifu Huang • 2025

Related benchmarks

Task                            Dataset         Metric         Result  Rank
Compositional Image Generation  GenEval         Overall Score  0.8     84
Human Preference Alignment      PickScore       PickScore      86.851  20
Text Rendering                  Text Rendering  OCR Score      84.128  4