
Training-Free Reward-Guided Image Editing via Trajectory Optimal Control

About

Recent advances in diffusion and flow-matching models have demonstrated remarkable capabilities in high-fidelity image synthesis. A prominent line of research applies reward-guided guidance, steering the generation process at inference time to align with specific objectives. However, extending this reward-guided approach to image editing, which must preserve the semantic content of the source image while enhancing a target reward, remains largely unexplored. In this work, we introduce a novel framework for training-free, reward-guided image editing. We formulate editing as a trajectory optimal control problem: the reverse process of a diffusion model is treated as a controllable trajectory originating from the source image, and adjoint states are iteratively updated to steer the editing process. Through extensive experiments across distinct editing tasks, we demonstrate that our approach significantly outperforms existing inversion-based training-free guidance baselines, achieving a superior balance between reward maximization and fidelity to the source image without reward hacking.
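The trajectory-optimal-control idea in the abstract can be illustrated on a toy problem. The sketch below is not the paper's implementation; it uses a hypothetical linear "denoising" step x_{k+1} = a·x_k + u_k in place of a real diffusion reverse process, a toy quadratic reward in place of a learned reward model, and a quadratic control penalty standing in for fidelity to the source trajectory. It shows only the core mechanic: roll the controlled trajectory forward, propagate adjoint states backward from the terminal reward gradient, and iteratively update the controls.

```python
import numpy as np

# Toy sketch (not the paper's method): the reverse process is modeled as a
# controllable linear trajectory x_{k+1} = a * x_k + u_k starting from a
# "source" latent x_src. We maximize
#     J = reward(x_T) - (lam / 2) * sum_k ||u_k||^2,
# where the control penalty loosely plays the role of source fidelity.

def rollout(x_src, controls, a=0.9):
    """Roll the controlled trajectory forward from the source latent."""
    xs = [x_src]
    for u in controls:
        xs.append(a * xs[-1] + u)
    return xs

def reward(x, target):
    """Toy reward: negative squared distance to a target latent."""
    return -0.5 * np.sum((x - target) ** 2)

def adjoint_update(x_src, controls, target, a=0.9, lam=0.1, lr=0.2):
    """One iteration: forward rollout, backward adjoint pass, control update."""
    xs = rollout(x_src, controls, a)
    adj = -(xs[-1] - target)          # terminal adjoint = dReward/dx_T
    grads = []
    for k in reversed(range(len(controls))):
        grads.append(adj - lam * controls[k])  # dJ/du_k
        adj = a * adj                          # backward adjoint recursion
    grads.reverse()
    return [u + lr * g for u, g in zip(controls, grads)]  # gradient ascent
```

Repeatedly calling `adjoint_update` drives the terminal state toward high reward while the penalty keeps the controls, and hence the trajectory, small. In the paper this role is played by a diffusion reverse process and a learned reward, with the adjoint pass supplying the control updates.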

Jinho Chang, Jaemin Kim, Jong Chul Ye • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Counterfactual Generation | ImageNet-1K | LogitTgt | 23.372 | 6
Counterfactual Generation | StableDiffusion 3 Evaluation Set | LogitTgt | 24.572 | 6
Human Preference | StableDiffusion 3 Evaluation Set | ImageReward | 1.8529 | 6
Human Preference Optimization | REFL (300 random images) | ImageReward | 1.8914 | 6
Style Transfer | Pick-a-Pic (test) | ‖ΔG‖_F | 5.0185 | 6
Text-Guided Image Editing | CelebA-HQ | CLIPScore | 0.3441 | 6
Text-Guided Image Editing | StableDiffusion 3 Evaluation Set | CLIPScore | 0.3491 | 6
Style Transfer | StableDiffusion 3 Evaluation Set | ‖ΔG‖_F | 4.5333 | 6
Text-Guided Image Editing | User Study (50 images) | Reward Alignment | 3.67 | 5
