One-Step Image Translation with Text-to-Image Models
About
In this work, we address two limitations of existing conditional diffusion models: their slow inference speed due to the iterative denoising process and their reliance on paired data for model fine-tuning. To tackle these issues, we introduce a general method for adapting a single-step diffusion model to new tasks and domains through adversarial learning objectives. Specifically, we consolidate various modules of the vanilla latent diffusion model into a single end-to-end generator network with a small number of trainable weights, enhancing its ability to preserve the input image structure while reducing overfitting. We demonstrate that, in unpaired settings, our model CycleGAN-Turbo outperforms existing GAN-based and diffusion-based methods on various scene translation tasks, such as day-to-night conversion and adding/removing weather effects like fog, snow, and rain. We extend our method to paired settings, where our model pix2pix-Turbo is on par with recent works like ControlNet for Sketch2Photo and Edge2Image, but with single-step inference. This work suggests that single-step diffusion models can serve as strong backbones for a range of GAN learning objectives. Our code and models are available at https://github.com/GaParmar/img2img-turbo.
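The "small number of trainable weights" idea above can be sketched with low-rank adapters added to a frozen pretrained weight matrix (a LoRA-style scheme). This is a toy illustration only; the matrix names, shapes, and initialization below are assumptions for exposition, not the repo's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4  # full hidden dimension vs. low adapter rank (illustrative sizes)

W_frozen = rng.standard_normal((d, d))  # pretrained weight, kept fixed
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection, zero at init

def adapted_forward(x):
    # Output = frozen path + low-rank update; only A and B receive gradients.
    # With B initialized to zero, the adapted model starts identical to the
    # pretrained one, which helps preserve the input image structure.
    return W_frozen @ x + B @ (A @ x)

# The trainable parameters are a small fraction of the frozen ones:
frozen_params = W_frozen.size      # d * d = 4096
trainable_params = A.size + B.size  # 2 * r * d = 512
```

At initialization `adapted_forward` reproduces the frozen network exactly, so fine-tuning only perturbs it through the 512 adapter weights rather than retraining all 4096.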
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Explainability Stability Analysis | Instruction-based Image Editing Stability Evaluation (10 prompts, 30 perturbations) | Jaccard Index: 85 | 6 |
| Instruction-based Image Editing Consistency | "Transform the weather to make it snowing" prompt, 1000 iterations (30 perturbations) | Variance: 1.00e-4 | 3 |
| Fidelity Analysis | gSMILE Fidelity Analysis, prompt: "Transform the weather to make it snowing" (test) | WMSE: 0.0193 | 3 |