SinSR: Diffusion-Based Image Super-Resolution in a Single Step

About

While super-resolution (SR) methods based on diffusion models exhibit promising results, their practical application is hindered by the substantial number of required inference steps. Recent methods utilize degraded images in the initial state, thereby shortening the Markov chain. Nevertheless, these solutions either rely on a precise formulation of the degradation process or still necessitate a relatively lengthy generation path (e.g., 15 iterations). To enhance inference speed, we propose a simple yet effective method for achieving single-step SR generation, named SinSR. Specifically, we first derive a deterministic sampling process from the most recent state-of-the-art (SOTA) method for accelerating diffusion-based SR. This allows the mapping between the input random noise and the generated high-resolution image to be obtained in a reduced and acceptable number of inference steps during training. We show that this deterministic mapping can be distilled into a student model that performs SR within only one inference step. Additionally, we propose a novel consistency-preserving loss to simultaneously leverage the ground-truth image during the distillation process, ensuring that the performance of the student model is not solely bound by the feature manifold of the teacher model, resulting in further performance improvement. Extensive experiments conducted on synthetic and real-world datasets demonstrate that the proposed method can achieve performance comparable or even superior to both previous SOTA methods and the teacher model in just one sampling step, resulting in an inference speedup of up to 10×. Our code will be released at https://github.com/wyf0912/SinSR

Yufei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C. Kot, Bihan Wen • 2023
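
The training idea in the abstract (a deterministic multi-step teacher distilled into a one-step student, with an extra term that uses the ground-truth image) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' code: the function names, the linear "teacher" and "student", the 15-step default, and the plain mean-squared-error ground-truth term, which only stands in for the paper's consistency-preserving loss.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def teacher_sample(noise, lr, steps=15):
    """Hypothetical deterministic multi-step teacher.

    Each step moves the state halfway toward the low-resolution signal,
    so the same (noise, lr) pair always maps to the same output.
    """
    x = list(noise)
    for _ in range(steps):
        x = [xi + 0.5 * (li - xi) for xi, li in zip(x, lr)]
    return x


def student_sample(noise, lr, weight):
    """Hypothetical one-step student with a single scalar parameter."""
    return [weight * li + (1.0 - weight) * ni for ni, li in zip(noise, lr)]


def training_loss(noise, lr, hr, weight, gt_weight=1.0):
    """Distillation term plus a ground-truth term (assumed weighting).

    The distillation term pulls the student toward the teacher's output;
    the ground-truth term pulls it toward the high-resolution target, so
    the student is not limited to reproducing the teacher.
    """
    student_out = student_sample(noise, lr, weight)
    teacher_out = teacher_sample(noise, lr)
    distill = mse(student_out, teacher_out)
    ground_truth = mse(student_out, hr)
    return distill + gt_weight * ground_truth


if __name__ == "__main__":
    noise = [1.0, -2.0, 0.5, 3.0]    # toy "random noise" input
    lr = [0.2, 0.4, 0.6, 0.8]        # toy low-resolution signal
    hr = [0.25, 0.35, 0.65, 0.75]    # toy ground-truth signal
    for w in (0.0, 0.5, 1.0):
        print(w, training_loss(noise, lr, hr, w))
```

In this toy setup the student needs one function evaluation where the teacher needs 15, which mirrors the source of the speedup described above. A real implementation would replace both functions with neural networks and minimize the loss by gradient descent.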

Related benchmarks

Task	Dataset	Result
Object Detection	COCO 2017 (val)	--	2843
Instance Segmentation	COCO 2017 (val)	APm 0.12	1275
Semantic Segmentation	ADE20K	mIoU 19.6	1028
Super-Resolution	Set5	PSNR 23.91	821
Image Super-Resolution	RealSR	PSNR 26.28	190
Image Super-Resolution	DIV2K (val)	LPIPS 0.3164	189
Image Super-Resolution	DRealSR	MUSIQ 55.64	149
Super-Resolution	DIV2K	PSNR 24.29	145
Super-Resolution	RealSR (test)	PSNR 26.32	92
Super-Resolution	ImageNet (test)	LPIPS 0.218	70

Showing 10 of 78 rows
