ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration

About

While recent works on blind face image restoration have successfully produced impressive high-quality (HQ) images with abundant details from low-quality (LQ) input images, the generated content may not accurately reflect the real appearance of a person. To address this problem, incorporating well-shot personal images as additional reference inputs could be a promising strategy. Inspired by the recent success of the Latent Diffusion Model (LDM), we propose ReF-LDM, an adaptation of LDM designed to generate HQ face images conditioned on one LQ image and multiple HQ reference images. Our model integrates an effective and efficient mechanism, CacheKV, to leverage the reference images during the generation process. Additionally, we design a timestep-scaled identity loss, enabling our LDM-based model to focus on learning the discriminating features of human faces. Lastly, we construct FFHQ-Ref, a dataset consisting of 20,405 HQ face images with corresponding reference images, which can serve as both training and evaluation data for reference-based face restoration models.

Chi-Wei Hsiao, Yu-Lun Liu, Cheng-Kun Yang, Sheng-Po Kuo, Kevin Jou, Chia-Ping Chen • 2024
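
The abstract names CacheKV but does not describe how it works. The name suggests that attention keys and values computed from the reference images are stored once and reused during generation. The sketch below illustrates only that general idea, under that assumption, and is not the authors' implementation: `ReferenceKVCache`, `attend`, and `attend_with_cache` are hypothetical names, and the tiny vectors stand in for real feature tokens.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


class ReferenceKVCache:
    """Hypothetical cache: keys/values computed once from reference tokens."""

    def __init__(self):
        self.keys = []
        self.values = []

    def add(self, keys, values):
        # Assumption: one call per reference image, made before generation.
        self.keys.extend(keys)
        self.values.extend(values)


def attend_with_cache(queries, keys, values, cache):
    """Attend over the current tokens plus all cached reference tokens."""
    return attend(queries, keys + cache.keys, values + cache.values)
```

With an empty cache, `attend_with_cache` reduces to plain attention over the current tokens. Once reference keys and values are added, each query can also draw on them without the references being re-encoded at every denoising step, which is the efficiency argument such a cache would rest on.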

Related benchmarks

Task	Dataset	Result
Face Restoration	FFHQ-Ref Severe	Ref-ArcFace 0.595	11
Video Face Restoration	VFHQ milder degradation settings (test)	PSNR 26.71	11
Face Video Restoration	VFHQ heavy degradation (test)	PSNR 23.113	11
Face Restoration	Real-world face restoration	NIQE 4.31	9
Face Restoration	Same-age synthetic (test)	PSNR 24.8	9
Face Restoration	Cross-age synthetic (test)	PSNR 24.73	9
Face Restoration	Real-world 1.0 (test)	MUSIQ Score 68.04	8
Cross-Age Face Restoration	Cross-Age Data (test)	PSNR 24.58	8
Same-Age Face Restoration	Same-Age Data (test)	PSNR 24.8	8
Face Restoration	Cross-age Face Restoration Evaluation Set (inference)	Inference Time (s) 1.79	7

Showing 10 of 19 rows
