RelightVid: Temporal-Consistent Diffusion Model for Video Relighting

About

Diffusion models have demonstrated remarkable success in image generation and editing, with recent advancements enabling albedo-preserving image relighting. However, applying these models to video relighting remains challenging due to the lack of paired video relighting datasets and the high demands for output fidelity and temporal consistency, further complicated by the inherent randomness of diffusion models. To address these challenges, we introduce RelightVid, a flexible framework for video relighting that can accept background video, text prompts, or environment maps as relighting conditions. Trained on in-the-wild videos with carefully designed illumination augmentations and rendered videos under extreme dynamic lighting, RelightVid achieves arbitrary video relighting with high temporal consistency without intrinsic decomposition while preserving the illumination priors of its image backbone.

Ye Fang, Zeyi Sun, Shangzhan Zhang, Tong Wu, Yinghao Xu, Pan Zhang, Jiaqi Wang, Gordon Wetzstein, Dahua Lin • 2025

Related benchmarks

Task	Dataset	Result
Delight	Pixel Cube Subject 2	PSNR 20.5	8
Video Relighting and Recoloring	DAVIS (val)	SC 0.2322	6
Video Harmonization	Curated Portrait Video Dataset	PSNR 15.7	5
Foreground Video Relighting	Background image-conditioned foreground video relighting dataset (test)	Aesthetic Score 0.635	5
Relight	Pixel Cube Subject 3	PSNR 17.6	4
Delight	Pixel Cube Subject 1	PSNR 7.18	4
Delight	Pixel Cube Subject 3	PSNR 8	4
Delight	Pixel Cube Subject 4	PSNR 6.22	4
Relight	Pixel Cube Subject 1	PSNR 10.08	4
Relight	Pixel Cube Subject 4	PSNR 10.94	4
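
Most results above are reported as PSNR (peak signal-to-noise ratio, in dB; higher is better). As a reading aid, here is a minimal sketch of the standard PSNR definition, 10 * log10(peak^2 / MSE). It assumes pixel values normalized to [0, 1] and supplied as flat sequences; the function name `psnr` and the `peak` parameter are illustrative, and the exact evaluation protocol behind these benchmark numbers is not stated on this page.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Return PSNR in dB between two equally sized sequences of pixel values.

    Assumption: values lie in [0, peak]; use peak=255 for 8-bit images.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [0.0, 0.5, 1.0, 0.25]
    prediction = [0.1, 0.4, 0.9, 0.35]
    print(f"PSNR: {psnr(ground_truth, prediction):.2f} dB")
```

Because the scale is logarithmic, a 10 dB gap corresponds to a tenfold difference in mean squared error.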

