Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization

About

The task of lip synchronization (lip-sync) seeks to match the lips of human faces with different audio. It has various applications in the film industry as well as for creating virtual avatars and for video conferencing. This is a challenging problem, as one needs to simultaneously introduce detailed, realistic lip movements while preserving the identity, pose, emotions, and image quality. Many previous methods that try to solve this problem suffer from image quality degradation due to a lack of complete contextual information. In this paper, we present Diff2Lip, an audio-conditioned diffusion-based model that is able to do lip synchronization in the wild while preserving these qualities. We train our model on VoxCeleb2, a video dataset containing in-the-wild talking face videos. Extensive studies show that our method outperforms popular methods like Wav2Lip and PC-AVS on the Fréchet inception distance (FID) metric and on user Mean Opinion Scores (MOS). We show results in both reconstruction (same audio-video inputs) and cross (different audio-video inputs) settings on the VoxCeleb2 and LRW datasets. Video results and code can be accessed from our project page (https://soumik-kanad.github.io/diff2lip).

Soumik Mukhopadhyay, Saksham Suri, Ravi Teja Gadde, Abhinav Shrivastava • 2023

Related benchmarks

Task	Dataset	Result
Talking Head Generation	HDTF	FID 13.45	48
Visual Dubbing	ContextDubBench 1.0 (test)	FID 17.126	18
Talking Head Generation	TalkVid self-driven 30 clips (held-out)	FVD 178.6	16
Talking Head Generation	CelebV-HQ	FID 16.98	15
Lip synchronization	HDTF 52 (test)	Sync-C 8.35	12
Visual Dubbing	HDTF (test)	PSNR 28.716	9
Visual Dubbing	User Study	Realism 2.63	9
Audio-driven Lip Synchronization	HDTF (Cross-identity)	Sync-C Score 8.3	8
Lip-audio synchronization	HDTF, CelebV-HQ, and CelebV-Text	FPS 19.77	8
Cross-Audio Talking Head Generation	HDTF, CelebV-HQ, and CelebV-Text 100 cross-audio pairs	FID 9.55	8

Showing 10 of 20 rows
