CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
About
Speech-driven 3D facial animation has been widely studied, yet a gap to realism and vividness remains due to the highly ill-posed nature of the task and the scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping as a regression task, which suffers from the regression-to-mean problem and leads to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of a learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and is thus embedded with realistic facial motion priors. Over this discrete motion space, a temporal autoregressive model sequentially synthesizes facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. A user study further confirms the superiority of our method in perceptual quality.
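To illustrate the core "code query" idea, here is a minimal NumPy sketch of quantizing continuous per-frame motion features to their nearest entries in a discrete codebook. All names, shapes, and values are illustrative assumptions, not taken from the paper's actual code; the real model learns the codebook via self-reconstruction (VQ-style) and predicts code indices autoregressively from speech.

```python
import numpy as np

def quantize(features, codebook):
    """Map each continuous motion feature to its nearest codebook entry.

    features: (T, D) array of per-frame motion features (hypothetical shapes).
    codebook: (K, D) array of discrete motion codes.
    Returns the per-frame code indices and the quantized features.
    """
    # Squared Euclidean distance between every frame and every code: (T, K).
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # The "code query": each frame picks its nearest discrete code.
    idx = dists.argmin(axis=1)
    return idx, codebook[idx]

# Toy example: 4 frames of 2-D features, a 3-entry codebook.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
features = np.array([[0.1, -0.1], [0.9, 0.2], [0.2, 0.8], [0.1, 0.1]])
indices, quantized = quantize(features, codebook)
print(indices.tolist())  # → [0, 1, 2, 0]
```

Restricting outputs to this finite set of codes is what counteracts the regression-to-mean problem: the decoder can only emit motions that lie on the learned prior, never an averaged in-between.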
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D talking head generation | DualTalk (test) | FD (Expression): 48.57 | 34 |
| Co-speech 3D Gesture Synthesis | BEAT2 (test) | -- | 27 |
| 3D talking head generation | DualTalk OOD set | FD (EXP): 50.05 | 26 |
| 3D Talking Face Generation | BIWI A (test) | LVE: 4.7914 | 16 |
| Speech-driven gesture generation | BEAT-X | -- | 11 |
| Speech-Driven Facial Animation | BIWI B (test) | Lip Sync: 92.47 | 10 |
| Speech-Driven Facial Animation | VOCA (test) | Lip Sync: 95.7 | 10 |
| 3D talking head animation | VOCASET (test) | LVE (x10^-5 mm): 3.9445 | 10 |
| Talking head synthesis | Conver-3D YouTube (test) | FDD: 17.72 | 9 |
| Speech-Driven Facial Animation | Hybrid Audio Vocaset, LJSpeech, and FaceTalk 1.0 (test) | LSE-D: 11.8054 | 8 |