Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis

About

We present a novel one-shot talking head synthesis method that achieves disentangled and fine-grained control over lip motion, eye gaze and blink, head pose, and emotional expression. We represent different motions via disentangled latent representations and leverage an image generator to synthesize talking heads from them. To effectively disentangle each motion factor, we propose a progressive disentangled representation learning strategy that separates the factors in a coarse-to-fine manner: we first extract a unified motion feature from the driving signal, and then isolate each fine-grained motion from the unified feature. We introduce motion-specific contrastive learning and regression for non-emotional motions, and feature-level decorrelation and self-reconstruction for emotional expression, to fully utilize the inherent properties of each motion factor in unstructured video data to achieve disentanglement. Experiments show that our method provides high-quality speech and lip-motion synchronization along with precise and disentangled control over multiple extra facial motions, which can hardly be achieved by previous methods.
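The abstract names two loss families, contrastive learning and feature-level decorrelation, without giving their exact form. As a rough illustration only, the sketch below shows a generic InfoNCE contrastive loss and a squared cross-covariance penalty in plain Python. Both formulations, and the function names `info_nce` and `decorrelation_penalty`, are assumptions chosen for illustration; they are not taken from the paper and may differ from the authors' actual objectives.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (assumed form, not the paper's):
    -log( exp(s_pos / t) / sum_k exp(s_k / t) )."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]

def decorrelation_penalty(feats_a, feats_b):
    """Mean squared cross-covariance between two feature batches.
    Hypothetical stand-in for 'feature-level decorrelation':
    it is zero when the two feature sets are uncorrelated."""
    n = len(feats_a)
    mean_a = [sum(col) / n for col in zip(*feats_a)]
    mean_b = [sum(col) / n for col in zip(*feats_b)]
    total = 0.0
    for i in range(len(mean_a)):
        for j in range(len(mean_b)):
            cov = sum(
                (fa[i] - mean_a[i]) * (fb[j] - mean_b[j])
                for fa, fb in zip(feats_a, feats_b)
            ) / n
            total += cov * cov
    return total / (len(mean_a) * len(mean_b))
```

In this sketch, a motion feature (for example, lip motion) would be pulled toward a matching positive sample and pushed away from negatives by `info_nce`, while `decorrelation_penalty` would discourage an emotion feature from carrying information already present in the other motion features.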

Duomin Wang, Yu Deng, Zixin Yin, Heung-Yeung Shum, Baoyuan Wang • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Portrait Animation (Self-reenactment) | VFHQ (test) | FVD | 628.5 | 23 |
| Talking head synthesis | User Study | Lip Sync Quality | 4.44 | 18 |
| Portrait Animation (Cross-reenactment) | FFHQ source + VFHQ driving (test) | CSIM | 0.2181 | 18 |
| Self-reenactment portrait animation | MEAD 59 (test) | CSIM | 0.3256 | 18 |
| Facial Reenactment | VFHQ Self-Reenactment | MSE | 0.047 | 8 |
| Facial Reenactment | VFHQ Cross-Reenactment | ID-SIM | 0.756 | 8 |
| Audio-driven talking head synthesis | VoxCeleb2 13 (test) | LSE-C | 7.26 | 7 |
| Audio-driven talking head synthesis | Mead 60 (test) | LSE-C | 7.24 | 7 |
| Facial Component Control | MEAD 53 (test) | Pose Error | 33.081 | 5 |
| Expression Control Accuracy | VoxCeleb2 | MSE | 0.156 | 4 |
Showing 10 of 12 rows
