
Portrait4D: Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data

About

Existing one-shot 4D head synthesis methods usually learn from monocular videos with the aid of 3DMM reconstruction, yet the latter is equally challenging, which restricts them from faithful 4D head synthesis. We present a method to learn one-shot 4D head synthesis via large-scale synthetic data. The key is to first learn a part-wise 4D generative model from monocular images via adversarial learning, synthesizing multi-view images of diverse identities under full motions as training data; we then leverage a transformer-based animatable triplane reconstructor to learn 4D head reconstruction from the synthetic data. A novel learning strategy enhances generalizability to real images by disentangling the learning processes of 3D reconstruction and reenactment. Experiments demonstrate our superiority over the prior art.
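The two-stage idea above (synthesize paired training data with a generative model, then supervise a feed-forward reconstructor on those pairs) can be illustrated with a deliberately tiny linear stand-in. All dimensions, names, and the linear "generator"/"reconstructor" below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper).
ID_DIM, MOTION_DIM, IMG_DIM = 8, 4, 16

# Stage 1 stand-in: a fixed linear "generator" playing the role of the
# part-wise 4D generative model that renders images from identity + motion.
W_id = rng.standard_normal((IMG_DIM, ID_DIM))
W_mo = rng.standard_normal((IMG_DIM, MOTION_DIM))

def render(id_code, motion_code):
    """Mock renderer: image = f(identity, motion)."""
    return W_id @ id_code + W_mo @ motion_code

# Synthesize paired training data: a source view (neutral motion) and a
# target view of the same identity under a driving motion.
N = 2000
ids = rng.standard_normal((N, ID_DIM))
motions = rng.standard_normal((N, MOTION_DIM))
sources = np.stack([render(i, np.zeros(MOTION_DIM)) for i in ids])
targets = np.stack([render(i, m) for i, m in zip(ids, motions)])

# Stage 2 stand-in: fit a linear "reconstructor" that maps
# (source image, driving motion) -> reenacted target image, mirroring
# how the transformer reconstructor is supervised purely on synthetic pairs.
X = np.concatenate([sources, motions], axis=1)   # (N, IMG_DIM + MOTION_DIM)
W, *_ = np.linalg.lstsq(X, targets, rcond=None)  # least-squares fit

pred = X @ W
err = float(np.mean((pred - targets) ** 2))
print(f"mean reenactment error on synthetic data: {err:.2e}")
```

Because the toy target is exactly linear in the source image and motion code, the fitted reconstructor drives the error to (numerically) zero; the real method faces the much harder nonlinear version of the same supervision setup, which is why the disentangled reconstruction/reenactment training matters.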

Yu Deng, Duomin Wang, Xiaohang Ren, Xingyu Chen, Baoyuan Wang · 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Video-driven Talking Head Generation (Self-Reenactment) | HDTF | FID 36.57 | 12 |
| Cross-identity reenactment | HDTF 55 (test) | CSIM 0.7873 | 8 |
| Cross-identity reenactment | VFHQ (test) | CSIM 0.5806 | 8 |
| Self-Reenactment | HDTF 55 (test) | PSNR 20.03 | 8 |
| Self-Reenactment | VFHQ (test) | PSNR 16.94 | 8 |
| Cross-Reenactment | CelebV-HQ 69 (inference) | FID 57.13 | 7 |
| Video-driven Talking Head Generation (Cross-Reenactment) | HDTF | FID 42.82 | 7 |
| Video-driven Talking Head Generation (Cross-Reenactment) | NeRSemble Mono | FID 63.97 | 7 |
| 3D Portrait Animation (Cross Reenactment) | VFHQ 1.0 (test) | CSIM 59.6 | 7 |
| 3D Portrait Animation (Self Reenactment) | VFHQ 1.0 (test) | PSNR 20.35 | 7 |
Showing 10 of 15 rows
