Generalizable One-shot Neural Head Avatar

About

We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image. Existing methods either involve time-consuming optimization for a specific person with multiple images, or they struggle to synthesize intricate appearance details beyond the facial region. To address these limitations, we propose a framework that not only generalizes to unseen identities based on a single-view image without requiring person-specific optimization, but also captures characteristic details within and beyond the face area (e.g. hairstyle, accessories, etc.). At the core of our method are three branches that produce three tri-planes representing the coarse 3D geometry, detailed appearance of a source image, as well as the expression of a target image. By applying volumetric rendering to the combination of the three tri-planes followed by a super-resolution module, our method yields a high fidelity image of the desired identity, expression and pose. Once trained, our model enables efficient 3D head avatar reconstruction and animation via a single forward pass through a network. Experiments show that the proposed approach generalizes well to unseen validation datasets, surpassing SOTA baseline methods by a large margin on head avatar reconstruction and animation.

Xueting Li, Shalini De Mello, Sifei Liu, Koki Nagano, Umar Iqbal, Jan Kautz• 2023

Related benchmarks

Task	Dataset	Result
Self-Reenactment	HDTF	PSNR21.31	35
Self-Reenactment	VFHQ (test)	PSNR20.15	23
Cross-identity reenactment	VFHQ (test)	CSIM0.518	23
Cross-Reenactment	HDTF	CSIM73.5	21
Self-Reenactment	HDTF 55 (test)	PSNR21.33	8
Cross-identity reenactment	HDTF 55 (test)	CSIM0.7471	8
3D talking head generation	100-frame sequence (test)	FPS4.91	7
3D Portrait Reconstruction	NeRSemble (test)	Expr Score0.266	5
3D-aware talking portrait generation	NeRSemble (novel views)	FID85.63	4

Showing 9 of 9 rows

Other info

Follow for update

@wizwand_team Discord