Bringing Your Portrait to 3D Presence
About
We present a unified framework for reconstructing animatable 3D human avatars from a single portrait, covering head, half-body, and full-body inputs. Our method tackles three bottlenecks: pose- and framing-sensitive feature representations, the scarcity of scalable training data, and unreliable proxy-mesh estimation. We introduce a Dual-UV representation that maps image features to a canonical UV space via Core-UV and Shell-UV branches, eliminating pose- and framing-induced token shifts. We also build a factorized synthetic data manifold that combines 2D generative diversity with geometry-consistent 3D renderings, supported by a training scheme that improves realism and identity consistency. A robust proxy-mesh tracker maintains stability under partial visibility. Together, these components enable strong in-the-wild generalization: trained only on half-body synthetic data, our model achieves state-of-the-art head and upper-body reconstruction and competitive full-body results. Extensive experiments and analyses further validate the effectiveness of our approach.
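The core intuition behind the Dual-UV representation is that image features are re-indexed by the proxy mesh's canonical UV coordinates rather than by image position, so the resulting tokens do not shift with pose or camera framing. The sketch below illustrates one plausible realization of that mapping; the function name `features_to_uv`, the tensor shapes, the scatter-based splatting, and the Core/Shell branch layout are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch, assuming: image features are splatted into a canonical UV
# grid via proxy-mesh vertex correspondences. Shapes and splatting scheme
# are illustrative, not the paper's implementation.
import torch
import torch.nn.functional as F


def features_to_uv(img_feat, verts_2d, verts_uv, uv_res=64):
    """Splat image features onto a canonical UV feature map.

    img_feat : (C, H, W)  backbone feature map of the portrait
    verts_2d : (V, 2)     proxy-mesh vertices projected to image space, in [-1, 1]
    verts_uv : (V, 2)     canonical UV coordinates of the same vertices, in [0, 1]
    """
    C = img_feat.shape[0]
    # Sample a feature vector at each projected proxy-mesh vertex.
    grid = verts_2d.view(1, -1, 1, 2)                       # (1, V, 1, 2)
    feat = F.grid_sample(img_feat[None], grid, align_corners=True)
    feat = feat.view(C, -1)                                  # (C, V)

    # Scatter vertex features into the UV grid, averaging collisions.
    uv_idx = (verts_uv.clamp(0, 1) * (uv_res - 1)).long()    # (V, 2)
    flat = uv_idx[:, 1] * uv_res + uv_idx[:, 0]              # (V,)
    uv_map = torch.zeros(C, uv_res * uv_res)
    count = torch.zeros(uv_res * uv_res)
    uv_map.index_add_(1, flat, feat)
    count.index_add_(0, flat, torch.ones(flat.shape[0]))
    uv_map = uv_map / count.clamp(min=1)
    return uv_map.view(C, uv_res, uv_res)                    # canonical UV tokens


# Both branches could share this mechanism with different UV layouts:
# a "Core-UV" map for the body surface and a "Shell-UV" map for an outer
# shell (e.g. hair/clothing) -- this split is an assumption for illustration.
img_feat = torch.randn(32, 64, 64)
verts_2d = torch.rand(500, 2) * 2 - 1
verts_uv = torch.rand(500, 2)
core_uv = features_to_uv(img_feat, verts_2d, verts_uv)
print(core_uv.shape)  # torch.Size([32, 64, 64])
```

Because the output grid is indexed by UV coordinates, the same body part always lands at the same token location regardless of how the subject is posed or cropped in the input image.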
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Human Reenactment | Upper-Wild (test) | PSNR | 21.09 | 4 |
| Human Reenactment | Upper-Wan (test) | PSNR | 20.38 | 4 |
| Human Reenactment | Full-Body (test) | PSNR | 24.53 | 3 |
| Human Reenactment | Head (test) | PSNR | 19.04 | 3 |