
IDOL: Instant Photorealistic 3D Human Creation from a Single Image

About

Creating a high-fidelity, animatable 3D full-body avatar from a single image is a challenging task due to the diverse appearance and poses of humans and the limited availability of high-quality training data. To achieve fast and high-quality human reconstruction, this work rethinks the task from the perspectives of dataset, model, and representation. First, we introduce a large-scale HUman-centric GEnerated dataset, HuGe100K, consisting of 100K diverse, photorealistic sets of human images. Each set contains 24-view frames in specific human poses, generated using a pose-controllable image-to-multi-view model. Next, leveraging the diversity in views, poses, and appearances within HuGe100K, we develop a scalable feed-forward transformer model to predict a 3D human Gaussian representation in a uniform space from a given human image. This model is trained to disentangle human pose, body shape, clothing geometry, and texture. The estimated Gaussians can be animated without post-processing. We conduct comprehensive experiments to validate the effectiveness of the proposed dataset and method. Our model demonstrates the ability to efficiently reconstruct photorealistic humans at 1K resolution from a single input image using a single GPU instantly. Additionally, it seamlessly supports various applications, as well as shape and texture editing tasks. Project page: https://yiyuzhuang.github.io/IDOL/.

Yiyu Zhuang, Jiaxi Lv, Hao Wen, Qing Shuai, Ailing Zeng, Hao Zhu, Shifeng Chen, Yujiu Yang, Xun Cao, Wei Liu • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D human reconstruction | THuman 2.1 (test) | PSNR | 18.063 | 16 |
| 3D human reconstruction | 2K2K | PSNR | 27.18 | 9 |
| 3D human reconstruction | CustomHuman | PSNR | 31.02 | 9 |
| Human Motion and View Synthesis | HuGe100K (user study) | Identity & Appearance Preservation | 32.9 | 5 |
| Human Image Synthesis | HuGe100K in-the-wild (test) | User Preference Score | 15 | 5 |
| Novel View Synthesis | BEHAVE Novel View | PSNR | 17.38 | 5 |
| Static Reconstruction | YouTube Static Occluded | PSNR | 17.31 | 5 |
| Static Reconstruction | YouTube Static Canonical | PSNR | 19.78 | 5 |
| 3D human reconstruction | HuGe100K and THuman2.1 (test) | MSE | 0.008 | 4 |
| 3D Human Reconstruction User Study | HumanLRM 50 cases (test) | Face Perceptual Score | 52.28 | 4 |
Showing 10 of 22 rows
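
Several rows above report PSNR or MSE. As a reminder of how the two metrics relate, here is a minimal sketch of the standard definitions, assuming pixel values scaled to [0, 1]. The function names `mse` and `psnr` and the `max_value` parameter are illustrative only; the exact evaluation protocol behind each benchmark row (resolution, masking, color space) is not stated on this page and may differ.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred: Sequence[float], target: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE).

    max_value is the largest possible pixel value (assumed 1.0 here;
    it would be 255 for 8-bit images).
    """
    err = mse(pred, target)
    if err == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / err)


if __name__ == "__main__":
    # Toy data: a flat "rendered" image that is off by 0.1 everywhere.
    rendered = [0.6] * 16
    reference = [0.5] * 16
    print(f"MSE  = {mse(rendered, reference):.4f}")
    print(f"PSNR = {psnr(rendered, reference):.2f} dB")
```

Higher PSNR and lower MSE both indicate a rendering closer to the reference, so the two columns should be read in opposite directions when comparing rows.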
