Realistic One-shot Mesh-based Head Avatars

About

We present ROME, a system for creating realistic one-shot mesh-based human head avatars. Using a single photograph, our model estimates a person-specific head mesh and an associated neural texture, which encodes both local photometric and geometric details. The resulting avatars are rigged and can be rendered using a neural network, which is trained alongside the mesh and texture estimators on a dataset of in-the-wild videos. In our experiments, the system performs competitively in both head geometry recovery and render quality, especially for cross-person reenactment. See results at https://samsunglabs.github.io/rome/

Taras Khakhulin, Vanessa Sklyarova, Victor Lempitsky, Egor Zakharov • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Self-Reenactment | HDTF | PSNR 20.51 | 29 |
| Self-Reenactment | VFHQ (test) | PSNR 19.96 | 23 |
| Cross-identity reenactment | VFHQ (test) | CSIM 0.53 | 23 |
| Cross-Reenactment | HDTF | CSIM 72.6 | 15 |
| Video-driven Talking Head Generation (Self-Reenactment) | HDTF | FID 76.44 | 12 |
| 3D Portrait Animation (Cross Reenactment) | VFHQ 1.0 (test) | CSIM 49.5 | 11 |
| Self-Reenactment | CelebV-HQ 69 (inference) | PSNR 30.74 | 7 |
| Video-driven Talking Head Generation (Self-Reenactment) | NeRSemble Mono | PSNR 31.07 | 7 |
| Cross-Reenactment | CelebV-HQ 69 (inference) | FID 78.02 | 7 |
| Video-driven Talking Head Generation (Cross-Reenactment) | HDTF | FID 79.31 | 7 |
Showing 10 of 20 rows
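
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how PSNR and a CSIM-style score are typically computed. It is not taken from the ROME code base. PSNR follows the standard definition, 10 * log10(MAX^2 / MSE). CSIM is commonly computed as the cosine similarity between identity embeddings produced by a face recognition network; this page does not name that network, so the sketch assumes the embeddings are already available as plain vectors. The function names `psnr` and `cosine_similarity` are illustrative.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], rendered: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened images.

    Assumes pixel values lie in [0, max_value].
    """
    if len(reference) == 0 or len(reference) != len(rendered):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - p) ** 2 for r, p in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors.

    For CSIM, `a` and `b` would be identity embeddings of the source and
    generated faces (assumption: produced by some face recognition model).
    """
    if len(a) == 0 or len(a) != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)
```

Higher is better for both PSNR and CSIM, while lower is better for FID. The CSIM rows above appear to use different scales (0.53 versus 72.6 and 49.5), so values should only be compared within the same benchmark row.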
