Realistic One-shot Mesh-based Head Avatars

About

We present ROME, a system for creating realistic one-shot mesh-based human head avatars. Using a single photograph, our model estimates a person-specific head mesh and the associated neural texture, which encodes both local photometric and geometric details. The resulting avatars are rigged and can be rendered using a neural network, which is trained alongside the mesh and texture estimators on a dataset of in-the-wild videos. In the experiments, we observe that our system performs competitively in terms of both head geometry recovery and render quality, especially for cross-person reenactment. See results at https://samsunglabs.github.io/rome/

Taras Khakhulin, Vanessa Sklyarova, Victor Lempitsky, Egor Zakharov • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video-driven Talking Head Generation (Self-Reenactment) | HDTF | FID 76.44 | 12 |
| Self-Reenactment | CelebV-HQ 69 (inference) | PSNR 30.74 | 7 |
| Video-driven Talking Head Generation (Self-Reenactment) | NeRSemble Mono | PSNR 31.07 | 7 |
| Cross-Reenactment | CelebV-HQ 69 (inference) | FID 78.02 | 7 |
| Video-driven Talking Head Generation (Cross-Reenactment) | HDTF | FID 79.31 | 7 |
| Video-driven Talking Head Generation (Cross-Reenactment) | NeRSemble Mono | FID 119.1 | 7 |
| Avatar Synthesis | NeRSemble Single Image (test) | PSNR 15.78 | 5 |
| Cross-identity Face Reenactment | CelebA-HQ | CSIM 0.519 | 5 |
| Face Cross-reenactment | HDTF 1.0 (test) | CSIM 0.507 | 5 |
| Face Self-reenactment | HDTF 1.0 (test) | PSNR 18.46 | 5 |
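
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how two of them are commonly computed. It assumes PSNR is taken over pixel values normalized to [0, 1], and that CSIM denotes the cosine similarity between identity embeddings of the source and generated faces, as is typical in reenactment benchmarks. The function names and the toy inputs are illustrative assumptions, not the evaluation code behind these results; the embedding network itself is not shown.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], rendered: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between two flattened images.

    Assumes both inputs hold pixel values in [0, max_value].
    """
    if len(reference) != len(rendered) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - p) ** 2 for r, p in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors (hypothetical inputs)."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 4-pixel "images": every pixel is off by 0.1, so MSE is 0.01.
    print(round(psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1]), 2))
    # Toy identity embeddings pointing in the same direction.
    print(round(cosine_similarity([1.0, 2.0], [2.0, 4.0]), 2))
```

Higher is better for both PSNR and CSIM, whereas lower is better for FID, which compares feature distributions of real and generated images and is not sketched here.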