FNeVR: Neural Volume Rendering for Face Animation
About
Face animation, one of the hottest topics in computer vision, has achieved promising performance with the help of generative models. However, generating identity-preserving and photo-realistic images remains a critical challenge because of the sophisticated motion deformation and complex facial detail modeling involved. To address these problems, we propose a Face Neural Volume Rendering (FNeVR) network that explores the potential of 2D motion warping and 3D volume rendering in a unified framework. In FNeVR, we design a 3D Face Volume Rendering (FVR) module to enhance the facial details for image rendering. Specifically, we first extract 3D information with a well-designed architecture, and then introduce an orthogonal adaptive ray-sampling module for efficient rendering. We also design a lightweight pose editor, enabling FNeVR to edit the facial pose in a simple yet effective way. Extensive experiments show that FNeVR obtains the best overall quality and performance on widely used talking-head benchmarks.
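To make the volume rendering idea concrete, the following is a minimal sketch of standard front-to-back alpha compositing along parallel (orthographic) rays. It is not the FVR module itself: the function names `composite_ray` and `render_orthographic`, the nested-list volume layout `[depth][height][width]`, the single-channel color, and the fixed `step` size are all assumptions made for illustration, and the adaptive part of the ray sampling described above is not modeled.

```python
import math


def composite_ray(densities, colors, step=1.0):
    """Alpha-composite samples along one ray, ordered front to back.

    densities: non-negative volume densities (sigma) at each sample.
    colors:    scalar color (e.g. one channel) at each sample.
    step:      assumed constant distance between samples.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    pixel = 0.0
    for sigma, c in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)  # opacity of this sample
        pixel += transmittance * alpha * c
        transmittance *= 1.0 - alpha
    return pixel


def render_orthographic(density, color, step=1.0):
    """Render a [depth][height][width] volume with rays parallel to the depth axis.

    With orthographic rays, every pixel (y, x) reads the samples that lie
    directly behind it, so no per-ray direction has to be computed.
    """
    depth = len(density)
    height = len(density[0])
    width = len(density[0][0])
    return [
        [
            composite_ray(
                [density[d][y][x] for d in range(depth)],
                [color[d][y][x] for d in range(depth)],
                step,
            )
            for x in range(width)
        ]
        for y in range(height)
    ]
```

In this sketch an opaque sample near the front hides everything behind it, while an empty volume contributes nothing to the pixel. Because the rays are parallel, each pixel reduces to a one-dimensional sum over the depth axis, which is one plausible reason orthogonal sampling is cheaper than casting perspective rays.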
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Cross-identity face animation | VoxCeleb1 | ARD | 2.755 | 9 |
| Video self-reconstruction | VoxCeleb1 (test) | L1 Loss | 0.0404 | 9 |
| Same-identity reconstruction | VoxCeleb1 (test) | L1 Loss | 0.0404 | 7 |
| Cross-identity Face Reenactment | VoxCeleb (test) | FID | 98.23 | 4 |
| Cross-identity Face Reenactment | VoxCeleb2 (test) | FID | 133.9 | 4 |
| Facial Reenactment | VoxCeleb | FLOPs (G) | 130.1 | 4 |