LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control

About

Portrait Animation aims to synthesize a lifelike video from a single source image, using it as an appearance reference, with motion (i.e., facial expressions and head pose) derived from a driving video, audio, text, or generation. Instead of following mainstream diffusion-based methods, we explore and extend the potential of the implicit-keypoint-based framework, which effectively balances computational efficiency and controllability. Building upon this, we develop a video-driven portrait animation framework named LivePortrait with a focus on better generalization, controllability, and efficiency for practical usage. To enhance the generation quality and generalization ability, we scale up the training data to about 69 million high-quality frames, adopt a mixed image-video training strategy, upgrade the network architecture, and design better motion transformation and optimization objectives. Additionally, we discover that compact implicit keypoints can effectively represent a kind of blendshapes and meticulously propose a stitching and two retargeting modules, which utilize a small MLP with negligible computational overhead, to enhance the controllability. Experimental results demonstrate the efficacy of our framework even compared to diffusion-based methods. The generation speed remarkably reaches 12.8ms on an RTX 4090 GPU with PyTorch. The inference code and models are available at https://github.com/KwaiVGI/LivePortrait
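The abstract describes the stitching and retargeting modules only as small MLPs operating on compact implicit keypoints. The following is a minimal sketch of that idea, not the released implementation: it assumes the module concatenates source and driving keypoints, predicts a per-keypoint offset, and adds that offset to the driving keypoints. The keypoint count (21), hidden width (32), layer count, and the names `stitch`, `make_layer`, and `linear` are all hypothetical choices made for illustration, and the weights are untrained random placeholders.

```python
import random

NUM_KEYPOINTS = 21  # assumed keypoint count, for illustration only
HIDDEN = 32         # hypothetical hidden width


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (untrained placeholder)."""
    w = [[rng.uniform(-0.01, 0.01) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return w, b


def linear(layer, x):
    """Apply one fully connected layer to the vector x."""
    w, b = layer
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(x):
    return [max(0.0, v) for v in x]


def stitch(layers, source_kp, driving_kp):
    """Predict a per-keypoint offset and add it to the driving keypoints."""
    # Flatten both (K, 3) keypoint sets into one input vector of length 2*K*3.
    x = [c for kp in source_kp + driving_kp for c in kp]
    for layer in layers[:-1]:
        x = relu(linear(layer, x))
    offset = linear(layers[-1], x)  # length K*3, no activation on the output
    return [
        [c + offset[3 * i + j] for j, c in enumerate(kp)]
        for i, kp in enumerate(driving_kp)
    ]


rng = random.Random(0)
in_dim = 2 * NUM_KEYPOINTS * 3
layers = [
    make_layer(in_dim, HIDDEN, rng),
    make_layer(HIDDEN, HIDDEN, rng),
    make_layer(HIDDEN, NUM_KEYPOINTS * 3, rng),
]

# Random stand-ins for keypoints that a real model would extract from images.
source_kp = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(NUM_KEYPOINTS)]
driving_kp = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(NUM_KEYPOINTS)]

stitched_kp = stitch(layers, source_kp, driving_kp)
param_count = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
print(len(stitched_kp), param_count)
```

Under these assumed sizes the whole module has only a few thousand parameters, which is consistent with the abstract's claim of negligible computational overhead; the actual architecture and sizes are defined in the released code.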

Jianzhu Guo, Dingyun Zhang, Xiaoqiang Liu, Zhizhou Zhong, Yuan Zhang, Pengfei Wan, Di Zhang • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Portrait Animation (Self-reenactment) | VFHQ (test) | FVD 192 | 23 |
| Self-reenactment portrait animation | MEAD 59 (test) | CSIM 0.9379 | 18 |
| Portrait Animation (Cross-reenactment) | FFHQ source + VFHQ driving (test) | CSIM 0.6595 | 18 |
| Facial Expression Editing | MetaHuman-based benchmark Replacement Mode (test) | PSNR 27.7296 | 10 |
| Cross-Reenactment | VOODOO-XP (test) | MEt3R 0.033 | 10 |
| Facial Expression Editing | MetaHuman-based Enhancement Mode (test) | PSNR 28.0103 | 10 |
| Self-Reenactment | VOODOO-XP (test) | MEt3R 0.032 | 10 |
| Self-Reenactment | HDTF (test) | LPIPS 0.1817 | 8 |
| Head Swapping | collected 30 pairs (test) | CSIM 0.381 | 8 |
| Cross-Reenactment | TalkingHead-1KH and LV100 (test) | ID-SIM 0.723 | 7 |
Showing 10 of 20 rows
