LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
About
Portrait animation aims to synthesize a lifelike video from a single source image, using it as an appearance reference, with motion (i.e., facial expressions and head pose) derived from a driving video, audio, text, or generation. Instead of following mainstream diffusion-based methods, we explore and extend the potential of the implicit-keypoint-based framework, which effectively balances computational efficiency and controllability. Building upon this, we develop a video-driven portrait animation framework named LivePortrait, with a focus on better generalization, controllability, and efficiency for practical usage. To enhance generation quality and generalization ability, we scale up the training data to about 69 million high-quality frames, adopt a mixed image-video training strategy, upgrade the network architecture, and design better motion transformation and optimization objectives. Additionally, we find that compact implicit keypoints can effectively represent a kind of blendshapes, and we propose a stitching module and two retargeting modules, each using a small MLP with negligible computational overhead, to enhance controllability. Experimental results demonstrate the efficacy of our framework, even compared to diffusion-based methods. The generation speed reaches 12.8 ms on an RTX 4090 GPU with PyTorch. The inference code and models are available at https://github.com/KwaiVGI/LivePortrait
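As a rough illustration of how a small MLP can adjust implicit keypoints at negligible cost, the sketch below feeds the concatenated source and driving keypoints to a two-layer MLP and adds the predicted offset to the driving keypoints. This is not the released implementation: the keypoint count (`NUM_KP = 21`), the hidden width, the untrained random weights, and the helper names `make_mlp`, `mlp_forward`, and `stitch` are all assumptions made for illustration. It uses only the Python standard library so that it runs anywhere; a practical version would presumably use PyTorch.

```python
import random


def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Build a two-layer MLP with small random weights (untrained placeholder)."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        return weights, biases

    return [layer(in_dim, hidden_dim), layer(hidden_dim, out_dim)]


def mlp_forward(mlp, x):
    """Apply each linear layer, with ReLU between layers but not after the last."""
    for i, (weights, biases) in enumerate(mlp):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(mlp) - 1:
            x = [max(0.0, v) for v in x]
    return x


def stitch(source_kp, driving_kp, mlp):
    """Return the driving keypoints shifted by an MLP-predicted offset."""
    # Flatten both keypoint sets into one input vector: source first, then driving.
    flat = [c for kp in source_kp + driving_kp for c in kp]
    offset = mlp_forward(mlp, flat)
    return [
        tuple(c + offset[3 * i + j] for j, c in enumerate(kp))
        for i, kp in enumerate(driving_kp)
    ]


NUM_KP = 21  # assumed keypoint count, for illustration only
mlp = make_mlp(in_dim=2 * NUM_KP * 3, hidden_dim=32, out_dim=NUM_KP * 3)
source_kp = [(0.0, 0.0, 0.0)] * NUM_KP
driving_kp = [(0.1, 0.2, 0.3)] * NUM_KP
stitched = stitch(source_kp, driving_kp, mlp)
print(len(stitched), len(stitched[0]))
```

With these assumed sizes the MLP has only a few thousand weights, which is consistent with the claim that such modules add little overhead next to the main generator.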
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Cross-Reenactment | VOODOO-XP (test) | MEt3R | 0.033 | 10 |
| Self-Reenactment | VOODOO-XP (test) | MEt3R | 0.032 | 10 |
| Self-Reenactment | HDTF (test) | LPIPS | 0.1817 | 8 |
| Cross-Reenactment | TalkingHead-1KH and LV100 (test) | ID-SIM | 0.723 | 7 |
| Self-Reenactment | TalkingHead-1KH and LV100 (test) | L1 Loss | 0.043 | 7 |
| Portrait Animation | Self-Reenactment (test) | PSNR | 27.76 | 6 |
| Self-Reenactment | RAVDESS | PSNR | 23.2897 | 6 |
| Portrait Animation | Cross-Reenactment (test) | CSIM | 0.459 | 6 |
| Cross-Reenactment | NeRSemble | AED | 0.2864 | 6 |
| Portrait Animation | Disentangled-Reenactment (test) | CSIM | 45.8 | 4 |