HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation

About

We introduce HunyuanPortrait, a diffusion-based condition control method that employs implicit representations for highly controllable and lifelike portrait animation. Given a single portrait image as an appearance reference and video clips as driving templates, HunyuanPortrait animates the character in the reference image using the facial expressions and head poses of the driving videos. In our framework, we use pre-trained encoders to decouple portrait motion information from identity in videos. To this end, an implicit representation encodes the motion information and serves as the control signal in the animation phase. With Stable Video Diffusion as the main building block, we carefully design adapter layers that inject the control signals into the denoising UNet through attention mechanisms. These designs yield spatially rich detail and temporal consistency. HunyuanPortrait also exhibits strong generalization performance, effectively disentangling appearance and motion across different image styles. Our framework outperforms existing methods, demonstrating superior temporal consistency and controllability. Our project is available at https://kkakkkka.github.io/HunyuanPortrait.
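To make the idea of injecting control signals through attention more concrete, the following is a minimal sketch of a generic cross-attention adapter with a residual connection, written in plain Python. It is not the authors' implementation: the function names (`cross_attention_adapter`, `random_matrix`), the projection matrices, the token counts, and the `scale` parameter are all hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights or features for the demo."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def cross_attention_adapter(hidden, motion, w_q, w_k, w_v, scale=1.0):
    """Inject motion tokens into backbone features via cross-attention.

    hidden: (n_tokens, d) features from a denoising backbone (assumed shape).
    motion: (n_motion, d_m) implicit motion tokens (assumed shape).
    w_q: (d, d), w_k and w_v: (d_m, d) projection matrices (assumptions).
    scale: weight of the injected signal; 0.0 leaves hidden unchanged.
    """
    q = matmul(hidden, w_q)            # queries come from the backbone features
    k = matmul(motion, w_k)            # keys come from the control signal
    v = matmul(motion, w_v)            # values come from the control signal
    d = len(q[0])
    scores = matmul(q, transpose(k))   # (n_tokens, n_motion) similarity scores
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, v)      # (n_tokens, d) motion-conditioned update
    # Residual connection: the control signal is added to the original features.
    return [[h + scale * a for h, a in zip(h_row, a_row)]
            for h_row, a_row in zip(hidden, attended)]


rng = random.Random(0)
hidden = random_matrix(4, 8, rng)      # 4 spatial tokens, width 8
motion = random_matrix(3, 6, rng)      # 3 motion tokens, width 6
w_q = random_matrix(8, 8, rng)
w_k = random_matrix(6, 8, rng)
w_v = random_matrix(6, 8, rng)
conditioned = cross_attention_adapter(hidden, motion, w_q, w_k, w_v)
```

The residual form means the adapter only adds a motion-dependent update to the backbone features, so setting `scale` to zero recovers the unconditioned features exactly.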

Zunnan Xu, Zhentao Yu, Zixiang Zhou, Jun Zhou, Xiaoyu Jin, Fa-Ting Hong, Xiaozhong Ji, Junwei Zhu, Chengfei Cai, Shiyu Tang, Qin Lin, Xiu Li, Qinglin Lu • 2025

Related benchmarks

Task	Dataset	Result
Portrait Image Animation	HDTF (test)	FID 18.68	23
Portrait Animation (Self-reenactment)	VFHQ (test)	FVD 266.7	23
Self-reenactment portrait animation	MEAD 59 (test)	CSIM 0.922	18
Portrait Animation (Cross-reenactment)	FFHQ source + VFHQ driving (test)	CSIM 0.5939	18
Portrait Animation	EMH benchmark	BRISQUE 49.8	11
Self-Reenactment	VOODOO-XP (test)	MEt3R 0.028	10
Cross-Reenactment	VOODOO-XP (test)	MEt3R 0.032	10
Facial Expression Editing	MetaHuman-based Enhancement Mode (test)	PSNR 22.7968	10
Facial Expression Editing	MetaHuman-based benchmark Replacement Mode (test)	PSNR 22.4287	10
Facial Reenactment	VFHQ Cross-Reenactment	ID-SIM 0.866	8

Showing 10 of 29 rows
