UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer

About

This report presents UniAnimate-DiT, a project that builds on the open-source Wan2.1 model for consistent human image animation. Specifically, to preserve the robust generative capabilities of the original Wan2.1 model, we apply the Low-Rank Adaptation (LoRA) technique to fine-tune a minimal set of parameters, significantly reducing training memory overhead. A lightweight pose encoder consisting of multiple stacked 3D convolutional layers is designed to encode the motion information of the driving poses. Furthermore, we adopt a simple concatenation operation to integrate the reference appearance into the model and incorporate the pose information of the reference image for enhanced pose alignment. Experimental results show that our approach achieves visually appealing and temporally consistent high-fidelity animations. Trained on 480p (832x480) videos, UniAnimate-DiT generalizes well and can seamlessly upscale to 720p (1280x720) during inference. The training and inference code is publicly available at https://github.com/ali-vilab/UniAnimate-DiT.
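To make the memory argument behind LoRA concrete, the following is a minimal pure-Python sketch of a low-rank adapter on a single frozen linear layer. It is not the UniAnimate-DiT or Wan2.1 implementation: the class `LoRALinear`, the helper `lora_param_counts`, the layer sizes, the rank, and the `alpha` scaling are all hypothetical choices for illustration, since the abstract does not state them.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def lora_param_counts(d_in, d_out, rank):
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    full = d_in * d_out            # every entry of W is trainable
    lora = rank * (d_in + d_out)   # only A (rank x d_in) and B (d_out x rank)
    return full, lora


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update scale * B @ A.

    Hypothetical illustration; sizes and hyperparameters are assumptions.
    """

    def __init__(self, W, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen base weight, never updated
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially reproduces the base layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]


full, lora = lora_param_counts(d_in=1024, d_out=1024, rank=16)
print(full, lora)  # 1048576 32768

layer = LoRALinear(W=[[1.0, 2.0], [3.0, 4.0]], rank=1, alpha=1.0)
print(layer.forward([1.0, 1.0]))  # [3.0, 7.0]
```

With these assumed sizes, the adapter trains 32,768 values instead of 1,048,576 for the same layer, which is the kind of reduction the abstract refers to. Because `B` is zero at initialization, the adapted layer starts out identical to the frozen one.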

Xiang Wang, Shiwei Zhang, Longxiang Tang, Yingya Zhang, Changxin Gao, Yuehuan Wang, Nong Sang • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Character Image Animation | Follow-Your-Pose V2 | LPIPS | 0.159 | 15 |
| Video Generation | Tiktok (test) | SSIM | 0.9 | 11 |
| Character Image Animation | CoDanceBench (test) | LPIPS | 0.579 | 9 |
| Character Animation | User Study 20 identities and 20 driving videos (test) | Video Quality | 0.79 | 9 |
| Character Animation | DualDynamics | FVD | 172.3 | 8 |
| Video Generation | TikTok Cross-ID | MQ | 3.9 | 7 |
| Video Generation | TikTok dataset Self Reenactment (test) | PSNR | 19.76 | 7 |
| Human Image Animation | Tiktok (test) | Subject Consistency | 95.47 | 5 |
| Human Image Animation | User Study 50 participants 5-point Mean Opinion Score | VC Score | 3.3 | 5 |
| Hand Pose Estimation | Human Image Animation | PA-MPJPE | 21.48 | 5 |
Showing 10 of 12 rows
