PoseTraj: Pose-Aware Trajectory Control in Video Diffusion

About

Trajectory-guided video generation has made notable progress in recent years. However, existing models still struggle to generate object motions whose 6D poses change under wide-range rotations, owing to their limited 3D understanding. To address this problem, we introduce PoseTraj, a pose-aware video dragging model for generating 3D-aligned motion from 2D trajectories. Our method adopts a novel two-stage pose-aware pretraining framework that improves 3D understanding across diverse trajectories. Specifically, we propose a large-scale synthetic dataset, PoseTraj-10K, containing 10k videos of objects following rotational trajectories, and enhance the model's perception of object pose changes by incorporating 3D bounding boxes as intermediate supervision signals. Following this, we fine-tune the trajectory-controlling module on real-world videos, applying an additional camera-disentanglement module to further refine motion accuracy. Experiments on various benchmark datasets demonstrate that our method not only excels in 3D pose-aligned dragging for rotational trajectories but also outperforms existing baselines in trajectory accuracy and video quality.

Longbin Ji, Lei Zhong, Pengfei Wei, Changjian Li • 2025

Related benchmarks

Task	Dataset	Metric	Value	Rank
Trajectory-based image animation	WebVid (test)	LPIPS	0.1704	8
Trajectory-guided video generation	VIPSeg (val)	ObjMC	87.56	6
Trajectory-guided video generation	DAVIS (test)	ObjMC	29.92	3
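
This page does not define the metrics. In the trajectory-control literature, ObjMC is usually described as the mean Euclidean distance, in pixels, between the trajectory recovered from the generated video and the target trajectory, so lower is better (as it is for LPIPS). The following is a minimal sketch under that assumption, not the benchmark's evaluation code; the function name `objmc` and the single-track input format are hypothetical, and in practice the predicted track would come from a point tracker run on the generated video.

```python
import math


def objmc(pred_track, target_track):
    """Mean Euclidean distance between two 2D trajectories.

    Both arguments are sequences of (x, y) pixel coordinates, one per frame.
    Assumed definition for illustration; the benchmarks may differ in detail.
    """
    if not pred_track or len(pred_track) != len(target_track):
        raise ValueError("tracks must be non-empty and of equal length")
    # Per-frame distance between the predicted and target point.
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred_track, target_track)
    ]
    return sum(dists) / len(dists)


if __name__ == "__main__":
    target = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
    pred = [(0.0, 0.0), (10.0, 3.0), (20.0, 6.0)]
    print(objmc(pred, target))
```

A full evaluation would average this quantity over all tracked points and all videos in the test set.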

