Image Conductor: Precision Control for Interactive Video Synthesis

About

Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements, typically involving labor-intensive real-world capture. Despite advancements in generative AI for video creation, achieving precise control over motion for interactive video asset generation remains challenging. To this end, we propose Image Conductor, a method for precise control of camera transitions and object movements to generate video assets from a single image. A carefully designed training strategy separates camera motion from object motion by learning distinct camera LoRA weights and object LoRA weights. To further address cinematographic variations arising from ill-posed trajectories, we introduce a camera-free guidance technique during inference, which enhances object movements while eliminating camera transitions. Additionally, we develop a trajectory-oriented video motion data curation pipeline for training. Quantitative and qualitative experiments demonstrate our method's precision and fine-grained control in generating motion-controllable videos from images, advancing the practical application of interactive video synthesis. The project webpage is available at https://liyaowei-stu.github.io/project/ImageConductor/

Yaowei Li, Xintao Wang, Zhaoyang Zhang, Zhouxia Wang, Ziyang Yuan, Liangbin Xie, Yuexian Zou, Ying Shan • 2024

Related benchmarks

Task                                Dataset                                                  Result       Rank
Video Generation                    DAVIS (val)                                              PSNR 12.513  18
Track-Conditioned Video Generation  DAVIS (val)                                              PSNR 12.184  12
Video Generation                    MoveBench                                                FID 34.5     5
Multi-object motion control         MoveBench multi-object 1.0 (test)                        FID 77.5     3
Image-to-Video Generation           HumanVid curated 500 real-world videos (evaluation set)  LPIPS 0.43   2