
Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation

About

Motion-controllable image animation is a fundamental task with a wide range of potential applications. Recent works have made progress in controlling camera or object motion via various motion representations, but they still struggle to support collaborative camera and object motion control with adaptive control granularity. To this end, we introduce a 3D-aware motion representation and propose an image animation framework, called Perception-as-Control, to achieve fine-grained collaborative motion control. Specifically, we construct the 3D-aware motion representation from a reference image, manipulate it based on interpreted user instructions, and perceive it from different viewpoints. In this way, camera and object motions are transformed into intuitive and consistent visual changes. Our framework then uses the perception results as motion control signals, which lets it support various motion-related video synthesis tasks in a unified and flexible way. Experiments demonstrate the superiority of the proposed approach. For more details and qualitative results, please refer to our anonymous project webpage: https://chen-yingjie.github.io/projects/Perception-as-Control.
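The construct, manipulate, perceive loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the scene is a handful of labeled 3D points, object motion is a pure translation, and the camera is an unrotated pinhole looking along +z. The names Point3D, move_object, perceive, and render_control_signals are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Point3D:
    x: float
    y: float
    z: float
    is_object: bool  # True for points on the moving object, False for static scene


def move_object(points, offset):
    """Object motion (assumed: pure translation applied to object points only)."""
    dx, dy, dz = offset
    return [
        Point3D(p.x + dx, p.y + dy, p.z + dz, p.is_object) if p.is_object else p
        for p in points
    ]


def perceive(points, camera_position, focal_length=1.0):
    """Project points with a pinhole camera at camera_position looking along +z.

    Camera rotation is omitted to keep the sketch short.
    """
    cx, cy, cz = camera_position
    projected = []
    for p in points:
        depth = p.z - cz
        if depth <= 0:
            continue  # point is behind the camera, so it is not perceived
        projected.append(
            (focal_length * (p.x - cx) / depth, focal_length * (p.y - cy) / depth)
        )
    return projected


def render_control_signals(points, object_offsets, camera_positions):
    """One 2D perception result per frame; these stand in for control signals."""
    return [
        perceive(move_object(points, offset), cam)
        for offset, cam in zip(object_offsets, camera_positions)
    ]


scene = [Point3D(0.0, 0.0, 4.0, False), Point3D(1.0, 0.0, 2.0, True)]
frames = render_control_signals(
    scene,
    object_offsets=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    camera_positions=[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
)
```

In this sketch, the second frame moves only the object and the third moves only the camera, yet both appear as changes in the same 2D projection. That is the sense in which camera and object motion share one visual control signal here.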

Yingjie Chen, Yifang Men, Yuan Yao, Miaomiao Cui, Liefeng Bo • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Object Manipulation | 3DEditBench | LPIPS | 0.195 | 12 |
| Point-Prompted Segmentation | SpelkeBench | AR | 7.1 | 11 |
| Video Reveal Completion | PREBench (test) | R-Ghost | 0.1746 | 6 |
| Video Content Preservation | PREBench (test) | P-LPIPS | 0.3915 | 6 |
| Video Scene Expansion | PREBench (test) | Temporal Consistency (E-Temp) | 0.0985 | 6 |
| Global Video Quality | VBench (test) | Overall Score | 68.96 | 6 |
| Joint camera and object motion control | VerseControl4D 1.0 (test) | Overall Score | 83.66 | 4 |