TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
About
We present TrajectoryCrafter, a novel approach to redirecting camera trajectories for monocular videos. By disentangling deterministic view transformations from stochastic content generation, our method achieves precise control over user-specified camera trajectories. We propose a dual-stream conditional video diffusion model that concurrently integrates point cloud renders and source videos as conditions, ensuring accurate view transformations and coherent 4D content generation. Instead of relying on scarce multi-view videos, we curate a hybrid training dataset that combines web-scale monocular videos with static multi-view datasets via our double-reprojection strategy, significantly fostering robust generalization across diverse scenes. Extensive evaluations on multi-view and large-scale monocular videos demonstrate the superior performance of our method.
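To make the double-reprojection idea concrete, here is a minimal NumPy sketch of the underlying geometry: a source frame is lifted to 3D with its depth, splatted into a shifted novel view, and then splatted back into the source view, producing a degraded "point cloud render" that pairs with the clean source frame as training data. All names, the toy camera, and the constant-depth scene are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def unproject(depth, K):
    """Lift each pixel to a 3D point (camera frame) using its depth and intrinsics K."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N
    rays = np.linalg.inv(K) @ pix                                      # 3 x N
    return rays * depth.reshape(1, -1)                                 # 3 x N

def splat(points, colors, K, T, h, w):
    """Forward-project 3D points into the camera given by pose T, nearest splatting."""
    pts = T[:3, :3] @ points + T[:3, 3:4]
    z = pts[2]
    uv = (K @ pts)[:2] / np.maximum(z, 1e-6)
    img = np.zeros((h, w, 3))
    mask = np.zeros((h, w), dtype=bool)
    u = np.round(uv[0]).astype(int)
    v = np.round(uv[1]).astype(int)
    keep = (z > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    img[v[keep], u[keep]] = colors[keep]
    mask[v[keep], u[keep]] = True  # unfilled pixels stay as holes
    return img, mask

# Toy frame: constant-depth plane with random colors (hypothetical data).
h, w = 32, 32
K = np.array([[30.0, 0, w / 2], [0, 30.0, h / 2], [0, 0, 1.0]])
frame = np.random.rand(h, w, 3)
depth = np.full((h, w), 2.0)

# First reprojection: warp the source frame into a laterally shifted novel view.
T_fwd = np.eye(4); T_fwd[0, 3] = 0.3  # translate camera along x
pts = unproject(depth, K)
novel, novel_mask = splat(pts, frame.reshape(-1, 3), K, T_fwd, h, w)

# Second reprojection: warp the novel-view render back to the source view,
# leaving occlusion/resampling holes -- the degraded point-cloud-render condition.
novel_depth = np.full((h, w), 2.0)  # depth is unchanged under pure x-translation
pts_back = unproject(novel_depth, K)
cond, cond_mask = splat(pts_back, novel.reshape(-1, 3), K, np.linalg.inv(T_fwd), h, w)

# (cond, frame) form a training pair: degraded render vs. clean ground truth.
```

The key point is that both reprojections are purely deterministic geometry; the diffusion model only has to learn the stochastic part, filling the holes in `cond` so that its output matches `frame`.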
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Generation | VBench | -- | -- | 102 |
| Single-object 4D Motion Generation | User Study Single-object 4D Motion Generation 1.0 (test) | Prompt Alignment | 5 | 36 |
| View Synchronization | Basic Benchmark (test) | FVD | 665.9 | 20 |
| Video Generation | RealEstate10K and DL3DV partial-revisit (evaluation) | Total Quality Score | 76.34 | 11 |
| I2V Camera Control | DL3DV (test) | RRE | 1.08 | 10 |
| Video Generation | RealEstate10K (Re10K) (test) | PSNR | 16.94 | 8 |
| Camera Control | UltraVideo (test) | DINO | 0.0376 | 7 |
| Narrow Dynamic View Synthesis | DyCheck iPhone 1.0 (test) | PSNR | 14.24 | 7 |
| Narrow Dynamic View Synthesis | Kubric-4D gradual 1.0 (test) | PSNR | 20.93 | 7 |
| 4D Scene Motion Generation | Six diverse dynamic scenes animation set 1.0 (test) | Alignment | 9.6 | 6 |