TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
About
We present TrajectoryCrafter, a novel approach to redirecting camera trajectories for monocular videos. By disentangling deterministic view transformations from stochastic content generation, our method achieves precise control over user-specified camera trajectories. We propose a dual-stream conditional video diffusion model that concurrently integrates point cloud renders and source videos as conditions, ensuring accurate view transformations and coherent 4D content generation. Instead of relying on scarce multi-view videos, we curate a hybrid training dataset that combines web-scale monocular videos with static multi-view datasets via our double-reprojection strategy, significantly fostering robust generalization across diverse scenes. Extensive evaluations on multi-view and large-scale monocular videos demonstrate the superior performance of our method.
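To make the double-reprojection idea concrete, here is a minimal NumPy sketch of the underlying geometry: a source frame is lifted to 3D with its depth, splatted into a shifted novel view, and then splatted back into the source view, producing a degraded "point cloud render" that pairs with the clean source frame as training data. All names, the toy camera, and the constant-depth scene are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def unproject(depth, K):
    """Lift each pixel to a 3D point (camera frame) using its depth and intrinsics K."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N
    rays = np.linalg.inv(K) @ pix                                      # 3 x N
    return rays * depth.reshape(1, -1)                                 # 3 x N

def splat(points, colors, K, T, h, w):
    """Forward-project 3D points into the camera given by pose T, nearest splatting."""
    pts = T[:3, :3] @ points + T[:3, 3:4]
    z = pts[2]
    uv = (K @ pts)[:2] / np.maximum(z, 1e-6)
    img = np.zeros((h, w, 3))
    mask = np.zeros((h, w), dtype=bool)
    u = np.round(uv[0]).astype(int)
    v = np.round(uv[1]).astype(int)
    keep = (z > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    img[v[keep], u[keep]] = colors[keep]
    mask[v[keep], u[keep]] = True  # unfilled pixels stay as holes
    return img, mask

# Toy frame: constant-depth plane with random colors (hypothetical data).
h, w = 32, 32
K = np.array([[30.0, 0, w / 2], [0, 30.0, h / 2], [0, 0, 1.0]])
frame = np.random.rand(h, w, 3)
depth = np.full((h, w), 2.0)

# First reprojection: warp the source frame into a laterally shifted novel view.
T_fwd = np.eye(4); T_fwd[0, 3] = 0.3  # translate camera along x
pts = unproject(depth, K)
novel, novel_mask = splat(pts, frame.reshape(-1, 3), K, T_fwd, h, w)

# Second reprojection: warp the novel-view render back to the source view,
# leaving occlusion/resampling holes -- the degraded point-cloud-render condition.
novel_depth = np.full((h, w), 2.0)  # depth is unchanged under pure x-translation
pts_back = unproject(novel_depth, K)
cond, cond_mask = splat(pts_back, novel.reshape(-1, 3), K, np.linalg.inv(T_fwd), h, w)

# (cond, frame) form a training pair: degraded render vs. clean ground truth.
```

The key point is that both reprojections are purely deterministic geometry; the diffusion model only has to learn the stochastic part, filling the holes in `cond` so that its output matches `frame`.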
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Generation | VBench | -- | -- | 102 |
| Single-object 4D Motion Generation | User Study Single-object 4D Motion Generation 1.0 (test) | Prompt Alignment | 5 | 36 |
| View Synchronization | Basic Benchmark (test) | FVD | 665.9 | 20 |
| Video Generation | RealEstate10K and DL3DV partial-revisit (evaluation) | Total Quality Score | 76.34 | 11 |
| I2V Camera Control | DL3DV (test) | RRE | 1.08 | 10 |
| Video Generation | RealEstate10K (Re10K) (test) | PSNR | 16.94 | 8 |
| Camera Control | UltraVideo (test) | DINO | 0.0376 | 7 |
| Narrow Dynamic View Synthesis | DyCheck iPhone 1.0 (test) | PSNR | 14.24 | 7 |
| Narrow Dynamic View Synthesis | Kubric-4D gradual 1.0 (test) | PSNR | 20.93 | 7 |
| 4D Scene Motion Generation | Six diverse dynamic scenes animation set 1.0 (test) | Alignment | 9.6 | 6 |