
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes

About

The increasing demand for immersive AR/VR applications and spatial intelligence has heightened the need to generate high-quality scene-level and 360° panoramic video. However, most video diffusion models are constrained by limited resolution and aspect ratio, which restricts their applicability to scene-level dynamic content synthesis. In this work, we propose DynamicScaler, which addresses these challenges by enabling spatially scalable and panoramic dynamic scene synthesis that preserves coherence across panoramic scenes of arbitrary size. Specifically, we introduce an Offset Shifting Denoiser that enables efficient, synchronous, and coherent denoising of panoramic dynamic scenes with a fixed-resolution diffusion model through a rotating window, which ensures seamless boundary transitions and consistency across the entire panoramic space while accommodating varying resolutions and aspect ratios. Additionally, we employ a Global Motion Guidance mechanism to ensure both local detail fidelity and global motion continuity. Extensive experiments demonstrate that our method achieves superior content and motion quality in panoramic scene-level video generation, offering a training-free, efficient, and scalable solution for immersive dynamic scene creation with constant VRAM consumption regardless of the output video resolution. The project page is available at https://dynamic-scaler.pages.dev/new.
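The rotating-window idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`window_indices`, `shifted_window_denoise`), the `shift` parameter, the one-dimensional "ring of columns" standing in for a panoramic latent, and the requirement that the panorama width be a multiple of the window width are all assumptions made for illustration. The point it shows is that a denoiser which only ever sees `window` columns at a time can cover a wrap-around panorama of any width, and that moving the window offset at each step moves the seams between windows.

```python
from typing import Callable, List


def window_indices(width: int, window: int, offset: int) -> List[List[int]]:
    """Split a ring of `width` columns into windows of `window` columns,
    starting at `offset` and wrapping around the 360-degree seam."""
    if width % window != 0:
        raise ValueError("this sketch assumes width is a multiple of window")
    return [
        [(offset + start + k) % width for k in range(window)]
        for start in range(0, width, window)
    ]


def shifted_window_denoise(
    latent: List[float],
    denoise_window: Callable[[List[float], int], List[float]],
    window: int,
    steps: int,
    shift: int,
) -> List[float]:
    """Run `steps` denoising passes over a ring-shaped latent.

    Each pass tiles the ring with fixed-size windows and calls the
    (hypothetical) fixed-resolution denoiser on one window at a time.
    The tiling offset advances by `shift` columns per step, so window
    boundaries fall in different places on successive steps.
    """
    width = len(latent)
    current = list(latent)
    for step in range(steps):
        offset = (step * shift) % width
        for cols in window_indices(width, window, offset):
            patch = [current[c] for c in cols]   # fixed-size input
            out = denoise_window(patch, step)    # fixed-size output
            if len(out) != window:
                raise ValueError("denoiser must preserve the window size")
            for c, value in zip(cols, out):
                current[c] = value
    return current


# Toy stand-in for a diffusion denoiser: halve every value in the window.
def toy_denoiser(patch: List[float], step: int) -> List[float]:
    return [0.5 * v for v in patch]


result = shifted_window_denoise([8.0] * 8, toy_denoiser, window=4, steps=3, shift=2)
```

Because the windows in each pass do not overlap, every column is updated exactly once per step, and the working set of each denoiser call depends only on `window`, not on the panorama width. A real system would operate on multi-dimensional video latents and would need some mechanism for global consistency, which this sketch does not attempt.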

Jinxiu Liu, Shaoheng Lin, Yinxiao Li, Ming-Hsuan Yang • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image-to-Video | VBench I2V | Average VBench Score: 0.865 | 6 |
| Image-to-Video | FrescoArchive | Average VBench Score: 0.871 | 6 |
