
AnyView: Synthesizing Any Novel View in Dynamic Scenes

About

Modern generative video models excel at producing convincing, high-quality outputs, but struggle to maintain multi-view and spatiotemporal consistency in highly dynamic real-world environments. In this work, we introduce AnyView, a diffusion-based video generation framework for dynamic view synthesis with minimal inductive biases or geometric assumptions. We leverage multiple data sources with various levels of supervision, including monocular (2D), multi-view static (3D) and multi-view dynamic (4D) datasets, to train a generalist spatiotemporal implicit representation capable of producing zero-shot novel videos from arbitrary camera locations and trajectories. We evaluate AnyView on standard benchmarks, showing competitive results with the current state of the art, and propose AnyViewBench, a challenging new benchmark tailored towards extreme dynamic view synthesis in diverse real-world scenarios. In this more dramatic setting, we find that most baselines drastically degrade in performance, as they require significant overlap between viewpoints, while AnyView maintains the ability to produce realistic, plausible, and spatiotemporally consistent videos when prompted from any viewpoint. Results, data, code, and models can be viewed at: https://tri-ml.github.io/AnyView/

Basile Van Hoorick, Dian Chen, Shun Iwase, Pavel Tokmakov, Muhammad Zubair Irshad, Igor Vasiljevic, Swati Gupta, Fangzhou Cheng, Sergey Zakharov, Vitor Campagnolo Guizilini • 2026

Related benchmarks

Task                           Dataset                        Result        Rank
Narrow Dynamic View Synthesis  Kubric-4D gradual 1.0 (test)   PSNR 21.21    7
Narrow Dynamic View Synthesis  DyCheck iPhone 1.0 (test)      PSNR 13.47    7
Narrow Dynamic View Synthesis  ParDom-4D gradual 1.0 (test)   PSNR 26.29    6
