
Monocular Dynamic View Synthesis: A Reality Check

About

We study the recent progress on dynamic view synthesis (DVS) from monocular video. Though existing approaches have demonstrated impressive results, we show a discrepancy between the practical capture process and the existing experimental protocols, which effectively leaks in multi-view signals during training. We define effective multi-view factors (EMFs) to quantify the amount of multi-view signal present in the input capture sequence based on the relative camera-scene motion. We introduce two new metrics: co-visibility masked image metrics and correspondence accuracy, which overcome the issue in existing protocols. We also propose a new iPhone dataset that includes more diverse real-life deformation sequences. Using our proposed experimental protocol, we show that the state-of-the-art approaches observe a 1-2 dB drop in masked PSNR in the absence of multi-view cues and 4-5 dB drop when modeling complex motion. Code and data can be found at https://hangg7.com/dycheck.
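To make the masked PSNR figure concrete, here is a minimal sketch of a PSNR computed only over a masked subset of pixels. It is an illustration, not the authors' implementation: the function name `masked_psnr`, the flat-list image representation, and the assumption that a binary co-visibility mask has already been computed are all hypothetical. How the paper derives the mask is not covered here.

```python
import math


def masked_psnr(pred, target, mask, max_val=1.0):
    """PSNR in dB over the pixels where `mask` is truthy.

    Hypothetical helper: `pred`, `target`, and `mask` are flat sequences
    of equal length, with pixel values in [0, max_val].
    """
    if not (len(pred) == len(target) == len(mask)):
        raise ValueError("pred, target, and mask must have the same length")

    sq_err = 0.0
    count = 0
    for p, t, m in zip(pred, target, mask):
        if m:  # only pixels selected by the mask contribute
            sq_err += (p - t) ** 2
            count += 1

    if count == 0:
        raise ValueError("mask selects no pixels")

    mse = sq_err / count
    if mse == 0.0:
        return math.inf  # perfect match on the masked region
    return 10.0 * math.log10(max_val ** 2 / mse)


# Toy example: the last two pixels are masked out, so their large
# errors do not affect the score.
pred = [0.5, 0.5, 0.0, 1.0]
target = [0.4, 0.6, 1.0, 0.0]
mask = [1, 1, 0, 0]

score_masked = masked_psnr(pred, target, mask)          # about 20 dB
score_full = masked_psnr(pred, target, [1, 1, 1, 1])    # much lower
```

Restricting the error to masked pixels means regions that the method could never have observed during training are not counted against it, which is the motivation the abstract gives for masking the image metrics.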

Hang Gao, Ruilong Li, Shubham Tulsiani, Bryan Russell, Angjoo Kanazawa • 2022

Related benchmarks

Task | Dataset | Result | Rank
---- | ------- | ------ | ----
Novel View Synthesis | iPhone DyCheck 7 scenes 2x resolution | mPSNR 16.96 | 31
4D Reconstruction | DyCheck (test) | mPSNR 16.96 | 21
Novel View Synthesis | DyCheck (test) | mPSNR 16.96 | 15
Novel View Synthesis | Nvidia Dataset | PSNR 23.241 | 15
Novel View Synthesis | iPhone dataset (test) | Mean CLIP-I 86.04 | 13
Dynamic View Synthesis | DyCheck 5 scenes, 1x resolution 1.0 (test) | mLPIPS 0.55 | 11
Novel View Synthesis | DyCheck 1.0 (novel view) | PSNR 15.6 | 9
Novel View Synthesis | iPhone dataset Block | CLIP Image Similarity 0.8873 | 7
Novel View Synthesis | iPhone (Apple) | CLIP-I 0.8275 | 7
Novel View Synthesis | iPhone dataset Mean | CLIP-I 86.04 | 7
Showing 10 of 13 rows
