Geo-EVS: Geometry-Conditioned Extrapolative View Synthesis for Autonomous Driving

About

Extrapolative novel view synthesis can reduce camera-rig dependency in autonomous driving by generating standardized virtual views from heterogeneous sensors. Existing methods degrade outside recorded trajectories because extrapolated poses provide weak geometric support and no dense target-view supervision. The key is to explicitly expose the model to out-of-trajectory condition defects during training. We propose Geo-EVS, a geometry-conditioned framework under sparse supervision. Geo-EVS has two components. Geometry-Aware Reprojection (GAR) uses fine-tuned VGGT to reconstruct colored point clouds and reproject them to observed and virtual target poses, producing geometric condition maps. This design unifies the reprojection path between training and inference. Artifact-Guided Latent Diffusion (AGLD) injects reprojection-derived artifact masks during training so the model learns to recover structure under missing support. For evaluation, we use a LiDAR-Projected Sparse-Reference (LPSR) protocol when dense extrapolated-view ground truth is unavailable. On Waymo, Geo-EVS improves sparse-view synthesis quality and geometric accuracy, especially in high-angle and low-coverage settings. It also improves downstream 3D detection.
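The reprojection step and the sparse-reference evaluation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a pinhole camera, a world-to-camera 4x4 pose, and a nearest-point z-buffer, and it treats pixels that receive no point as the "missing support" mask. The function names `reproject_points` and `sparse_psnr` are hypothetical, and the masked PSNR is only an assumed reading of what a sparse-reference metric could look like, since the abstract does not specify the LPSR protocol in detail.

```python
import numpy as np


def reproject_points(points_world, colors, K, T_cam_from_world, height, width):
    """Project a colored point cloud into a pinhole camera (hypothetical helper).

    Returns (image, hole_mask): image is HxWx3, hole_mask is HxW and True
    where no point landed, i.e. where geometric support is missing.
    """
    # Move points into the camera frame using homogeneous coordinates.
    ones = np.ones((len(points_world), 1))
    pts_cam = (T_cam_from_world @ np.hstack([points_world, ones]).T).T[:, :3]

    # Discard points behind (or on) the camera plane.
    front = pts_cam[:, 2] > 1e-6
    pts_cam, cols = pts_cam[front], colors[front]

    # Pinhole projection to integer pixel coordinates.
    uvw = (K @ pts_cam.T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    flat = v[inside] * width + u[inside]
    z, cols = pts_cam[inside, 2], cols[inside]

    # Z-buffer: sort by pixel, then depth, and keep the nearest point per pixel.
    order = np.lexsort((z, flat))
    pixels, first = np.unique(flat[order], return_index=True)
    nearest = order[first]

    image = np.zeros((height * width, 3), dtype=np.float64)
    hole_mask = np.ones(height * width, dtype=bool)
    image[pixels] = cols[nearest]
    hole_mask[pixels] = False
    return image.reshape(height, width, 3), hole_mask.reshape(height, width)


def sparse_psnr(pred, ref, valid_mask, max_val=1.0):
    """PSNR restricted to pixels where a sparse reference exists (assumed metric)."""
    diff = pred[valid_mask] - ref[valid_mask]
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(max_val ** 2 / mse))


# Toy example: two points on the same ray, one point behind the camera.
K = np.array([[2.0, 0.0, 2.0], [0.0, 2.0, 2.0], [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
colors = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
image, hole_mask = reproject_points(points, colors, K, np.eye(4), 4, 4)
```

In this toy setup the same function can be called with an observed pose or a virtual target pose, which mirrors the idea of a single reprojection path for training and inference. The hole mask is the kind of signal that could be injected as an artifact mask, under the assumptions stated above.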

Yatong Lan, Rongkui Tang, Lei He • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| New View Synthesis | Waymo (val) | PSNR (dB) | 23.42 | 14 |
| Observed-View Reconstruction | Waymo (Observed-view) | FID | 3.9 | 7 |
| Extrapolative View Synthesis | Waymo extrapolation | Sparse PSNR | 23.65 | 5 |
| 3D Object Detection | Waymo mini 1 5 (val) | LET-mAPL | 35.9 | 2 |