Wid3R: Wide Field-of-View 3D Reconstruction via Camera Model Conditioning
About
We present Wid3R, a feed-forward neural network for multi-view visual geometry reconstruction that supports wide field-of-view camera models. Unlike existing methods that assume rectified or pinhole inputs, Wid3R directly models wide-angle imagery without explicit calibration or undistortion. Our approach leverages a ray-based representation with spherical harmonics and introduces a novel camera model token to enable distortion-aware reconstruction. To the best of our knowledge, Wid3R is the first multi-frame feed-forward 3D reconstruction method that supports 360° imagery. Moreover, we show that conditioning on diverse camera types improves generalization to 360° scenes and alleviates data-sparsity issues. Wid3R achieves significant performance gains, improving AUC@30 by up to +33.67 on Zip-NeRF (fisheye) and +77.33 on Stanford2D3D (360°).
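To illustrate the ray-based representation, the sketch below maps each pixel of an equirectangular (360°) image to a unit ray direction and encodes it with a low-degree real spherical-harmonics basis. This is a hypothetical, minimal illustration with NumPy; the paper's exact parameterization, SH degree, and camera-model token design are not specified here, and the function names (`equirect_rays`, `sh_encode`) are our own.

```python
import numpy as np

def equirect_rays(h, w):
    """Per-pixel unit ray directions for an equirectangular (360) image.

    Longitude spans [-pi, pi), latitude [pi/2, -pi/2] top to bottom.
    Returns an (h, w, 3) array of unit vectors.
    """
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)

def sh_encode(dirs):
    """Real spherical-harmonics features up to degree 1 (4 coefficients).

    A real network would likely use a higher degree; degree 1 keeps the
    sketch short. Constants are the standard real SH normalizations.
    """
    x, y, z = dirs[..., 0], dirs[..., 1], dirs[..., 2]
    c0 = 0.28209479177387814  # Y_0^0
    c1 = 0.4886025119029199   # Y_1^{-1}, Y_1^0, Y_1^1
    return np.stack([np.full_like(x, c0), c1 * y, c1 * z, c1 * x], axis=-1)

# Dense per-pixel ray map and its SH encoding for a tiny 4x8 panorama.
rays = equirect_rays(4, 8)
feats = sh_encode(rays)
```

A distortion-aware model could then consume `feats` in place of (or alongside) pixel coordinates, so fisheye and 360° inputs share one representation without undistortion.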
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Monocular 360 Depth Estimation | Matterport3D official (test) | Delta Acc (1.25x) | 94.8 | 20 |
| Point Map Estimation | ScanNet++ | -- | -- | 16 |
| Large-scale Localization | Matterport3D 2t7WUuJeko7 | Registration Count | 37 | 6 |
| Large-scale Localization | Matterport3D 8194nk5LbLH | Registration Count Success Rate | 100 | 6 |
| Large-scale Localization | Matterport3D pLe4wQe7qrG | Registered Count | 31 | 6 |
| Camera pose estimation | Zip-NeRF (test) | ATE | 0.49 | 3 |
| Camera pose estimation | FIORD (Kitchen_In, meetingroom, and parakennus scenes) | ATE | 0.44 | 3 |
| Camera pose estimation | FIORD | RRA@30 | 100 | 3 |
| Camera pose estimation | Stanford2D3D (area_5a and area_5b) | RRA@30 | 94.05 | 3 |
| Point Map Estimation | Matterport3D | Mean Accuracy | 9.4 | 3 |