CalibAnyView: Beyond Single-View Camera Calibration in the Wild

About

Camera calibration is a fundamental prerequisite for reliable geometric perception, yet classical approaches rely on controlled acquisition setups that are impractical for in-the-wild imagery. Recent learning-based methods have shown promising results for single-view calibration, but inherently neglect geometric consistency across multiple views. We introduce CalibAnyView, a unified formulation that supports an arbitrary number of input views ($N \geq 1$) by explicitly modeling cross-view geometric consistency. To facilitate this, we construct a large-scale multi-view video dataset covering diverse real-world scenarios, including multiple camera models, dynamic scenes, realistic motion trajectories, and heterogeneous lens distortions. Building on this dataset, we develop a multi-view transformer that predicts dense perspective fields, which are further integrated into a geometric optimization framework to jointly estimate camera intrinsics and gravity direction. Extensive experiments demonstrate that CalibAnyView consistently outperforms state-of-the-art methods, achieves strong robustness under single-view settings, and further improves with multi-view inference, providing a reliable foundation for downstream tasks such as 3D reconstruction and robotic perception in the wild.
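To make the quantities in the abstract concrete, here is a minimal sketch of how camera intrinsics and gravity direction determine one component of a perspective field, the per-pixel latitude (the angle between a pixel's viewing ray and the horizontal plane). It assumes an undistorted pinhole camera, which is narrower than the camera models and lens distortions the paper covers. The function names, the camera-frame axis convention, and the example values are illustrative assumptions, not the authors' implementation.

```python
import math

def focal_from_vfov(vfov_deg, image_height):
    """Pinhole focal length in pixels from the vertical field of view."""
    return 0.5 * image_height / math.tan(math.radians(vfov_deg) / 2.0)

def vfov_from_focal(focal_px, image_height):
    """Vertical field of view in degrees from a pinhole focal length in pixels."""
    return math.degrees(2.0 * math.atan(0.5 * image_height / focal_px))

def pixel_latitude(u, v, focal_px, cx, cy, up_cam):
    """Latitude in degrees of the ray through pixel (u, v).

    Assumed convention: camera frame with x right, y down, z forward;
    up_cam is the world "up" direction (opposite to gravity) expressed
    in that camera frame.
    """
    # Back-project the pixel to a viewing ray (pinhole, no distortion).
    x, y, z = (u - cx) / focal_px, (v - cy) / focal_px, 1.0
    ray_norm = math.sqrt(x * x + y * y + z * z)
    ux, uy, uz = up_cam
    up_norm = math.sqrt(ux * ux + uy * uy + uz * uz)
    # Latitude is the arcsine of the cosine between the ray and "up".
    s = (x * ux + y * uy + z * uz) / (ray_norm * up_norm)
    s = max(-1.0, min(1.0, s))  # guard against rounding outside [-1, 1]
    return math.degrees(math.asin(s))

# Hypothetical 640x480 image with a 90 degree vertical FoV and a level camera.
f = focal_from_vfov(90.0, 480)
lat_center = pixel_latitude(320, 240, f, 320, 240, (0.0, -1.0, 0.0))
lat_top = pixel_latitude(320, 0, f, 320, 240, (0.0, -1.0, 0.0))
```

For a level camera the latitude at the principal point is zero and the latitude at the top-center pixel equals half the vertical FoV, which is why a dense latitude map constrains both the focal length and the gravity direction.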

Boying Li, Cheng Zhang, Weirong Chen, Daniel Cremers, Ian Reid, Hamid Rezatofighi • 2026

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Camera Understanding	MegaDepth	FoV AUC@1°	14.8	31
Camera Understanding	Stanford2D3D	FoV AUC (Threshold 1°)	27.1	26
Camera Understanding	TartanAir	FoV AUC@1°	21.9	26
Camera Understanding	LaMAR	FoV AUC@1°	24.6	26
Camera Calibration	Proposed Dataset (test)	Field of View (FoV)	4.54	11
Multi-view camera calibration	Stanford2D3D 157 windows	vFoV Error [°]	3.37	7
Multi-view camera calibration	TartanAir 205 windows	Vertical Field of View Error	3.12	7
