PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

About

Camera pose estimation is a long-standing computer vision problem that to date often relies on classical methods, such as handcrafted keypoint matching, RANSAC and bundle adjustment. In this paper, we propose to formulate the Structure from Motion (SfM) problem inside a probabilistic diffusion framework, modelling the conditional distribution of camera poses given input images. This novel view of an old problem has several advantages. (i) The nature of the diffusion framework mirrors the iterative procedure of bundle adjustment. (ii) The formulation allows a seamless integration of geometric constraints from epipolar geometry. (iii) It excels in typically difficult scenarios such as sparse views with wide baselines. (iv) The method can predict intrinsics and extrinsics for an arbitrary number of images. We demonstrate that our method, PoseDiffusion, significantly improves over classic SfM pipelines and learned approaches on two real-world datasets. Finally, we observe that our method can generalize across datasets without further training. Project page: https://posediffusion.github.io/
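
To make point (ii) concrete, here is a minimal sketch of the standard two-view epipolar constraint. The abstract does not say how PoseDiffusion integrates this constraint, so the code shows only the textbook quantity: the essential matrix E = [t]x R and the Sampson error of one correspondence. The function names are hypothetical, and it assumes normalized (calibrated) homogeneous image points and the convention x2 ~ R x1 + t.

```python
def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v == t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def essential_matrix(rotation, translation):
    """E = [t]x R for a relative pose (assumed convention: x2 ~ R x1 + t)."""
    return matmul(skew(translation), rotation)


def sampson_error(e, x1, x2):
    """First-order geometric error of the constraint x2^T E x1 = 0.

    x1 and x2 are homogeneous points (x, y, 1) in normalized coordinates.
    """
    e_x1 = matvec(e, x1)
    e_t = [[e[j][i] for j in range(3)] for i in range(3)]  # transpose of E
    et_x2 = matvec(e_t, x2)
    residual = sum(x2[i] * e_x1[i] for i in range(3))
    denom = e_x1[0] ** 2 + e_x1[1] ** 2 + et_x2[0] ** 2 + et_x2[1] ** 2
    return residual ** 2 / denom
```

A correspondence that is consistent with the pose gives an error close to zero, and the error grows as the pose hypothesis departs from the true relative pose.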

Jianyuan Wang, Christian Rupprecht, David Novotny • 2023

Related benchmarks

Task	Dataset	Metric	Value
Multi-view pose regression	CO3D v2	RRA@15	80.5	31
6D Object Pose Estimation	Toyota-Light (TOYL) (test)	AR	8.1	24
Camera pose estimation	IMC	AUC (3° Threshold)	0.1231	20
Multi-View Pose Estimation	ScanNet supervised (test)	RRE	96.7	18
Multi-view pose regression	RealEstate10K	mAA(30)	48	15
Camera pose estimation	CO3D 10-view v2	RRA@15	53.2	12
Relative Camera Pose Estimation	CO3D v2 (test)	RRA@15	80.5	12
Visual Localization	Chang'e-3 Real Flight Dataset (test)	Translational Error	20.7	11
Visual Localization	Synthetic Dataset (T1)	Translational Error (m)	20.65	11
Visual Localization	Synthetic Dataset (T2)	Translation Error (m)	25.88	11

Showing 10 of 17 rows
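
The RRA@15 rows are not defined on this page. As a rough sketch, assuming the common definition (relative rotation accuracy: the percentage of image pairs whose predicted relative rotation is within 15 degrees of the ground-truth relative rotation), the metric can be computed as follows. The function names are hypothetical, and the rotations are assumed to be 3x3 world-to-camera matrices given as nested lists.

```python
import math
from itertools import combinations


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose of a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rotation_angle_deg(r):
    """Rotation angle of a 3x3 rotation matrix, in degrees."""
    cos_angle = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def relative_rotation_accuracy(pred, gt, threshold_deg=15.0):
    """Percentage of camera pairs with relative rotation error below threshold.

    pred and gt are equally long lists of rotations (assumed world-to-camera).
    """
    pairs = list(combinations(range(len(pred)), 2))
    if not pairs:
        raise ValueError("need at least two cameras")
    hits = 0
    for i, j in pairs:
        rel_pred = matmul(pred[j], transpose(pred[i]))
        rel_gt = matmul(gt[j], transpose(gt[i]))
        error = rotation_angle_deg(matmul(transpose(rel_pred), rel_gt))
        if error < threshold_deg:
            hits += 1
    return 100.0 * hits / len(pairs)
```

Because the metric compares relative rotations between pairs of cameras, it does not depend on the choice of world coordinate frame, which is why it suits methods that recover poses only up to a global transformation.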
