
Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction

About

Predicting the pose of objects from a single image is an important but difficult computer vision problem. Methods that predict a single point estimate do not predict the pose of objects with symmetries well and cannot represent uncertainty. Alternatively, some works predict a distribution over orientations in $\mathrm{SO}(3)$. However, training such models can be computation- and sample-inefficient. Instead, we propose a novel mapping of features from the image domain to the 3D rotation manifold. Our method then leverages $\mathrm{SO}(3)$ equivariant layers, which are more sample efficient, and outputs a distribution over rotations that can be sampled at arbitrary resolution. We demonstrate the effectiveness of our method at object orientation prediction, and achieve state-of-the-art performance on the popular PASCAL3D+ dataset. Moreover, we show that our method can model complex object symmetries, without any modifications to the parameters or loss function. Code is available at https://dmklee.github.io/image2sphere.

David M. Klee, Ondrej Biza, Robert Platt, Robin Walters • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Rotation Prediction | PASCAL3D+ (test) | Average Rotation Error | 9.8 | 10 |
| Rotation Prediction | ModelNet10-SO(3) (test) | Avg Rotation Error | 16.3 | 9 |
| Pose Estimation | SYMSOL I (test) | Avg Log Likelihood (avg) | 3.41 | 6 |
| Pose Estimation | SYMSOL II (test) | Avg Log Likelihood (avg) | 4.84 | 6 |
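
The rotation error metric listed for the two Rotation Prediction benchmarks can be illustrated with a small sketch. It assumes the error is the geodesic angle between the predicted and ground-truth rotation matrices, which is a common choice for this kind of metric. This page does not confirm the exact definition, the unit, or the averaging used by the benchmarks. The function names `rotation_z` and `rotation_error_deg` are hypothetical and are not taken from the authors' code.

```python
import math

def rotation_z(theta):
    """Hypothetical helper: 3x3 rotation matrix for `theta` radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_error_deg(r_pred, r_true):
    """Geodesic angle between two 3x3 rotation matrices.

    Returning degrees is an assumption made for this sketch.
    """
    # trace(r_pred^T r_true) equals the sum of elementwise products.
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

# Two rotations about the same axis that differ by 10 degrees.
err = rotation_error_deg(rotation_z(math.radians(30.0)), rotation_z(math.radians(20.0)))
print(round(err, 3))
```

A benchmark score would then aggregate this per-image error over the test set. A single angle like this cannot capture the symmetric objects in SYMSOL, where several rotations are equally valid, which is why those rows report log likelihood instead.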