Efficient Scene Compression for Visual-based Localization
About
Estimating the pose of a camera with respect to a 3D reconstruction or scene representation is a crucial step for many mixed reality and robotics applications. Given the vast amount of data available today, many applications must limit storage and/or bandwidth to work efficiently. To satisfy these constraints, they compress a scene representation by reducing its number of 3D points. State-of-the-art methods use $K$-cover-based algorithms to compress a scene, but these are slow and hard to tune. To increase speed and simplify parameter tuning, this work introduces a novel approach that compresses a scene representation by means of a constrained quadratic program (QP). Because this QP resembles a one-class support vector machine, we derive a variant of the sequential minimal optimization (SMO) algorithm to solve it. Our approach uses the points corresponding to the support vectors as the subset of points that represents the scene. We also present an efficient initialization method that allows our method to converge quickly. Our experiments on publicly available datasets show that our approach compresses a scene representation quickly while delivering accurate pose estimates.
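To make the $K$-cover baseline mentioned in the abstract concrete, the following is a minimal sketch of a greedy $K$-cover selection. It illustrates the general idea only and is not the method proposed in this work, nor any specific published implementation. The function name `greedy_k_cover`, the `visibility` mapping (3D point id to the set of image ids that observe it), and the toy data are all assumptions made for the example.

```python
from typing import Dict, List, Set


def greedy_k_cover(visibility: Dict[int, Set[int]], num_images: int, k: int) -> List[int]:
    """Greedily pick 3D points until every image observes at least k selected
    points, or until no remaining point improves the coverage.

    visibility: hypothetical mapping from point id to the image ids that see it.
    """
    remaining = [k] * num_images  # observations each image still needs
    candidates = dict(visibility)  # points not selected yet
    selected: List[int] = []

    while candidates and any(r > 0 for r in remaining):
        best_id, best_gain = None, 0
        # Iterate in sorted order so ties are broken by the smallest point id.
        for pid in sorted(candidates):
            # Gain: number of still under-covered images that see this point.
            gain = sum(1 for img in candidates[pid] if remaining[img] > 0)
            if gain > best_gain:
                best_id, best_gain = pid, gain
        if best_id is None:
            break  # no point helps any under-covered image
        for img in candidates.pop(best_id):
            if remaining[img] > 0:
                remaining[img] -= 1
        selected.append(best_id)

    return selected


# Toy scene (assumed data): 5 points observed across 3 images.
visibility = {
    0: {0, 1, 2},
    1: {0, 1},
    2: {2},
    3: {1, 2},
    4: {0},
}
subset = greedy_k_cover(visibility, num_images=3, k=2)
print(subset)
```

Each greedy iteration rescans all remaining points, which hints at why such schemes can be slow on large reconstructions, and the choice of `k` directly controls the size of the compressed scene.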
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Localization | Cambridge Landmarks Church | Median Translation Error (m) | 0.89 | 23 |
| Visual Localization | Cambridge Landmarks College | Median Translation Error (m) | 1.09 | 23 |
| Camera pose estimation | Aachen (Night) | Success Rate (0.25m/2°) | 16.3 | 14 |
| Camera Localization | Aachen Day | Acc @ (0.25m, 2°) | 62.6 | 10 |
| Visual Localization | Cambridge Landmarks ShopFacade | Median Translation Error | 1.4 | 9 |
| Visual Localization | Cambridge Landmarks OldHospital | Median Translation Error (m) | 2.17 | 9 |