Pose Refinement with Joint Optimization of Visual Points and Lines
About
High-precision camera re-localization in a pre-built 3D environment map underpins many tasks, such as augmented reality, robotics, and autonomous driving. Point-based visual re-localization approaches have matured over recent decades but are insufficient in feature-poor scenes. In this paper, we design a complete pipeline for camera pose refinement with points and lines, comprising a line-extraction CNN named VLSE, a line matching method, and a pose optimization approach. We adopt a novel line representation and customize a hybrid convolution block based on the Stacked Hourglass network to detect accurate and stable line features in images. We then apply a geometry-based strategy to obtain precise 2D-3D line correspondences using the epipolar constraint and reprojection filtering. A point-line joint cost function is then constructed to optimize the camera pose, starting from the coarse initial pose given by pure point-based localization. Experiments on open datasets, namely Wireframe and YorkUrban for the line extractor and InLoc duc1 and duc2 for localization performance, confirm the effectiveness of our point-line joint pose optimization method.
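The abstract does not give the exact form of the point-line joint cost function, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes a pinhole camera, point residuals measured as pixel reprojection error, and line residuals measured as the signed distance from each projected 3D line endpoint to the detected 2D line. The names `project`, `line_through`, `joint_cost`, and the weight `w_line` are hypothetical and chosen for illustration.

```python
import math


def project(K, R, t, X):
    """Project a 3D world point X to pixel coordinates (pinhole model).

    K is (fx, fy, cx, cy); R is a 3x3 rotation (nested lists); t is a 3-vector.
    """
    # Camera-frame coordinates: Xc = R * X + t
    xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    fx, fy, cx, cy = K
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)


def line_through(p, q):
    """Homogeneous 2D line (a, b, c) through pixels p and q, scaled so a^2 + b^2 = 1."""
    a, b = p[1] - q[1], q[0] - p[0]
    c = p[0] * q[1] - q[0] * p[1]
    n = math.hypot(a, b)
    return (a / n, b / n, c / n)


def joint_cost(K, R, t, pts3d, pts2d, lines3d, lines2d, w_line=1.0):
    """Sum of squared point and line residuals for a candidate pose (R, t).

    Assumed formulation: the point term is squared pixel reprojection error;
    the line term is the squared distance from each projected 3D endpoint to
    the matched 2D line, scaled by the hypothetical weight w_line.
    """
    cost = 0.0
    for X, x in zip(pts3d, pts2d):
        u, v = project(K, R, t, X)
        cost += (u - x[0]) ** 2 + (v - x[1]) ** 2
    for (P, Q), (a, b, c) in zip(lines3d, lines2d):
        for endpoint in (P, Q):
            u, v = project(K, R, t, endpoint)
            d = a * u + b * v + c  # signed point-to-line distance in pixels
            cost += w_line * d * d
    return cost


# Synthetic usage: observations generated from a known pose give zero cost there.
K = (500.0, 500.0, 320.0, 240.0)
R_true = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_true = [0.0, 0.0, 0.0]
pts3d = [(0.1, 0.2, 2.0), (-0.3, 0.1, 3.0)]
lines3d = [((0.0, 0.0, 2.0), (1.0, 0.5, 2.5))]
pts2d = [project(K, R_true, t_true, X) for X in pts3d]
lines2d = [line_through(project(K, R_true, t_true, P), project(K, R_true, t_true, Q))
           for P, Q in lines3d]

cost_true = joint_cost(K, R_true, t_true, pts3d, pts2d, lines3d, lines2d)
cost_off = joint_cost(K, R_true, [0.05, 0.02, 0.0], pts3d, pts2d, lines3d, lines2d)
```

In a refinement setting, a cost of this kind would be minimized over the pose with a nonlinear least-squares solver, starting from the coarse point-based estimate. The line term adds constraints in scenes where few reliable point features are available.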
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Localization | 7Scenes Stairs | Median Translation Error (cm) | 4.8 | 25 |
| Visual Localization | 7Scenes Heads | Median Translation Error (cm) | 1.2 | 25 |
| Visual Localization | 7Scenes (Office) | Median Translation Error (cm) | 3.2 | 25 |
| Visual Localization | 7Scenes Chess | Median Translation Error (cm) | 2.4 | 25 |
| Visual Localization | 7Scenes Fire | Median Translation Error (cm) | 2.3 | 25 |
| Visual Localization | 7Scenes Pumpkin | Median Translation Error (cm) | 5.1 | 25 |
| Visual Localization | 7Scenes RedKitchen | Median Translation Error (cm) | 4.3 | 25 |