SuperGlue: Learning Feature Matching with Graph Neural Networks
About
This paper introduces SuperGlue, a neural network that matches two sets of local features by jointly finding correspondences and rejecting non-matchable points. Assignments are estimated by solving a differentiable optimal transport problem, whose costs are predicted by a graph neural network. We introduce a flexible context aggregation mechanism based on attention, enabling SuperGlue to reason about the underlying 3D scene and feature assignments jointly. Compared to traditional, hand-designed heuristics, our technique learns priors over geometric transformations and regularities of the 3D world through end-to-end training from image pairs. SuperGlue outperforms other learned approaches and achieves state-of-the-art results on the task of pose estimation in challenging real-world indoor and outdoor environments. The proposed method performs matching in real-time on a modern GPU and can be readily integrated into modern SfM or SLAM systems. The code and trained weights are publicly available at https://github.com/magicleap/SuperGluePretrainedNetwork.
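The optimal transport step mentioned in the abstract can be illustrated with a small sketch. The code below is not the released SuperGlue implementation; it is a minimal NumPy version of log-domain Sinkhorn normalization over a score matrix augmented with a "dustbin" row and column that absorb non-matchable points. The function names `sinkhorn_with_dustbin` and `logsumexp`, the fixed `dustbin_score`, and the iteration count are assumptions made for illustration. In the paper the scores come from the graph neural network; here they are hand-written numbers.

```python
import numpy as np


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)


def sinkhorn_with_dustbin(scores, dustbin_score=1.0, iters=100):
    """Soft assignment between M and N points, with one dustbin per side.

    scores: (M, N) array of pairwise matching scores (higher is better).
    Returns an (M + 1, N + 1) array; the last row and column are dustbins.
    """
    m, n = scores.shape

    # Augment the score matrix with a dustbin row and column.
    # (Assumption: a fixed scalar stands in for a learned dustbin score.)
    aug = np.full((m + 1, n + 1), dustbin_score, dtype=float)
    aug[:m, :n] = scores

    # Target marginals: each real point carries mass 1, and each dustbin
    # can absorb every point of the other set.
    log_a = np.log(np.concatenate([np.ones(m), [n]]))
    log_b = np.log(np.concatenate([np.ones(n), [m]]))

    # Alternate row and column normalization in the log domain.
    u = np.zeros(m + 1)
    v = np.zeros(n + 1)
    for _ in range(iters):
        u = log_a - logsumexp(aug + v[None, :], axis=1)
        v = log_b - logsumexp(aug + u[:, None], axis=0)

    return np.exp(aug + u[:, None] + v[None, :])


# Toy example: 2 points in image A, 3 points in image B.
toy_scores = np.array([[3.0, 0.0, 0.0],
                       [0.0, 3.0, 0.0]])
assignment = sinkhorn_with_dustbin(toy_scores)
matches = assignment[:2, :3].argmax(axis=1)  # best partner for each point in A
```

Because every operation is differentiable, a loss on the resulting assignment matrix can in principle be backpropagated to whatever network produced the scores, which is the property the abstract relies on for end-to-end training.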
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Place Recognition | MSLS (val) | Recall@1 | 78.1 | 236 |
| Visual Place Recognition | Pitts30k | Recall@1 | 88.7 | 164 |
| Visual Place Recognition | Tokyo24/7 | Recall@1 | 88.2 | 146 |
| Visual Place Recognition | MSLS Challenge | Recall@1 | 50.6 | 134 |
| Image Retrieval | Revisited Paris (RPar) (Hard) | mAP | 70.4 | 115 |
| Visual Place Recognition | Nordland | Recall@1 | 29.1 | 112 |
| Image Retrieval | Revisited Paris (RPar) (Medium) | mAP | 86.2 | 100 |
| Visual Place Recognition | Pittsburgh30k (test) | Recall@1 | 87.2 | 86 |
| Relative Pose Estimation | MegaDepth (test) | Pose AUC @5° | 42.2 | 83 |
| Visual Place Recognition | St Lucia | R@1 | 86.5 | 76 |