SuperPoint: Self-Supervised Interest Point Detection and Description
About
This paper presents a self-supervised framework for training interest point detectors and descriptors suitable for a large number of multiple-view geometry problems in computer vision. As opposed to patch-based neural networks, our fully-convolutional model operates on full-sized images and jointly computes pixel-level interest point locations and associated descriptors in one forward pass. We introduce Homographic Adaptation, a multi-scale, multi-homography approach for boosting interest point detection repeatability and performing cross-domain adaptation (e.g., synthetic-to-real). Our model, when trained on the MS-COCO generic image dataset using Homographic Adaptation, is able to repeatedly detect a much richer set of interest points than the initial pre-adapted deep model and any other traditional corner detector. The final system gives rise to state-of-the-art homography estimation results on HPatches when compared to LIFT, SIFT and ORB.
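The abstract describes Homographic Adaptation only at a high level. As a rough illustration of the aggregation idea, and not the authors' implementation, the sketch below warps an image with several random homographies, runs a detector on each warped copy, maps every response map back to the original frame, and averages the results. Everything named here is an assumption made for illustration: `toy_detector` is a hypothetical stand-in for a learned detector, and `sample_homography`, `warp`, the nearest-neighbour sampling, and the noise scales are arbitrary choices. Only NumPy is assumed.

```python
import numpy as np

def toy_detector(img):
    """Hypothetical stand-in for a learned detector: responds where both
    image gradients are non-zero (a crude corner cue)."""
    gy, gx = np.gradient(img.astype(float))
    return np.abs(gx * gy)

def sample_homography(rng):
    """Assumed sampling scheme: a small random perturbation of the identity."""
    H = np.eye(3)
    H[:2, :2] += rng.normal(scale=0.05, size=(2, 2))  # mild rotation/scale/shear
    H[:2, 2] += rng.normal(scale=1.0, size=2)         # translation in pixels
    H[2, :2] += rng.normal(scale=1e-4, size=2)        # slight perspective
    return H

def warp(img, H, out_shape):
    """Inverse-mapping warp: output pixel p takes the value of img at H^{-1} p
    (nearest neighbour). Also returns a mask of pixels that landed inside img."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    src = np.linalg.inv(H) @ pts
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros(h * w, dtype=float)
    out[valid] = img[sy[valid], sx[valid]]
    return out.reshape(h, w), valid.reshape(h, w)

def homographic_adaptation(img, detector, num_homographies, rng):
    """Average detector responses over random homographies, in the original frame."""
    acc = detector(img).astype(float)   # the identity warp is always included
    cnt = np.ones_like(acc)
    for _ in range(num_homographies):
        H = sample_homography(rng)
        H_inv = np.linalg.inv(H)
        warped, mask = warp(img, H, img.shape)       # image as seen under H
        heat = detector(warped)
        back, inside = warp(heat, H_inv, img.shape)  # response mapped back
        seen, _ = warp(mask.astype(float), H_inv, img.shape)
        valid = inside & (seen > 0.5)                # pixels observed in this view
        acc += back * valid
        cnt += valid
    return acc / cnt

# Usage: aggregate responses for a synthetic white square on a black background.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
heatmap = homographic_adaptation(image, toy_detector, 10, np.random.default_rng(0))
```

The per-pixel count `cnt` keeps the average unbiased near the borders, where some warped views do not cover the original image. In a self-supervised setup of the kind the abstract describes, an aggregated map like `heatmap` would presumably serve as the pseudo-label for further training.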
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Place Recognition | MSLS (val) | Recall@1 | 78.1 | 236 |
| Visual Place Recognition | Pitts30k | Recall@1 | 87.2 | 164 |
| Relative Pose Estimation | MegaDepth 1500 | AUC @ 5° | 43.52 | 104 |
| Relative Pose Estimation | MegaDepth (test) | Pose AUC @5° | 51.96 | 83 |
| Homography Estimation | HPatches | Overall Accuracy (< 1px) | 49.81 | 59 |
| Visual Localization | Aachen Day-Night v1.1 (Night) | Success Rate (0.25m, 2°) | 69.1 | 58 |
| Visual Localization | Aachen Day-Night v1.1 (Day) | SR (0.25m, 2°) | 88.3 | 50 |
| Image Matching | Kinect 1 | MS | 0.45 | 38 |
| Image Matching | Kinect 2 | Matching Score (MS) | 0.54 | 38 |
| Image Matching | DeSurT (833 pairs total) | MS Score | 39 | 38 |