
R2D2: Repeatable and Reliable Detector and Descriptor

About

Interest point detection and local feature description are fundamental steps in many computer vision applications. Classical methods for these tasks are based on a detect-then-describe paradigm where separate handcrafted methods are used to first identify repeatable keypoints and then represent them with a local descriptor. Neural networks trained with metric learning losses have recently caught up with these techniques, focusing on learning repeatable saliency maps for keypoint detection and learning descriptors at the detected keypoint locations. In this work, we argue that salient regions are not necessarily discriminative, and can therefore harm the performance of the description. Furthermore, we claim that descriptors should be learned only in regions for which matching can be performed with high confidence. We thus propose to jointly learn keypoint detection and description together with a predictor of the local descriptor discriminativeness. This allows us to avoid ambiguous areas and leads to reliable keypoint detections and descriptions. Our detection-and-description approach, trained with self-supervision, can simultaneously output sparse, repeatable and reliable keypoints that outperform state-of-the-art detectors and descriptors on the HPatches dataset. It also establishes a record on the recently released Aachen Day-Night localization dataset.

Jerome Revaud, Philippe Weinzaepfel, César De Souza, Noe Pion, Gabriela Csurka, Yohann Cabon, Martin Humenberger • 2019
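
The abstract argues that a keypoint should be both repeatable (salient) and reliable (its descriptor is discriminative). The sketch below illustrates one way such a selection could work on two per-pixel score maps. It is not taken from this page: the function name `select_keypoints`, the 3x3 local-maximum test, the `min_reliability` threshold, and ranking by the product of the two scores are all assumptions made for illustration.

```python
from typing import List, Tuple

Map = List[List[float]]


def select_keypoints(
    repeatability: Map,
    reliability: Map,
    top_k: int = 2,
    min_reliability: float = 0.5,  # hypothetical threshold, not from the source
) -> List[Tuple[int, int, float]]:
    """Return up to top_k (row, col, score) tuples, best score first."""
    height, width = len(repeatability), len(repeatability[0])
    candidates = []
    for r in range(height):
        for c in range(width):
            rep = repeatability[r][c]
            # Keep only strict local maxima of repeatability in the 3x3 window.
            neighbours = [
                repeatability[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            ]
            if any(n >= rep for n in neighbours):
                continue
            # Skip ambiguous regions where the descriptor is predicted unreliable.
            if reliability[r][c] < min_reliability:
                continue
            # Assumed ranking rule: product of the two confidences.
            candidates.append((r, c, rep * reliability[r][c]))
    candidates.sort(key=lambda k: k[2], reverse=True)
    return candidates[:top_k]


repeatability = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.1],
    [0.1, 0.2, 0.1, 0.8],
    [0.0, 0.1, 0.3, 0.2],
]
reliability = [
    [0.9, 0.9, 0.9, 0.9],
    [0.9, 0.2, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.7],
    [0.9, 0.9, 0.9, 0.9],
]
keypoints = select_keypoints(repeatability, reliability)
print(keypoints)
```

In this toy input the most salient peak, at row 1, column 1, is rejected because its reliability is low, so only the peak at row 2, column 3 survives. This mirrors the abstract's claim that salient regions are not necessarily discriminative.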

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Relative Pose Estimation | MegaDepth (test) | Pose AUC @5° | 37.84 | 83 |
| Homography Estimation | HPatches | Overall Accuracy (< 1px) | 40 | 59 |
| Pose Estimation | KITTI odometry | AUC5 | 83.37 | 51 |
| Visual Localization | 7Scenes (test) | Chess Median Angular Error (°) | 0.88 | 41 |
| Image Matching | Kinect 2 | Matching Score (MS) | 0.25 | 38 |
| Image Matching | Kinect 1 | MS | 0.17 | 38 |
| Image Matching | DeSurT (833 pairs total) | MS Score | 14 | 38 |
| Image Matching | Simulation | MS | 6 | 38 |
| Homography Estimation | HPatches | AUC @3px | 50.6 | 35 |
| Homography Estimation | HPatches (viewpoint) | Accuracy (<1px) | 26.4 | 27 |
Showing 10 of 44 rows
