
D2-Net: A Trainable CNN for Joint Detection and Description of Local Features

About

In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions. We propose an approach where a single convolutional neural network plays a dual role: it is simultaneously a dense feature descriptor and a feature detector. By postponing the detection to a later stage, the obtained keypoints are more stable than their traditional counterparts based on early detection of low-level structures. We show that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations. The proposed method obtains state-of-the-art performance on both the difficult Aachen Day-Night localization dataset and the InLoc indoor localization benchmark, as well as competitive performance on other benchmarks for image matching and 3D reconstruction.
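To make the "dual role" concrete, here is a minimal sketch of how one feature map could serve as both descriptor and detector. The abstract does not spell out the detection rule, so the rule used here is an assumption: a pixel is kept when its strongest channel is also a strict maximum within that channel's 3x3 neighbourhood. The name `detect_and_describe`, the neighbourhood size, and the L2 normalization are illustrative choices, and the feature map is a plain nested list standing in for the output of a CNN.

```python
from typing import List, Tuple

# Hypothetical stand-in for a CNN output, indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def detect_and_describe(features: FeatureMap) -> List[Tuple[int, int, List[float]]]:
    """Return (row, col, unit_descriptor) for every detected pixel."""
    n_channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    keypoints = []
    for i in range(height):
        for j in range(width):
            # Dense description: the descriptor is the channel vector at (i, j).
            descriptor = [features[c][i][j] for c in range(n_channels)]
            # Channel with the strongest response at this pixel.
            k = max(range(n_channels), key=lambda c: descriptor[c])
            # Late detection (assumed rule): the response must also be a strict
            # maximum within its 3x3 neighbourhood in channel k.
            neighbours = [
                features[k][y][x]
                for y in range(max(0, i - 1), min(height, i + 2))
                for x in range(max(0, j - 1), min(width, j + 2))
                if (y, x) != (i, j)
            ]
            if all(descriptor[k] > v for v in neighbours):
                norm = sum(v * v for v in descriptor) ** 0.5
                unit = [v / norm for v in descriptor] if norm > 0 else descriptor
                keypoints.append((i, j, unit))
    return keypoints


# Toy 2-channel, 3x3 feature map with a single strong response in the centre.
toy = [
    [[0.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]],
]
keypoints = detect_and_describe(toy)
```

Because detection reads the same high-level responses that form the descriptor, no separate low-level corner or blob detector runs first, which is the sense in which detection is postponed.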

Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Pollefeys, Josef Sivic, Akihiko Torii, Torsten Sattler • 2019

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Homography Estimation | HPatches | Overall Accuracy (< 1px) | 16.7 | 81 |
| Visual Localization | Aachen Day-Night v1.1 (Day) | SR (0.25m, 2°) | 84.1 | 70 |
| Homography Estimation | HPatches | AUC @3px | 23.2 | 55 |
| Image Matching | Kinect 1 | MS | 0.2 | 38 |
| Image Matching | Simulation | MS | 11 | 38 |
| Image Matching | Kinect 2 | Matching Score (MS) | 0.23 | 38 |
| Image Matching | DeSurT (833 pairs total) | MS Score | 14 | 38 |
| Pose Estimation | MegaDepth 1500 (test) | AUC @ 5° | 35.4 | 38 |
| Visual Localization | RobotCar Seasons (night) | Recall (0.25m, 2°) | 20.4 | 35 |
| Visual Localization | Extended CMU Seasons Urban | Recall @ (0.25m, 2°) | 94 | 34 |
Showing 10 of 65 rows
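
Several of the visual localization rows report a score at the (0.25m, 2°) threshold. As a rough sketch of how such a number is typically computed, assuming it is the percentage of query images whose estimated pose lies within both the position and the orientation threshold: the function name `localization_recall` and the sample errors below are hypothetical and are not taken from any benchmark.

```python
from typing import List, Tuple


def localization_recall(
    errors: List[Tuple[float, float]],
    max_position_m: float = 0.25,
    max_rotation_deg: float = 2.0,
) -> float:
    """Percentage of queries whose pose error is within both thresholds.

    Each entry of `errors` is (position_error_in_metres, rotation_error_in_degrees).
    """
    if not errors:
        return 0.0
    hits = sum(
        1
        for position_error, rotation_error in errors
        if position_error <= max_position_m and rotation_error <= max_rotation_deg
    )
    return 100.0 * hits / len(errors)


# Made-up pose errors for four query images (illustrative only).
sample_errors = [(0.10, 1.0), (0.30, 1.0), (0.20, 3.0), (0.25, 2.0)]
recall = localization_recall(sample_errors)
```

A query counts only when both errors are small, so a pose that is accurate in position but off in orientation (the third entry above) is a miss.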
