DISK: Learning local features with policy gradient

About

Local feature frameworks are difficult to learn in an end-to-end fashion, due to the discreteness inherent to the selection and matching of sparse keypoints. We introduce DISK (DIScrete Keypoints), a novel method that overcomes these obstacles by leveraging principles from Reinforcement Learning (RL), optimizing end-to-end for a high number of correct feature matches. Our simple yet expressive probabilistic model lets us keep the training and inference regimes close, while maintaining good enough convergence properties to reliably train from scratch. Our features can be extracted very densely while remaining discriminative, challenging commonly held assumptions about what constitutes a good keypoint, as showcased in Fig. 1, and deliver state-of-the-art results on three public benchmarks.

Michał J. Tyszkiewicz, Pascal Fua, Eduard Trulls • 2020
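The policy-gradient idea mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it replaces keypoint selection and matching with a single categorical choice among four candidates, and the reward values (+1 for a "correct" pick, a small penalty otherwise) and all function names are assumptions made for illustration. It shows how a score-function (REINFORCE) estimator yields a gradient through a discrete sampling step, which is the obstacle the abstract describes.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()


def reinforce_gradient(logits, rewards, n_samples, rng):
    """Monte Carlo score-function estimate of d E[reward] / d logits."""
    probs = softmax(logits)
    # Discrete, non-differentiable step: sample which candidate is picked.
    picks = rng.choice(len(logits), size=n_samples, p=probs)
    # For a softmax policy, grad of log p(k) w.r.t. logits is onehot(k) - probs.
    grad_log_p = np.eye(len(logits))[picks] - probs
    # REINFORCE: average of reward * grad log-probability over the samples.
    return (rewards[picks, None] * grad_log_p).mean(axis=0)


def exact_gradient(logits, rewards):
    """Closed-form gradient of the expected reward, for comparison."""
    probs = softmax(logits)
    return probs * (rewards - probs @ rewards)


# Hypothetical rewards: candidates 0 and 3 are "correct", 1 and 2 are not.
rewards = np.array([1.0, -0.25, -0.25, 1.0])

rng = np.random.default_rng(0)
logits = np.zeros(4)
for _ in range(200):
    # Gradient ascent on the expected reward using sampled picks only.
    logits += 0.5 * reinforce_gradient(logits, rewards, 256, rng)

probs = softmax(logits)
print(probs)
```

After training, most of the probability mass should sit on the two rewarded candidates, even though the gradient was estimated from sampled discrete picks alone.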

Related benchmarks

Task	Dataset	Result
Relative Pose Estimation	MegaDepth 1500	AUC @ 20°: 75.42	163
Semantic Correspondence	PF-WILLOW	PCK@0.1 (bbox): 17	109
Relative Pose Estimation	MegaDepth (test)	Pose AUC @5°: 45.31	83
Homography Estimation	HPatches	Overall Accuracy (< 1px): 51.3	81
Visual Localization	Aachen Day-Night v1.1 (Day)	SR (0.25m, 2°): 87.3	70
Visual Localization	Aachen Day-Night v1.1 (Night)	Success Rate (0.25m, 2°): 78	69
Homography Estimation	HPatches	AUC @3px: 52.3	55
Pose Estimation	KITTI odometry	AUC5: 84.14	51
Visual Localization	Aachen Day-Night (day)	Recall @ (0.25m, 2°): 86.9	43
Relative Pose Estimation	ScanNet 1500	AUC@5°: 6.4	39

Showing 10 of 65 rows
