ClusterGNN: Cluster-based Coarse-to-Fine Graph Neural Network for Efficient Feature Matching
About
Graph Neural Networks (GNNs) with attention have been successfully applied to learning visual feature matching. However, current methods learn with complete graphs, resulting in quadratic complexity in the number of features. Motivated by a prior observation that self- and cross-attention matrices converge to a sparse representation, we propose ClusterGNN, an attentional GNN architecture which operates on clusters for learning the feature matching task. Using a progressive clustering module, we adaptively divide keypoints into different subgraphs to reduce redundant connectivity, and employ a coarse-to-fine paradigm to mitigate misclassification within images. Our approach yields a 59.7% reduction in runtime and a 58.4% reduction in memory consumption for dense detection, compared to current state-of-the-art GNN-based matching, while achieving competitive performance on various computer vision tasks.
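To make the complexity argument concrete, the following is a minimal NumPy sketch that contrasts attention over a complete graph with attention restricted to clusters. It is not the authors' implementation: the abstract describes a learned, progressive clustering module, whereas here the cluster labels are a fixed round-robin assignment chosen purely for illustration. The function names (`full_attention`, `clustered_attention`, `pair_count`) and the sizes `n`, `d`, `k` are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(feats):
    # Self-attention over the complete graph: every keypoint attends
    # to every other keypoint, so the score matrix has n * n entries.
    d = feats.shape[1]
    scores = feats @ feats.T / np.sqrt(d)
    return softmax(scores) @ feats

def clustered_attention(feats, labels):
    # Self-attention restricted to keypoints that share a cluster label.
    # Each cluster builds its own (much smaller) score matrix.
    d = feats.shape[1]
    out = np.empty_like(feats)
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        sub = feats[idx]
        scores = sub @ sub.T / np.sqrt(d)
        out[idx] = softmax(scores) @ sub
    return out

def pair_count(labels):
    # Number of attention scores computed when attention stays in-cluster.
    _, counts = np.unique(labels, return_counts=True)
    return int((counts ** 2).sum())

# Illustrative sizes (assumptions, not values from the paper).
rng = np.random.default_rng(0)
n, d, k = 1024, 64, 8
feats = rng.standard_normal((n, d))

# Stand-in for a learned clustering: assign keypoints round-robin.
labels = np.arange(n) % k

full_pairs = n * n                   # scores in the complete graph
cluster_pairs = pair_count(labels)   # scores with k equal-sized clusters
updated = clustered_attention(feats, labels)

print(full_pairs, cluster_pairs, updated.shape)
```

With `k` equal-sized clusters, the number of attention scores drops by a factor of `k`, which is the intuition behind operating on subgraphs rather than on the complete graph. The runtime and memory reductions reported in the abstract come from the authors' full architecture and are not reproduced by this toy example.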
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Homography Estimation | HPatches | Overall Accuracy (< 1px) | 52 | 59 |
| Pose Estimation | YFCC100M | AUC (5°) | 42.62 | 28 |
| Relative Pose Estimation | YFCC100m v1.0 (test) | AUC @ 5° | 35.3 | 22 |
| Visual Localization | Aachen Day-Night 1.0 (Night) | AUC @ (0.25m, 2°) | 85.7 | 18 |
| Visual Localization | Aachen Day-Night 1.0 (Day) | AUC (0.25m, 2°) | 89.4 | 14 |
| Indoor Localization | InLoc DUC1 v1.0 | Acc (0.25m, 10°) | 52.5 | 13 |
| Indoor Localization | InLoc DUC2 v1.0 | SR (0.25m, 10°) | 55 | 13 |
| Visual Localization | Aachen 1.0 (test) | Success Rate Night (0.25m, 2°) | 81.6 | 10 |
| Outdoor visual localization | Aachen Day-Night (day) | Recall (0.25m, 2°) | 89.4 | 4 |
| Outdoor visual localization | Aachen Day-Night (night) | Recall @ 0.25m, 2° | 81.6 | 4 |