LLHA-Net: A Hierarchical Attention Network for Two-View Correspondence Learning

About

Establishing correct correspondences between feature points is a fundamental task in computer vision. However, the presence of numerous outliers among the feature points can significantly degrade the matching results, reducing the accuracy and robustness of the process. A further challenge arises when the proportion of outliers is large: how to extract high-quality information while limiting the errors introduced by negative samples. To address these issues, this paper proposes a novel method called the Layer-by-Layer Hierarchical Attention Network (LLHA-Net), which improves the precision of feature point matching by explicitly handling outliers. Our method combines stage fusion, hierarchical extraction, and an attention mechanism to strengthen the network's representation capability by emphasizing the rich semantic information of feature points. Specifically, we introduce a layer-by-layer channel fusion module, which preserves the semantic feature information from each stage and fuses it across stages, thereby enhancing the representation of the feature points. Additionally, we design a hierarchical attention module that adaptively captures and fuses global perception and structural semantic information using an attention mechanism. Finally, we propose two architectures for extracting and integrating features, improving the adaptability of our network. We conduct experiments on two public datasets, YFCC100M and SUN3D, and the results demonstrate that our proposed method outperforms several state-of-the-art techniques in both outlier removal and camera pose estimation. Source code is available at http://www.linshuyuan.com.
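The abstract does not give implementation details, but the layer-by-layer channel fusion idea (preserve each stage's features, then fuse them with channel attention) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the shapes, the squeeze-and-excite-style gating, and the bottleneck ratio are all assumptions.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Reweight channels of an (N, C) feature map with a gating vector."""
    # Squeeze: average over the N correspondences -> one descriptor per channel
    squeeze = features.mean(axis=0)                      # shape (C,)
    # Excite: two-layer bottleneck followed by sigmoid gating
    hidden = np.maximum(0.0, w1 @ squeeze)               # shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))          # shape (C,)
    return features * gate                               # reweighted (N, C)

def layer_by_layer_fusion(stage_features, rng):
    """Concatenate per-stage features and fuse them via channel attention.

    stage_features: list of (N, C_i) arrays, one per network stage.
    """
    fused = np.concatenate(stage_features, axis=1)       # shape (N, sum(C_i))
    c = fused.shape[1]
    r = 4  # bottleneck reduction ratio (assumed value)
    # Random weights stand in for learned parameters in this sketch
    w1 = rng.standard_normal((c // r, c)) * 0.1
    w2 = rng.standard_normal((c, c // r)) * 0.1
    return channel_attention(fused, w1, w2)
```

In a trained network the gating weights would be learned, so channels carrying stage-specific semantic information can be amplified or suppressed adaptively; the random weights here only demonstrate the data flow.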

Shuyuan Lin, Yu Guo, Xiao Chen, Yanjie Liang, Guobao Xiao, Feiran Huang • 2025

Related benchmarks

Task                    Dataset                    Result             Rank
Camera pose estimation  SUN3D (Known Scene)        mAP @ 5°: 24.09    58
Camera pose estimation  YFCC100M (Known Scene)     mAP @ 5°: 45.16    36
Camera pose estimation  YFCC100M (Unknown Scene)   mAP @ 5°: 57.05    36
Camera pose estimation  SUN3D (Unknown Scene)      mAP @ 5°: 18.93    36
Outlier removal         YFCC100M (Known Scene)     Precision: 63.24   28
Outlier removal         SUN3D (Known Scene)        Precision: 55.53   28
Outlier removal         YFCC100M (Unknown Scene)   Precision: 59.28   9
Outlier removal         SUN3D (Unknown Scene)      Precision: 47.37   9
