iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency

About

The recent emergence of hybrid models has introduced a transformative approach to computer vision, gradually moving beyond conventional convolutional neural networks and vision transformers. However, efficiently combining these two approaches to better capture long-range dependencies in complex images remains a challenge. In this paper, we present iiANET (Inception Inspired Attention Network), an efficient hybrid visual backbone designed to improve the modeling of long-range dependencies in complex visual recognition tasks. The core innovation of iiANET is the iiABlock, a unified building block that integrates a modified global r-MHSA (Multi-Head Self-Attention) and convolutional layers in parallel. This design enables iiABlock to capture global context and local details simultaneously, making it effective for extracting rich and diverse features. By efficiently fusing these complementary representations, iiABlock allows iiANET to achieve strong feature interaction while maintaining computational efficiency. Extensive qualitative and quantitative evaluations on several established benchmarks demonstrate improved performance.

Haruna Yunusa, Adamu Lawan, Abdulganiyu Abdu Yusuf • 2024
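
The parallel design described in the abstract (a global attention branch and a local convolutional branch applied to the same input, then fused) can be illustrated with a toy sketch. This is not the authors' iiABlock: the abstract does not specify how r-MHSA is modified or how the branches are fused, so the single attention head, the identity projections, the 1-D token layout, the fixed smoothing kernel, and fusion by concatenation are all assumptions made here for illustration. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(tokens):
    """Single-head self-attention with identity projections (an assumption).

    Every output token is a weighted average of ALL input tokens, which is
    what gives this branch its global, long-range view.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for query in tokens:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        outputs.append([
            sum(w * value[d] for w, value in zip(weights, tokens))
            for d in range(dim)
        ])
    return outputs


def local_conv(tokens, kernel=(0.25, 0.5, 0.25)):
    """Per-channel 1-D convolution with zero padding.

    Each output token only sees its immediate neighbours: the local branch.
    The fixed smoothing kernel is a placeholder for learned weights.
    """
    n, dim, half = len(tokens), len(tokens[0]), len(kernel) // 2
    outputs = []
    for i in range(n):
        out = [0.0] * dim
        for offset, weight in enumerate(kernel):
            j = i + offset - half
            if 0 <= j < n:
                for d in range(dim):
                    out[d] += weight * tokens[j][d]
        outputs.append(out)
    return outputs


def parallel_block(tokens):
    """Run both branches on the same input and fuse by concatenation.

    Concatenation is an assumed fusion rule, chosen for simplicity.
    """
    global_feats = global_attention(tokens)
    local_feats = local_conv(tokens)
    # `g + l` concatenates the two per-token feature lists.
    return [g + l for g, l in zip(global_feats, local_feats)]


# Four tokens with two channels each; the fused output has four channels.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
fused = parallel_block(tokens)
```

In the fused output, the first two channels of each token come from the attention branch and depend on every token in the sequence, while the last two come from the convolution and depend only on adjacent tokens. A real backbone would operate on 2-D feature maps with learned projections and kernels.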

Related benchmarks

Task                   Dataset             Result                  Rank
Object Detection       COCO 2017 (val)     --                      2643
Semantic Segmentation  ADE20K              mIoU 49.2               366
Instance Segmentation  COCO                APmask 42.1             291
Object Detection       COCO                AP50 (Box) 68.3         237
Image Classification   AID (test)          Overall Accuracy 83.11  223
Image Classification   ImageNet 1k (test)  Top-1 Accuracy 84.9     18
Image Classification   Oxford-III (test)   Top-1 Accuracy 76.23    17
Object Detection       COCO 2017 (test)    --                      10
