Channel Interaction Networks for Fine-Grained Image Categorization
About
Fine-grained image categorization is challenging due to the subtle inter-class differences. We posit that exploiting the rich relationships between channels can help capture such differences, since different channels correspond to different semantics. In this paper, we propose a channel interaction network (CIN), which models the channel-wise interplay both within an image and across images. For a single image, a self-channel interaction (SCI) module is proposed to explore channel-wise correlation within the image. This allows the model to learn complementary features from the correlated channels, yielding stronger fine-grained features. Furthermore, given an image pair, we introduce a contrastive channel interaction (CCI) module to model the cross-sample channel interaction within a metric learning framework, allowing the CIN to distinguish the subtle visual differences between images. Our model can be trained efficiently in an end-to-end fashion without the need for multi-stage training and testing. Finally, comprehensive experiments are conducted on three publicly available benchmarks, where the proposed method consistently outperforms state-of-the-art approaches such as DFL-CNN (Wang, Morariu, and Davis 2018) and NTS (Yang et al. 2018).
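As a rough illustration of what channel-wise interaction within a single image could look like, the sketch below computes a channel-by-channel similarity matrix, turns it into weights, and mixes the channels accordingly. The abstract does not specify the exact formulation, so several details here are assumptions: negating the similarity before the softmax (so that a channel draws on the channels least like it, i.e. complementary ones), the residual addition, and the function names `channel_weights` and `self_channel_interaction`, which are hypothetical. It uses plain Python lists so it has no dependencies; a real model would operate on tensors.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_weights(x):
    """Return a c x c matrix of interaction weights for c flattened channels.

    ASSUMPTION: the similarity is negated before the softmax so that each
    channel attends most to the channels least similar to it.
    """
    # Channel-by-channel dot products (a Gram matrix over channels).
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in x] for ci in x]
    return [softmax([-g for g in row]) for row in gram]


def self_channel_interaction(x):
    """Mix each channel with the others, then add the input back (residual)."""
    weights = channel_weights(x)
    n = len(x[0])
    mixed = [
        [sum(w * ch[p] for w, ch in zip(w_row, x)) for p in range(n)]
        for w_row in weights
    ]
    # ASSUMPTION: a residual connection keeps the original features.
    return [[a + b for a, b in zip(ci, mi)] for ci, mi in zip(x, mixed)]


# Toy feature map: 3 channels, each a flattened 2x2 spatial grid.
features = [
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],  # duplicate of channel 0
    [0.0, 1.0, 1.0, 0.0],  # orthogonal to channels 0 and 1
]
weights = channel_weights(features)
enhanced = self_channel_interaction(features)
```

In this toy input, channel 0 places most of its weight on channel 2, the one it does not overlap with, which is the "complementary features" intuition in miniature.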
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy | 88.1 | 536 |
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy | 94.5 | 348 |
| Image Classification | Stanford Cars (test) | Accuracy | 94.5 | 306 |
| Fine-grained visual classification | FGVC-Aircraft (test) | Top-1 Acc | 92.8 | 287 |
| Image Classification | CUB-200-2011 (test) | Top-1 Acc | 88.1 | 276 |
| Image Classification | FGVC-Aircraft (test) | Accuracy | 92.8 | 231 |
| Fine-grained Image Classification | CUB-200 2011 | Accuracy | 88.1 | 222 |
| Fine-grained Image Classification | Stanford Cars | Accuracy | 94.5 | 206 |
| Fine-grained Visual Categorization | Stanford Cars (test) | Accuracy | 94.5 | 110 |
| Fine-grained visual classification | FGVC Aircraft | Top-1 Accuracy | 92.8 | 41 |