Learning Semantically Enhanced Feature for Fine-Grained Image Classification
About
In this letter, we aim to provide a computationally cheap yet effective approach to fine-grained image classification (FGIC). Unlike previous methods that rely on complex part localization modules, our approach learns fine-grained features by enhancing the semantics of the sub-features of a global feature. Specifically, we first give the sub-features semantic meaning by arranging the feature channels of a CNN into different groups through channel permutation. Meanwhile, to enhance the discriminability of the sub-features, a weighted combination regularization guides the groups to activate on strongly discriminative object parts. Our approach is parameter-parsimonious and can be easily integrated into a backbone model as a plug-and-play module for end-to-end training with only image-level supervision. Experiments verify the effectiveness of our approach and show that its performance is comparable to that of state-of-the-art methods. Code is available at https://github.com/cswluo/SEF
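The channel grouping described above can be illustrated with a minimal sketch. It only shows the mechanics of reordering channels with a permutation and splitting them into groups; it is not the code from the linked repository. The function name `group_channels`, the toy activations, and the hand-picked permutation are all assumptions made for illustration, and the regularization that guides the groups is not modeled here.

```python
def group_channels(features, permutation, num_groups):
    """Reorder channels by `permutation`, then split them into equal-sized groups.

    `features` holds one value per channel (e.g. a globally pooled activation).
    All names here are hypothetical and chosen for illustration only.
    """
    if sorted(permutation) != list(range(len(features))):
        raise ValueError("permutation must contain each channel index exactly once")
    if len(features) % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    # Apply the permutation: position k of the output takes channel permutation[k].
    reordered = [features[i] for i in permutation]
    size = len(reordered) // num_groups
    # Consecutive slices of the reordered channels form the sub-features.
    return [reordered[g * size:(g + 1) * size] for g in range(num_groups)]


# Toy global feature with 6 channels (made-up values).
features = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]
# Hand-picked permutation that places the even-indexed channels first.
permutation = [0, 2, 4, 1, 3, 5]
groups = group_channels(features, permutation, num_groups=2)
print(groups)
```

In this toy setup the first group collects the strongly activated channels and the second the weakly activated ones, which is the kind of per-group coherence that a grouping step is meant to produce.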
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy | 94 | 348 |
| Fine-grained Image Classification | Stanford Cars | Accuracy | 94 | 206 |
| Image Classification | Stanford Dogs (test) | Top-1 Acc | 88.8 | 85 |
| Fine-grained Visual Categorization | CUB-Birds | Accuracy | 87.3 | 26 |
| Fine-grained Image Classification | Stanford Dogs | Score | 4.8 | 18 |