
Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition

About

Attention-based learning for fine-grained image recognition remains challenging: most existing methods treat each object part in isolation and neglect the correlations among them. In addition, the multi-stage or multi-scale mechanisms involved make existing methods less efficient and hard to train end-to-end. In this paper, we propose a novel attention-based convolutional neural network (CNN) which regulates multiple object parts among different input images. Our method first learns multiple attention region features of each input image through the one-squeeze multi-excitation (OSME) module, and then applies the multi-attention multi-class constraint (MAMC) in a metric learning framework. For each anchor feature, the MAMC pulls same-attention same-class features closer, while pushing different-attention or different-class features away. Our method can be easily trained end-to-end and is highly efficient, requiring only one training stage. Moreover, we introduce Dogs-in-the-Wild, a comprehensive dog species dataset that surpasses similar existing datasets in category coverage, data volume and annotation quality. This dataset will be released upon acceptance to facilitate research on fine-grained image recognition. Extensive experiments show substantial improvements of our method on four benchmark datasets.
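The two components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The gating layout in `osme` (global average pooling followed by a two-layer ReLU/sigmoid gate per branch) and the softmax-style loss in `mamc_loss` are assumptions chosen for illustration; the abstract only states that one squeeze feeds several excitations, and that same-attention same-class features are pulled together while all others are pushed away. The function names, weight shapes and the cosine similarity are likewise hypothetical.

```python
import numpy as np


def osme(feature_map, branches):
    """One squeeze shared by several excitation branches (assumed layout).

    feature_map: array of shape (C, H, W).
    branches: list of (w1, w2) pairs, w1 of shape (C_hidden, C) and
              w2 of shape (C, C_hidden); one pair per attention branch.
    Returns one channel-reweighted copy of feature_map per branch.
    """
    squeezed = feature_map.mean(axis=(1, 2))              # single squeeze: (C,)
    outputs = []
    for w1, w2 in branches:
        hidden = np.maximum(w1 @ squeezed, 0.0)           # ReLU
        gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))       # sigmoid gate in (0, 1)
        outputs.append(feature_map * gate[:, None, None])  # per-channel reweighting
    return outputs


def mamc_loss(features, labels):
    """Illustrative multi-attention multi-class constraint (assumed loss form).

    features: array of shape (N, P, D), P attention features per image.
    labels:   array of shape (N,), class label per image.
    For every anchor, positives share both class and attention index;
    every other feature in the batch is a negative.
    Anchors without any positive are skipped, so the batch needs at least
    two images of some class.
    """
    n, p, d = features.shape
    flat = features.reshape(n * p, d)
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    cls = np.repeat(labels, p)            # class of each flattened row
    att = np.tile(np.arange(p), n)        # attention index of each flattened row

    same = (cls[:, None] == cls[None, :]) & (att[:, None] == att[None, :])
    pos = same & ~np.eye(n * p, dtype=bool)   # same class, same attention, not self
    neg = ~same                               # different class or different attention

    exp_sim = np.exp(flat @ flat.T)           # cosine similarities, exponentiated
    pos_sum = (exp_sim * pos).sum(axis=1)
    neg_sum = (exp_sim * neg).sum(axis=1)
    valid = pos.any(axis=1)
    return float(np.mean(-np.log(pos_sum[valid] / (pos_sum[valid] + neg_sum[valid]))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.normal(size=(8, 4, 4))
    weights = [(rng.normal(size=(2, 8)), rng.normal(size=(8, 2))) for _ in range(2)]
    parts = osme(fmap, weights)               # two attention-specific feature maps
    feats = rng.normal(size=(4, 2, 16))       # stand-in for pooled branch features
    print(len(parts), mamc_loss(feats, np.array([0, 0, 1, 1])))
```

In this sketch the loss decreases when features that share a class and an attention branch cluster together and all other pairs stay dissimilar, which matches the pull/push behaviour described above. A real training setup would compute this on learned embeddings and combine it with a classification loss.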

Ming Sun, Yuchen Yuan, Feng Zhou, Errui Ding • 2018

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy 86.5 | 536 |
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy 93 | 348 |
| Image Classification | Stanford Cars (test) | Accuracy 93 | 306 |
| Image Classification | CUB-200-2011 (test) | Top-1 Acc 86.5 | 276 |
| Fine-grained Image Classification | CUB-200 2011 | Accuracy 86.5 | 222 |
| Fine-grained Image Classification | Stanford Cars | Accuracy 93 | 206 |
| Fine-grained Image Classification | Stanford Dogs (test) | Accuracy 85.2 | 117 |
| Fine-grained Visual Categorization | Stanford Cars (test) | Accuracy 93 | 110 |
| Classification | CUB | Accuracy 86.5 | 85 |
| Image Classification | Stanford Dogs (test) | Top-1 Acc 85.2 | 85 |
Showing 10 of 14 rows
