Pairwise Confusion for Fine-Grained Visual Classification
About
Fine-Grained Visual Classification (FGVC) datasets contain small sample sizes, along with significant intra-class variation and inter-class similarity. While prior work has addressed intra-class variation using localization and segmentation techniques, inter-class similarity may also affect feature learning and reduce classification performance. In this work, we address this problem using a novel optimization procedure for end-to-end neural network training on FGVC tasks. Our procedure, called Pairwise Confusion (PC), reduces overfitting by intentionally introducing confusion in the activations. With PC regularization, we obtain state-of-the-art performance on six of the most widely used FGVC datasets and demonstrate improved localization ability. PC is easy to implement, does not need excessive hyperparameter tuning during training, and does not add significant overhead during test time.
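The abstract does not spell out the loss itself, so the following is only a rough sketch of what "introducing confusion in the activations" could look like. It assumes a pairwise regularizer: the batch is split into two halves, samples are paired across the halves, and for pairs with different labels the squared Euclidean distance between their predicted class distributions is added to the usual cross-entropy. The function name `pairwise_confusion_loss`, the `weight` parameter and its default, and the half-batch pairing scheme are all illustrative assumptions, not details taken from this page.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def pairwise_confusion_loss(logits, labels, weight=1.0):
    """Cross-entropy plus a pairwise confusion penalty (illustrative sketch).

    logits: float array of shape (n, num_classes), n must be even.
    labels: int array of shape (n,).
    weight: hypothetical regularization strength (assumed, not from the source).
    """
    n = logits.shape[0]
    if n % 2 != 0:
        raise ValueError("batch size must be even to form pairs")

    probs = softmax(logits)

    # Standard mean cross-entropy over the whole batch.
    cross_entropy = -np.log(probs[np.arange(n), labels]).mean()

    # Pair sample i in the first half with sample i in the second half.
    half = n // 2
    left, right = probs[:half], probs[half:]

    # Assumption: only pairs with different labels are penalized.
    differs = (labels[:half] != labels[half:]).astype(probs.dtype)

    # Squared Euclidean distance between the two predicted distributions.
    sq_dist = ((left - right) ** 2).sum(axis=1)
    confusion = (differs * sq_dist).mean()

    return cross_entropy + weight * confusion
```

Under these assumptions, minimizing the extra term pulls the predicted distributions of differently labeled samples toward each other, which works against overconfident predictions. Setting `weight=0.0` recovers plain cross-entropy, and the penalty needs no computation at test time.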
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy | 86.9 | 536 |
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy | 93.43 | 348 |
| Image Classification | Stanford Cars (test) | Accuracy | 93.43 | 306 |
| Fine-grained visual classification | FGVC-Aircraft (test) | Top-1 Acc | 89.8 | 287 |
| Image Classification | FGVC-Aircraft (test) | Accuracy | 89.2 | 231 |
| Fine-grained Image Classification | CUB-200 2011 | Accuracy | 87.7 | 222 |
| Fine-grained Image Classification | Stanford Cars | Accuracy | 94.3 | 206 |
| Fine-grained visual classification | NABirds (test) | Top-1 Accuracy | 82.8 | 157 |
| Fine-grained Image Classification | Stanford Dogs (test) | Accuracy | 83.8 | 117 |
| Fine-grained Visual Categorization | Stanford Cars (test) | Accuracy | 92.9 | 110 |