Compact Bilinear Pooling
About
Bilinear models have been shown to achieve impressive performance on a wide range of visual tasks, such as semantic segmentation, fine-grained recognition and face recognition. However, bilinear features are high dimensional, typically on the order of hundreds of thousands to a few million, which makes them impractical for subsequent analysis. We propose two compact bilinear representations with the same discriminative power as the full bilinear representation but with only a few thousand dimensions. Our compact representations allow back-propagation of classification errors, enabling end-to-end optimization of the visual recognition system. The compact bilinear representations are derived through a novel kernelized analysis of bilinear pooling, which provides insights into the discriminative power of bilinear pooling and a platform for further research in compact pooling methods. Experiments illustrate the utility of the proposed representations for image classification and few-shot learning across several datasets.
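The abstract does not spell out how the compact representations are computed. As a rough illustration only, the sketch below assumes a Tensor Sketch style projection, which is one known way to approximate the second-order (bilinear) kernel in a few thousand dimensions. The function names `make_sketch_matrix` and `compact_bilinear_pool`, the dense sketch matrices, and the sizes used are all hypothetical choices for this example, not the authors' implementation.

```python
import numpy as np

def make_sketch_matrix(c, d, rng):
    """Build a (c, d) Count Sketch matrix: one random +/-1 entry per row.

    Hypothetical helper: row i has sign s[i] in column h[i], zeros elsewhere.
    """
    h = rng.integers(0, d, size=c)          # random hash bucket per input channel
    s = rng.choice([-1.0, 1.0], size=c)     # random sign per input channel
    m = np.zeros((c, d))
    m[np.arange(c), h] = s
    return m

def compact_bilinear_pool(features, m1, m2):
    """Approximate sum-pooled bilinear features with a Tensor Sketch.

    features: (n_locations, c) array of local descriptors.
    Returns a (d,) vector, where d is the sketch dimension.
    """
    d = m1.shape[1]
    # Two independent Count Sketches of every local descriptor.
    sketch1 = features @ m1
    sketch2 = features @ m2
    # Circular convolution of the two sketches, computed via the FFT.
    # This corresponds to a Count Sketch of the outer product x x^T.
    f1 = np.fft.rfft(sketch1, axis=1)
    f2 = np.fft.rfft(sketch2, axis=1)
    per_location = np.fft.irfft(f1 * f2, n=d, axis=1)
    # Sum pooling over spatial locations.
    return per_location.sum(axis=0)

# Usage with assumed sizes: 196 locations, 512 channels, 4096 output dims.
rng = np.random.default_rng(0)
c, d = 512, 4096
m1 = make_sketch_matrix(c, d, rng)
m2 = make_sketch_matrix(c, d, rng)
features = rng.standard_normal((196, c))
pooled = compact_bilinear_pool(features, m1, m2)
```

With these assumed sizes the full bilinear feature would have 512 × 512 = 262,144 dimensions, while `pooled` has 4,096. Every step is a linear map, an FFT, or an elementwise product, so the operation is differentiable, which is consistent with the end-to-end training the abstract describes.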
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy: 84.3 | 536 |
| Image Classification | DTD | Accuracy: 84 | 487 |
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy: 91.2 | 348 |
| Classification | Cars | Accuracy: 91.2 | 314 |
| Fine-grained visual classification | FGVC-Aircraft (test) | Top-1 Acc: 84.1 | 287 |
| Image Classification | FGVC-Aircraft (test) | -- | 231 |
| Fine-grained Image Classification | CUB-200 2011 | Accuracy: 84 | 222 |
| Fine-grained Visual Categorization | Stanford Cars (test) | Accuracy: 91.2 | 110 |
| Image Classification | Birds | Accuracy: 84.3 | 48 |
| Classification | Airplane | Accuracy: 84.1 | 47 |