Statistically Motivated Second Order Pooling
About
Second-order pooling, a.k.a. bilinear pooling, has proven effective for deep learning based visual recognition. However, the resulting second-order networks yield a final representation that is orders of magnitude larger than that of standard, first-order ones, making them memory-intensive and cumbersome to deploy. Here, we introduce a general, parametric compression strategy that can produce more compact representations than existing compression techniques, yet outperforms both compressed and uncompressed second-order models. Our approach is motivated by a statistical analysis of the network's activations, relying on operations that lead to a Gaussian-distributed final representation, as is inherently the case in first-order deep networks. As evidenced by our experiments, this lets us outperform state-of-the-art first-order and second-order models on several benchmark recognition datasets.
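To make the size gap concrete, the following NumPy sketch contrasts first-order (global average) pooling with plain second-order (bilinear) pooling on a hypothetical feature map; the shapes and channel count are illustrative assumptions, not the paper's actual architecture or its compression strategy.

```python
import numpy as np

# Hypothetical conv feature map: C channels over H*W spatial positions.
C, H, W = 64, 7, 7
X = np.random.randn(C, H * W)

# First-order (global average) pooling: one value per channel -> C-dim vector.
first_order = X.mean(axis=1)          # shape (64,)

# Second-order (bilinear) pooling: average outer product of the local
# descriptors -> a C x C covariance-like matrix, C^2 dims when flattened.
second_order = (X @ X.T) / (H * W)    # shape (64, 64)

# The second-order representation is C times larger: 4096 vs 64 dims here,
# which is the memory blow-up that compression strategies target.
print(first_order.shape, second_order.size // first_order.size)
```

With a typical backbone (e.g. 512 channels), the flattened second-order descriptor already has 512^2 = 262,144 dimensions, which is why compact approximations matter.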
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy | 85.77 | 536 |
| Multi-view 3D Reconstruction | ShapeNet r2n2 (test) | mIoU | 68.2 | 160 |
| Multi-view 3D Reconstruction | ModelNet40 (test) | mIoU | 52 | 112 |
| Texture Classification | DTD | Accuracy | 72.51 | 108 |
| Multi-view 3D Reconstruction | ShapeNet r2n2 13 categories (test) | mIoU | 68.4 | 80 |
| Multi-view 3D Reconstruction | ShapeNet ism (test) | mIoU | 51 | 72 |
| Classification | Airplane | Accuracy | 85.8 | 47 |
| Image Classification | MIT Indoor | Accuracy | 79.7 | 35 |
| Silhouette Prediction | Blobby dataset (test) | mIoU | 0.865 | 32 |