Statistically Motivated Second Order Pooling
About
Second-order pooling, a.k.a. bilinear pooling, has proven effective for deep learning based visual recognition. However, the resulting second-order networks yield a final representation that is orders of magnitude larger than that of standard, first-order ones, making them memory-intensive and cumbersome to deploy. Here, we introduce a general, parametric compression strategy that can produce more compact representations than existing compression techniques, yet outperforms both compressed and uncompressed second-order models. Our approach is motivated by a statistical analysis of the network's activations, relying on operations that lead to a Gaussian-distributed final representation, as is inherently the case in first-order deep networks. As evidenced by our experiments, this lets us outperform state-of-the-art first-order and second-order models on several benchmark recognition datasets.
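To make the size gap concrete, the following NumPy sketch contrasts first-order (global average) pooling with plain second-order (bilinear) pooling on a hypothetical feature map; the shapes and channel count are illustrative assumptions, not the paper's actual architecture or its compression strategy.

```python
import numpy as np

# Hypothetical conv feature map: C channels over H*W spatial positions.
C, H, W = 64, 7, 7
X = np.random.randn(C, H * W)

# First-order (global average) pooling: one value per channel -> C-dim vector.
first_order = X.mean(axis=1)          # shape (64,)

# Second-order (bilinear) pooling: average outer product of the local
# descriptors -> a C x C covariance-like matrix, C^2 dims when flattened.
second_order = (X @ X.T) / (H * W)    # shape (64, 64)

# The second-order representation is C times larger: 4096 vs 64 dims here,
# which is the memory blow-up that compression strategies target.
print(first_order.shape, second_order.size // first_order.size)
```

With a typical backbone (e.g. 512 channels), the flattened second-order descriptor already has 512^2 = 262,144 dimensions, which is why compact approximations matter.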
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy | 85.77 | 536 |
| Multi-view 3D Reconstruction | ShapeNet r2n2 (test) | mIoU | 68.2 | 160 |
| Multi-view 3D Reconstruction | ModelNet40 (test) | mIoU | 52 | 112 |
| Texture Classification | DTD | Accuracy | 72.51 | 108 |
| Multi-view 3D Reconstruction | ShapeNet r2n2 13 categories (test) | mIoU | 68.4 | 80 |
| Multi-view 3D Reconstruction | ShapeNet ism (test) | mIoU | 51 | 72 |
| Classification | Airplane | Accuracy | 85.8 | 47 |
| Image Classification | MIT Indoor | Accuracy | 79.7 | 35 |
| Silhouette Prediction | Blobby dataset (test) | mIoU | 0.865 | 32 |