Is Second-order Information Helpful for Large-scale Visual Recognition?

About

By stacking layers of convolution and nonlinearity, convolutional networks (ConvNets) effectively learn from low-level to high-level features and discriminative representations. Since the end goal of large-scale recognition is to delineate complex boundaries of thousands of classes, adequate exploration of feature distributions is important for realizing the full potential of ConvNets. However, state-of-the-art works concentrate only on deeper or wider architecture design, while rarely exploring feature statistics higher than first-order. We take a step towards addressing this problem. Our method consists in covariance pooling of high-level convolutional features, in place of the most commonly used first-order pooling. The main challenges involved are robust covariance estimation given a small sample of high-dimensional features and the use of the manifold structure of covariance matrices. To address these challenges, we present a Matrix Power Normalized Covariance (MPN-COV) method. We develop forward and backward propagation formulas for the nonlinear matrix functions so that MPN-COV can be trained end-to-end. In addition, we analyze both qualitatively and quantitatively its advantage over the well-known Log-Euclidean metric. On the ImageNet 2012 validation set, by incorporating MPN-COV we achieve gains of over 4%, 3% and 2.5% for AlexNet, VGG-M and VGG-16, respectively; integrating MPN-COV into a 50-layer ResNet outperforms ResNet-101 and is comparable to ResNet-152. The source code will be available on the project page: http://www.peihuali.org/MPN-COV
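
As a rough illustration of the covariance pooling and matrix power normalization described above, the forward pass can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mpn_cov`, the `(d, n)` feature layout, the biased covariance estimate, and the default exponent `alpha=0.5` are all choices made here for illustration, and the backward pass mentioned in the abstract is not covered.

```python
import numpy as np

def mpn_cov(features, alpha=0.5):
    """Matrix-power-normalized covariance of a feature map (illustrative sketch).

    features: array of shape (d, n), i.e. n spatial locations of d-dimensional
              convolutional features (layout is an assumption of this sketch).
    alpha:    exponent applied to the covariance eigenvalues (assumed default).
    """
    d, n = features.shape
    # Second-order pooling: sample covariance across spatial locations.
    centered = features - features.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / n
    # Matrix power via eigendecomposition of the symmetric PSD covariance.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Clip tiny negative eigenvalues caused by floating-point error.
    eigvals = np.clip(eigvals, 0.0, None)
    return (eigvecs * eigvals ** alpha) @ eigvecs.T

# Hypothetical usage: 8-dimensional features at only 4 locations, i.e. the
# "small sample of high-dimensional features" setting from the abstract.
rng = np.random.default_rng(0)
pooled = mpn_cov(rng.standard_normal((8, 4)))
print(pooled.shape)
```

The eigenvalue power is well defined even when the covariance is rank-deficient, which is the case whenever there are fewer locations than feature dimensions; a matrix logarithm, by contrast, would need the zero eigenvalues to be regularized first.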

Peihua Li, Jiangtao Xie, Qilong Wang, Wangmeng Zuo • 2017

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | ImageNet-1k (val) | -- | 1453 |
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy 86.3 | 536 |
| Image Classification | ImageNet (val) | Top-1 Accuracy 77.07 | 354 |
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy 92.9 | 348 |
| Fine-grained visual classification | FGVC-Aircraft (test) | Top-1 Acc 90.8 | 287 |
| Image Classification | CUB-200 2011 | Accuracy 88.7 | 257 |
| Image Classification | ImageNet (val) | Top-1 Error 38.51 | 72 |
| Fine-grained Image Classification | Birds 1.0 (val) | Accuracy 87.3 | 24 |
| Fine-grained Image Classification | Aircrafts 1.0 (val) | Accuracy 92.4 | 24 |
| Fine-grained Image Classification | Cars 1.0 (val) | Accuracy 93.4 | 23 |
Showing 10 of 13 rows
