
The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes

About

Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformers or MLP-based architectures started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for use in low-data regimes. In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. We augment modern CNNs with fully-connected (FC) layers and show the massive impact this architectural change has in low-data regimes. We further present an online joint knowledge-distillation method that uses the extra FC layers at train time but avoids them at test time. This allows us to improve the generalization of a CNN-based model without any increase in the number of weights at test time. We perform classification experiments for a large range of network backbones and several standard datasets on supervised learning and active learning. In our experiments, the augmented networks significantly outperform the networks without fully-connected layers, reaching a relative improvement of up to 16% in validation accuracy in the supervised setting without adding any extra parameters during inference.

Peter Kocsis, Peter Súkeník, Guillem Brasó, Matthias Nießner, Laura Leal-Taixé, Ismail Elezi • 2022
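
The joint training idea in the abstract (an FC head used only at train time, distilled online into a head that is kept at test time) can be illustrated with a small loss computation. This is a minimal sketch under stated assumptions, not the authors' implementation: the two-head layout, the function names, the temperature, the `alpha` weight, and the exact combination of terms (cross-entropy on both heads plus a KL term from the FC head to the linear head) are all hypothetical choices made for illustration. It uses only the Python standard library and operates on a single example's logits.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one example against an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def joint_distillation_loss(fc_logits, linear_logits, label,
                            temperature=2.0, alpha=1.0):
    """Assumed joint objective: supervise both heads and pull the
    linear (test-time) head towards the FC (train-time) head."""
    ce_fc = cross_entropy(fc_logits, label)
    ce_linear = cross_entropy(linear_logits, label)
    teacher = softmax(fc_logits, temperature)      # FC head, dropped at test time
    student = softmax(linear_logits, temperature)  # head kept at test time
    distill = kl_divergence(teacher, student)
    return ce_fc + ce_linear + alpha * distill


# Usage: logits for one 3-class example whose true class is 0.
loss = joint_distillation_loss([2.0, 0.5, -1.0], [1.0, 0.8, -0.5], label=0)
print(f"joint loss: {loss:.4f}")
```

In this sketch only the linear head would be used at inference, which is how such a setup avoids extra test-time parameters; when both heads produce identical logits the distillation term vanishes and the loss reduces to the two cross-entropy terms.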

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR100 (test) | Top-1 Accuracy | 61.7 | 377 |
| Image Classification | CIFAR10 (test) | Test Accuracy | 90.36 | 284 |
| Image Classification | Caltech-101 | Accuracy | 95.86 | 198 |
| Image Classification | Caltech101 (test) | Accuracy | 95.9 | 121 |
| Image Classification | CIFAR10 (train) | Accuracy | 87.2 | 90 |
| Image Classification | CIFAR-100 (test) | Accuracy | 63.27 | 78 |
| Image Classification | Caltech-256 (test) | Top-1 Acc | 83.07 | 59 |
| Image Classification | Caltech-256 | Accuracy | 81.85 | 36 |
