
ResMLP: Feedforward networks for image classification with data-efficient training

About

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We also train ResMLP models in a self-supervised setup, to further remove priors from employing a labelled dataset. Finally, by adapting our model to machine translation we achieve surprisingly good results. We share pre-trained models and our code based on the Timm library.
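The alternation described above can be sketched in a few lines of NumPy. This is a minimal, illustrative sketch of a single ResMLP block, not the authors' implementation: the function and parameter names (`resmlp_block`, `W_patch`, `W1`, `W2`) are hypothetical, the paper's GELU activation is replaced by ReLU for brevity, and the Affine layer's learnable scale and shift are shown at their initial values.

```python
import numpy as np

def affine(x, alpha, beta):
    # ResMLP's Aff layer: a per-channel scale and shift, used in place of
    # normalization layers (no batch or channel statistics are computed).
    return alpha * x + beta

def resmlp_block(x, W_patch, W1, W2):
    # One ResMLP block on x of shape (num_patches, channels).
    # Hypothetical weights: W_patch mixes patches; W1, W2 form the per-patch MLP.
    n, d = x.shape
    alpha, beta = np.ones(d), np.zeros(d)  # Aff parameters at initialization
    # (i) cross-patch linear layer: patches interact, identically across channels
    z = x + W_patch @ affine(x, alpha, beta)
    # (ii) two-layer feed-forward network: channels interact independently per patch
    h = np.maximum(affine(z, alpha, beta) @ W1, 0.0)  # ReLU stands in for GELU
    return z + h @ W2

# Toy usage: 16 patches with 8 channels each, hidden width 32.
rng = np.random.default_rng(0)
n, d, hidden = 16, 8, 32
x = rng.standard_normal((n, d))
out = resmlp_block(
    x,
    0.1 * rng.standard_normal((n, n)),
    0.1 * rng.standard_normal((d, hidden)),
    0.1 * rng.standard_normal((hidden, d)),
)
```

Both sub-layers are residual, which is what makes the network a "simple residual network": the patch-mixing matrix `W_patch` is shared across all channels, while `W1`/`W2` are shared across all patches.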

Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou • 2021

Related benchmarks

Task                 | Dataset               | Result                 | Rank
---------------------|-----------------------|------------------------|-----
Image Classification | ImageNet-1K 1.0 (val) | Top-1 Accuracy: 81     | 1866
Image Classification | ImageNet-1k (val)     | Top-1 Accuracy: 81     | 1453
Classification       | ImageNet-1K 1.0 (val) | Top-1 Accuracy (%): 81 | 1155
Image Classification | ImageNet-1k (val)     | Top-1 Accuracy: 81     | 840
Image Classification | ImageNet 1k (test)    | Top-1 Accuracy: 81     | 798
Image Classification | CIFAR-100             | Top-1 Accuracy: 89.5   | 622
Image Classification | ImageNet-1K           | Top-1 Acc: 79.4        | 524
Image Classification | ImageNet V2           | Top-1 Acc: 65.5        | 487
Image Classification | Stanford Cars         | --                     | 477
Image Classification | CIFAR-10              | --                     | 471
Showing 10 of 32 rows
