Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

About

Pretraining a neural network on a large dataset is becoming a cornerstone of machine learning, yet it is within reach of only a few communities with large resources. We aim at the ambitious goal of democratizing pretraining. Towards that goal, we train and release a single neural network that can predict high-quality ImageNet parameters of other neural networks. By using the predicted parameters for initialization, we are able to boost training of diverse ImageNet models available in PyTorch. When transferred to other datasets, models initialized with predicted parameters also converge faster and reach competitive final performance.

Boris Knyazev, Doha Hwang, Simon Lacoste-Julien • 2023

Related benchmarks

Task                  Dataset                      Metric            Result  Rank
Image Classification  ImageNet-1K 1.0 (val)        Top-1 Accuracy    49.1    1866
Image Classification  ImageNet-1K                  Top-1 Acc         52.7    836
Image Classification  CIFAR-10                     Accuracy          93.9    507
Image Classification  Food-101                     Accuracy          76.2    494
Image Classification  Stanford Cars                Accuracy          30.6    477
Image Classification  CUB-200 2011                 Accuracy          45.2    257
Image Classification  Downstream Datasets Average  Average Accuracy  61      57
Image Classification  iNaturalist                  Accuracy          55.5    51