Neural Architecture Search without Training
About
The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained accuracy from its initial state. In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces. Our approach can be readily combined with more expensive search methods; we examine a simple adaptation of regularised evolutionary search. Code for reproducing our experiments is available at https://github.com/BayesWatch/nas-without-training.
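The abstract's key idea is scoring an untrained network by how distinctly its activations separate a mini-batch of inputs. A minimal sketch of that kind of score, assuming binary ReLU activation patterns and a log-determinant of their agreement kernel (the exact formulation used by the paper may differ — see the linked repository for the authors' implementation); `naswot_score` and the toy one-layer network are illustrative names, not from the source:

```python
import numpy as np

def naswot_score(codes):
    """Score a network from the binary ReLU activation patterns of a mini-batch.

    codes: (batch, num_units) array of 0/1 activation indicators.
    Kernel entry K[i, j] counts the units on which inputs i and j agree;
    a larger log-determinant means the patterns overlap less, i.e. the
    untrained network already distinguishes the datapoints.
    """
    k = codes @ codes.T + (1 - codes) @ (1 - codes).T
    sign, logdet = np.linalg.slogdet(k)
    return logdet

# Toy example: one random linear layer followed by ReLU (hypothetical setup).
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 16))        # mini-batch of 32 inputs
W = rng.normal(size=(16, 64))        # random, untrained weights
codes = (X @ W > 0).astype(float)    # binary ReLU activation patterns
score = naswot_score(codes)
```

In a search loop, this score would be computed once per candidate architecture on a single mini-batch — no training — and the highest-scoring candidate kept.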
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-100 (test) | Accuracy | 77.5 | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy | 96 | 3381 |
| Image Classification | CIFAR-10 NAS-Bench-201 (test) | Accuracy | 92.96 | 173 |
| Image Classification | CIFAR-100 NAS-Bench-201 (test) | Accuracy | 70.03 | 169 |
| Image Classification | CIFAR-10 (test) | Test Error Rate | 6.37 | 151 |
| Image Classification | ImageNet-16-120 NAS-Bench-201 (test) | Accuracy | 44.44 | 139 |
| Image Classification | CIFAR-10 NAS-Bench-201 (val) | Accuracy | 91.2 | 119 |
| Image Classification | CIFAR-100 NAS-Bench-201 (val) | Accuracy | 71.95 | 109 |
| Image Classification | ImageNet-16-120 NAS-Bench-201 (val) | Accuracy | 45.7 | 96 |
| Neural Architecture Search | NAS-Bench-201 ImageNet-16-120 (test) | -- | -- | 86 |