Efficient Neural Architecture Search via Parameter Sharing

About

We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set. Meanwhile, the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss. Thanks to parameter sharing between child models, ENAS is fast: it delivers strong empirical performance using far fewer GPU-hours than existing automatic model design approaches, and is notably about 1000x less expensive than standard Neural Architecture Search. On the Penn Treebank dataset, ENAS discovers a novel architecture that achieves a test perplexity of 55.8, establishing a new state of the art among all methods without post-training processing. On the CIFAR-10 dataset, ENAS designs novel architectures that achieve a test error of 2.89%, on par with NASNet (Zoph et al., 2018), whose test error is 2.65%.
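To make the alternating optimization in the abstract concrete, here is a minimal PyTorch sketch under heavy simplifying assumptions: a toy search space with a single decision among three candidate ops, and a plain logits vector standing in for the paper's LSTM controller. All names (`ops`, `child_forward`, the dummy data, learning rates) are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Categorical

# Toy "large computational graph": one decision point choosing among
# candidate ops whose parameters are shared by every sampled child model.
ops = nn.ModuleList([nn.Linear(16, 16), nn.Linear(16, 16), nn.Identity()])
head = nn.Linear(16, 10)
shared_params = list(ops.parameters()) + list(head.parameters())

# Stand-in controller: learnable logits over the candidate ops
# (the paper uses an LSTM controller; a logits vector keeps the sketch short).
ctrl_logits = nn.Parameter(torch.zeros(len(ops)))

w_opt = torch.optim.SGD(shared_params, lr=0.05)   # trains shared weights omega
c_opt = torch.optim.Adam([ctrl_logits], lr=0.01)  # trains controller theta

def child_forward(x, op_idx):
    """Run the child model defined by the sampled subgraph (one op here)."""
    return head(F.relu(ops[op_idx](x)))

# Dummy train/validation splits standing in for a real dataset.
x_tr, y_tr = torch.randn(64, 16), torch.randint(0, 10, (64,))
x_va, y_va = torch.randn(64, 16), torch.randint(0, 10, (64,))

baseline = 0.0
for step in range(200):
    # Phase 1: sample a subgraph, then train the shared weights on it
    # by minimizing the usual cross-entropy loss.
    dist = Categorical(logits=ctrl_logits)
    arch = dist.sample()
    loss = F.cross_entropy(child_forward(x_tr, arch.item()), y_tr)
    w_opt.zero_grad()
    loss.backward()
    w_opt.step()

    # Phase 2: update the controller with REINFORCE; the reward is the
    # sampled child's validation accuracy.
    with torch.no_grad():
        preds = child_forward(x_va, arch.item()).argmax(dim=1)
        reward = (preds == y_va).float().mean().item()
    baseline = 0.95 * baseline + 0.05 * reward  # moving-average baseline
    c_loss = -dist.log_prob(arch) * (reward - baseline)
    c_opt.zero_grad()
    c_loss.backward()
    c_opt.step()
```

The point parameter sharing buys is visible in Phase 1: the same shared weights are updated no matter which subgraph was sampled, so no child model is ever trained from scratch.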

Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean · 2018

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | CIFAR-100 (test) | -- | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy: 97.11 | 3381 |
| Language Modeling | WikiText-2 (test) | PPL: 70.4 | 1541 |
| Image Classification | CIFAR-10 (test) | -- | 906 |
| Image Classification | CIFAR-100 | -- | 622 |
| Language Modeling | PTB (test) | Perplexity: 58.6 | 471 |
| Image Classification | CIFAR-10 | -- | 471 |
| Language Modeling | Penn Treebank (test) | Perplexity: 55.8 | 411 |
| Image Classification | CIFAR-10 (val) | -- | 329 |
| Image Classification | CIFAR10 (test) | Test Accuracy: 97.11 | 284 |

Showing 10 of 44 rows.
