Stronger NAS with Weaker Predictors
About
Neural Architecture Search (NAS) often trains and evaluates a large number of architectures. Recent predictor-based NAS approaches attempt to alleviate this heavy computation cost with two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor. Given limited samples, however, these predictors are far from accurate enough to locate top architectures, due to the difficulty of fitting the huge search space. This paper reflects on a simple yet crucial question: if our final goal is to find the best architecture, do we really need to model the whole space well? We propose a paradigm shift from fitting the whole architecture space with one strong predictor to progressively fitting a search path towards the high-performance sub-space through a set of weaker predictors. As a key property of these weak predictors, their probability of sampling better architectures keeps increasing. Hence, we only sample a few well-performing architectures guided by the previously learned predictor and estimate a new, better weak predictor. This embarrassingly simple framework, dubbed WeakNAS, performs coarse-to-fine iterations to gradually refine the ranking of the sampling space. Extensive experiments demonstrate that WeakNAS requires fewer samples to find top-performance architectures on NAS-Bench-101 and NAS-Bench-201. Compared to state-of-the-art (SOTA) predictor-based NAS methods, WeakNAS outperforms all of them by notable margins, e.g., requiring at least 7.5x fewer samples to find the global optimum on NAS-Bench-101; it can also absorb their ideas to boost performance further. Moreover, WeakNAS sets a new SOTA result of 81.3% top-1 accuracy in the ImageNet MobileNet search space. The code is available at https://github.com/VITA-Group/WeakNAS.
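The iterative procedure described above (sample a few architectures, fit a weak predictor, then sample the next batch from the predictor's top-ranked candidates) can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the search space, the `true_performance` oracle, and the deliberately weak per-dimension linear predictor are all hypothetical stand-ins for a real NAS benchmark and predictor model.

```python
import random

# Toy search space: architectures encoded as 4-dim feature vectors. The true
# performance is hidden behind an "expensive" query, standing in for training.
random.seed(0)
SPACE = [tuple(random.random() for _ in range(4)) for _ in range(2000)]

def true_performance(arch):
    # Stand-in for the costly train-and-evaluate step of a real benchmark.
    a, b, c, d = arch
    return a * b + 0.5 * c - 0.25 * d * d

def fit_weak_predictor(samples):
    # A deliberately weak predictor: one linear weight per dimension, set to
    # the covariance of that feature with observed accuracy (no interactions).
    n = len(samples)
    mean_y = sum(y for _, y in samples) / n
    weights = []
    for i in range(4):
        mean_x = sum(x[i] for x, _ in samples) / n
        weights.append(sum((x[i] - mean_x) * (y - mean_y)
                           for x, y in samples) / n)
    return lambda arch: sum(w * v for w, v in zip(weights, arch))

def weaknas(iterations=5, per_iter=10):
    # Start from a small random sample of architecture-performance pairs.
    history = {a: true_performance(a) for a in random.sample(SPACE, per_iter)}
    for _ in range(iterations):
        predictor = fit_weak_predictor(list(history.items()))
        # Guided sampling: evaluate only the top-ranked unseen architectures,
        # progressively narrowing toward the high-performance sub-space.
        candidates = sorted((a for a in SPACE if a not in history),
                            key=predictor, reverse=True)[:per_iter]
        for a in candidates:
            history[a] = true_performance(a)
    return max(history.items(), key=lambda kv: kv[1])

best_arch, best_perf = weaknas()
print(best_perf)
```

Each weak predictor only needs to rank its local neighborhood well enough to point the next batch of samples in the right direction, which is why a crude model suffices at every step.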
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Neural Architecture Search | NAS-Bench-101 1.0 (test) | Test Accuracy | 0.9418 | 22 |
| Neural Architecture Search | NAS-Bench-101 CIFAR-10 (test) | Accuracy | 94.18 | 18 |
| Neural Architecture Search (Performance Prediction) | NAS-Bench-201 (test) | Kendall's Tau | 0.49 | 18 |
| Multi-task Neural Architecture Search | TransNAS-Bench-101 Macro level search space 1.0 | Cls O Acc | 47.4 | 14 |
| Semantic Segmentation | TransNAS-Bench-101 Micro level search space | mIoU | 25.9 | 13 |
| Autoencoder | TransNAS-Bench-101 Micro level search space | SSIM | 56.9 | 13 |
| Object Classification | TransNAS-Bench-101 Micro level search space | Accuracy | 45.66 | 13 |
| Room Layout Reconstruction | TransNAS-Bench-101 Micro level search space | L2 Loss | 60.31 | 13 |
| Scene Classification | TransNAS-Bench-101 Micro level search space | Accuracy | 54.78 | 13 |
| Surface Normal Prediction | TransNAS-Bench-101 Micro level search space | SSIM | 57.21 | 13 |