Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks

About

Most existing neural architecture search (NAS) benchmarks and algorithms prioritize well-studied tasks, e.g. image classification on CIFAR or ImageNet. This makes the performance of NAS approaches in more diverse areas poorly understood. In this paper, we present NAS-Bench-360, a benchmark suite to evaluate methods on domains beyond those traditionally studied in architecture search, and use it to address the following question: do state-of-the-art NAS methods perform well on diverse tasks? To construct the benchmark, we curate ten tasks spanning a diverse array of application domains, dataset sizes, problem dimensionalities, and learning objectives. Each task is carefully chosen to interoperate with modern CNN-based search methods while possibly being far-afield from its original development domain. To speed up and reduce the cost of NAS research, for two of the tasks we release the precomputed performance of 15,625 architectures comprising a standard CNN search space. Experimentally, we show the need for more robust NAS evaluation of the kind NAS-Bench-360 enables by showing that several modern NAS procedures perform inconsistently across the ten tasks, with many catastrophically poor results. We also demonstrate how NAS-Bench-360 and its associated precomputed results will enable future scientific discoveries by testing whether several recent hypotheses promoted in the NAS literature hold on diverse tasks. NAS-Bench-360 is hosted at https://nb360.ml.cmu.edu.

Renbo Tu, Nicholas Roberts, Mikhail Khodak, Junhong Shen, Frederic Sala, Ameet Talwalkar• 2021

Related benchmarks

TaskDatasetResultRank
Audio TaggingFSD50K (eval)
mAP94
19
Genomic Sequence ClassificationDeepSEA
AUROC0.68
10
Neural Architecture SearchNAS-Bench-360 (test)
ECG 1-F1 Error0.28
10
Satellite Time-Series Classificationsatellite
0-1 Error Rate12.51
10
Cosmic Ray DetectionCosmic
1-AUROC0.49
9
Diverse Prediction TasksNAS-Bench-360 (test)
Darcy Score0.026
9
ECG ClassificationECG
1-F1 Score0.28
9
Protein Contact Map PredictionPSICOV
MAE82.94
9
PDE solvingDarcy Flow
Relative L2 Error0.8
9
Cross-modal adaptationNAS-Bench-360
Darcy (Relative L2)0.008
9
Showing 10 of 21 rows

Other info

Code

Follow for update