DARTS: Differentiable Architecture Search
About
This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
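The continuous relaxation mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a single "edge" whose output is a softmax-weighted mixture of three hypothetical scalar operations (`zero`, `identity`, `double`), and it updates only the architecture weights `alpha` by plain gradient descent on a squared-error loss. The function names, learning rate, and data are all illustrative assumptions.

```python
import math

def softmax(alpha):
    """Numerically stable softmax over a list of floats."""
    m = max(alpha)
    exps = [math.exp(a - m) for a in alpha]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations on a scalar input (illustration only).
CANDIDATE_OPS = {
    "zero": lambda x: 0.0,
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
}

def mixed_op(x, alpha):
    """Relaxed edge: softmax-weighted sum of all candidate op outputs."""
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))

def alpha_gradient(x, target, alpha):
    """Analytic gradient of (mixed_op(x) - target)^2 with respect to alpha."""
    weights = softmax(alpha)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    y = sum(w * o for w, o in zip(weights, outputs))
    # d y / d alpha_j = w_j * (o_j - y) for a softmax mixture.
    return [2.0 * (y - target) * w * (o - y) for w, o in zip(weights, outputs)]

def search(data, steps=200, lr=0.5):
    """Gradient descent on the architecture weights alone."""
    alpha = [0.0] * len(CANDIDATE_OPS)
    for _ in range(steps):
        for x, target in data:
            grad = alpha_gradient(x, target, alpha)
            alpha = [a - lr * g for a, g in zip(alpha, grad)]
    return alpha

def discretize(alpha):
    """Keep the operation with the largest architecture weight."""
    names = list(CANDIDATE_OPS)
    return names[max(range(len(alpha)), key=lambda i: alpha[i])]

# Toy data generated by the "double" operation.
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5)]
alpha = search(data)
best = discretize(alpha)  # "double"
```

Because the mixture is differentiable in `alpha`, the choice among discrete operations becomes an ordinary optimization problem, and taking the argmax at the end recovers a discrete architecture. A real search would also train the operations' own weights and would score `alpha` on held-out data, both of which this sketch omits.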
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | Accuracy 82.46 | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy 54.3 | 3381 |
| Object Detection | COCO 2017 (val) | AP 31.5 | 2454 |
| Language Modeling | WikiText-2 (test) | PPL 66.9 | 1541 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy 73.3 | 1453 |
| Person Re-Identification | Market1501 (test) | Rank-1 Accuracy 94.8 | 1264 |
| Image Classification | ImageNet (val) | Top-1 Acc 74.9 | 1206 |
| Image Classification | CIFAR-10 (test) | Accuracy 95.88 | 906 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy 73.3 | 840 |
| Image Classification | ImageNet 1k (test) | Top-1 Accuracy 73.3 | 798 |