Neural Predictor for Neural Architecture Search

Abstract

Neural Architecture Search methods are effective but often rely on complex algorithms to find the best architecture. We propose a conceptually much simpler approach with three basic steps. First, we train N random architectures to generate N (architecture, validation accuracy) pairs and use them to train a regression model that predicts the accuracy of an architecture. Next, we use this regression model to predict the validation accuracies of a large number of random architectures. Finally, we train the top-K predicted architectures and deploy the model with the best validation result. Although this approach seems simple, it is more than 20× as sample-efficient as Regularized Evolution on the NASBench-101 benchmark. On ImageNet, it approaches the efficiency of more complex and restrictive weight-sharing approaches such as ProxylessNAS, while being fully (embarrassingly) parallelizable and friendly to hyper-parameter tuning.
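
The three steps can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: architectures are encoded as tuples of binary choices, `train_and_evaluate` is a hypothetical stand-in for real training that returns a noisy synthetic accuracy, and a plain linear regressor fitted by gradient descent takes the place of the regression model, which the abstract does not specify.

```python
import random

# Hidden per-choice contributions, used only by the toy oracle below (assumption).
_TOY_WEIGHTS = (0.05, -0.02, 0.04, 0.01, -0.03, 0.06, 0.02, -0.01)


def sample_architecture(rng):
    """Hypothetical encoding: one binary design choice per position."""
    return tuple(rng.randint(0, 1) for _ in _TOY_WEIGHTS)


def train_and_evaluate(arch, rng):
    """Toy stand-in for real training: returns a noisy 'validation accuracy'."""
    score = 0.7 + sum(w * a for w, a in zip(_TOY_WEIGHTS, arch))
    return score + rng.gauss(0.0, 0.005)


def fit_linear_predictor(pairs, epochs=500, lr=0.05):
    """Fit accuracy ~ bias + w . arch with stochastic gradient descent."""
    weights, bias = [0.0] * len(pairs[0][0]), 0.0
    for _ in range(epochs):
        for arch, acc in pairs:
            err = bias + sum(w * a for w, a in zip(weights, arch)) - acc
            bias -= lr * err
            weights = [w - lr * err * a for w, a in zip(weights, arch)]
    return lambda arch: bias + sum(w * a for w, a in zip(weights, arch))


def neural_predictor_search(n_train=50, n_candidates=2000, top_k=5, seed=0):
    rng = random.Random(seed)

    # Step 1: train N random architectures and fit the predictor on the pairs.
    train_archs = [sample_architecture(rng) for _ in range(n_train)]
    pairs = [(arch, train_and_evaluate(arch, rng)) for arch in train_archs]
    predict = fit_linear_predictor(pairs)

    # Step 2: predict the accuracy of many random candidates (no training needed).
    candidates = sorted({sample_architecture(rng) for _ in range(n_candidates)})
    ranked = sorted(candidates, key=predict, reverse=True)

    # Step 3: train only the top-K predicted candidates and keep the best one.
    finalists = [(train_and_evaluate(arch, rng), arch) for arch in ranked[:top_k]]
    best_acc, best_arch = max(finalists)
    return best_arch, best_acc


if __name__ == "__main__":
    arch, acc = neural_predictor_search()
    print("selected architecture:", arch)
    print("validation accuracy of the selected architecture:", round(acc, 4))
```

In this sketch, only the N + K calls to `train_and_evaluate` correspond to expensive training runs. The calls within each step are independent of one another, which is what makes steps 1 and 3 easy to parallelize.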

DOI
10.1007/978-3-030-58526-6_39
Year