MIT Researchers Look to Speed Neural Net Design
The design of a neural network architecture remains a daunting problem, requiring human expertise and lots of computing resources. The soaring computational requirements of neural architecture search (NAS) algorithms used in developing neural network frameworks make it difficult to search for architectures directly on large-scale tasks such as ImageNet.
While so-called “differentiable” NAS can reduce the GPU computation required, that approach still consumes a great deal of GPU memory. Conventional NAS instead works around the cost with proxy tasks, initially training models on a smaller data set and then scaling the process up. Massachusetts Institute of Technology researchers have therefore proposed a scheme dubbed “proxyless” NAS that cuts computational demand without resorting to such proxies.
In a paper published last month, the MIT team presented a “ProxylessNAS” approach they said “can directly learn the architectures for large-scale target tasks and target hardware platforms.”
Their scheme addresses the heavy memory consumption associated with differentiable NAS, with the goal of cutting GPU memory usage and the hours of graphics processing required down to the level of ordinary model training. At the same time, the approach would still allow the search to run on large data sets.
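The memory saving can be illustrated with a toy sketch, assuming hypothetical candidate operations and made-up architecture parameters (this is not the authors' implementation): a differentiable search evaluates a weighted sum over every candidate operation in a layer, so all outputs must be held in memory, while a proxyless-style search samples a single path per step, so only one output is materialized.

```python
import math
import random

# Toy stand-ins for candidate ops in one "mixed" layer (e.g. conv3x3,
# conv5x5, identity); real ops would be neural network layers.
ops = [lambda x: 3 * x, lambda x: 5 * x, lambda x: x]

# Assumed architecture parameters; softmax turns them into path probabilities.
alpha = [0.2, 1.0, 0.5]
exps = [math.exp(a) for a in alpha]
probs = [e / sum(exps) for e in exps]

def differentiable_mixed_op(x):
    # Differentiable-NAS style: weighted sum over ALL candidates, so every
    # op's output must be kept around -- memory grows with the number of ops.
    return sum(p * op(x) for p, op in zip(probs, ops))

def proxyless_mixed_op(x, rng=random.Random(0)):
    # ProxylessNAS-style path sampling: pick ONE candidate per step according
    # to the probabilities, so only a single op's output is materialized.
    chosen = rng.choices(range(len(ops)), weights=probs)[0]
    return ops[chosen](x)
```

In expectation the sampled path behaves like the full weighted mixture, which is why the search can still learn which operations to keep while paying the memory cost of only one candidate at a time.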
NAS is increasingly being used to automate neural network architecture designs for deep learning tasks such as image recognition and language modeling. The problem is that conventional NAS algorithms are computing and memory hogs: Thousands of models must be trained to accomplish a specific task, the MIT researchers noted.
Conventional search approaches focus on identifying “building blocks [for] proxy tasks,” beginning with smaller data sets or learning using fewer blocks; the best blocks are then “stacked” and transferred for use with a larger target task. The MIT approach eliminates that intermediate step, learning architectures on the target task itself.
Read the full story here at sister web site Datanami.