Advanced Computing in the Age of AI | Thursday, April 25, 2024

MIT Researchers Look to Speed Neural Net Design 


The design of a neural network architecture remains a daunting problem, requiring human expertise and lots of computing resources. The soaring computational requirements of neural architecture search (NAS) algorithms used in developing neural network frameworks make it difficult to search for architectures directly on large-scale tasks such as ImageNet.

While so-called “differentiable” NAS can help reduce the cost in GPU hours, that approach still consumes much GPU memory, which pushes researchers toward proxy methods such as initially training models on a smaller data set, then scaling the process. Massachusetts Institute of Technology researchers have therefore proposed a scheme dubbed “proxyless” NAS that aims to reduce computational demand without relying on such proxies.

In a paper published last month, the MIT team presented a “ProxylessNAS” approach they said “can directly learn the architectures for large-scale target tasks and target hardware platforms.”

Their scheme addresses the heavy memory consumption associated with differentiable NAS, with the goal of reducing both GPU memory consumption and GPU hours to the level of typical model training. At the same time, the approach would allow for searching directly on large data sets.
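The memory gap can be illustrated with a toy comparison. The article does not spell out how the MIT scheme cuts memory, so the sketch below rests on an assumption: that a differentiable search evaluates every candidate operation at a layer and blends the results, while a single-path search samples one candidate and evaluates only that one. The function names and the four toy operations are hypothetical and are not taken from the paper's code.

```python
import math
import random


def softmax(logits):
    """Turn architecture logits into probabilities."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_output(x, candidate_ops, arch_logits):
    """Evaluate every candidate and blend by softmax weight.

    Returns the blended output and how many candidate outputs were
    held at once (a stand-in for activation memory).
    """
    weights = softmax(arch_logits)
    outputs = [op(x) for op in candidate_ops]  # all N outputs exist together
    blended = sum(w * o for w, o in zip(weights, outputs))
    return blended, len(outputs)


def sampled_output(x, candidate_ops, arch_logits, rng):
    """Sample one candidate by probability and evaluate only that path."""
    weights = softmax(arch_logits)
    idx = rng.choices(range(len(candidate_ops)), weights=weights, k=1)[0]
    return candidate_ops[idx](x), 1


# Hypothetical stand-ins for candidate layers (assumption, for illustration).
candidate_ops = [lambda v: v, lambda v: 2.0 * v, lambda v: v * v, lambda v: -v]
arch_logits = [0.0, 0.0, 0.0, 0.0]
rng = random.Random(0)

blended, held_all = mixed_output(3.0, candidate_ops, arch_logits)
single, held_one = sampled_output(3.0, candidate_ops, arch_logits, rng)
print(held_all, held_one)
```

With four candidates, the blended version holds four outputs at once while the sampled version holds one, which is roughly what ordinary training of a single model would need.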

NAS is increasingly being used to automate neural network architecture designs for deep learning tasks such as image recognition and language modeling. The problem is that conventional NAS algorithms are computing and memory hogs: Thousands of models must be trained to accomplish a specific task, the MIT researchers noted.

Conventional search approaches focus on identifying “building blocks [for] proxy tasks,” beginning with smaller data sets or learning using fewer blocks. The best blocks are then “stacked” and transferred for use with a larger target task.
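The proxy-based workflow described above can be sketched in a few lines. This is a minimal illustration, not the researchers' code: the block names, the scores and both helper functions are hypothetical, and the scores are invented purely to make the example run.

```python
from typing import Dict, List


def pick_best_block(proxy_scores: Dict[str, float]) -> str:
    """Return the block with the highest score on the proxy task."""
    return max(proxy_scores, key=proxy_scores.get)


def stack_blocks(block: str, depth: int) -> List[str]:
    """Repeat the chosen block `depth` times to form the larger network."""
    return [block for _ in range(depth)]


# Invented proxy-task scores (assumption, not measured results).
proxy_scores = {"conv3x3": 0.71, "conv5x5": 0.74, "sep_conv3x3": 0.73}

best = pick_best_block(proxy_scores)
target_network = stack_blocks(best, depth=4)
print(best, target_network)
```

The sketch also shows the weak point of the proxy route: the block is chosen on the small task only, so nothing guarantees it is the best choice once stacked for the larger target task.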

Read the full story here at sister web site Datanami.

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).

EnterpriseAI