
Inference Engine Uses SRAM to Speed Edge AI Apps


Flex Logix, the embedded FPGA specialist, has shifted gears by applying its proprietary interconnect technology to launch an inference engine that boosts neural inferencing capacity at the network edge while reducing DRAM bandwidth requirements.

Instead, the inferencing engine draws greater processing bandwidth from less expensive and lower-power SRAMs. That inference approach is also touted as a better way to load the neural weights used for deep learning.

Unlike current CPU, GPU and Tensor-based processors that use programmable software interconnects, the Flex Logix approach leverages its embedded FPGA architecture to provide faster programmable hardware interconnects that require lower memory bandwidth. That, the chip maker said, reduces DRAM bandwidth requirements, and fewer DRAMs translates to lower cost and less power for edge applications.
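
For a rough sense of why DRAM bandwidth dominates this kind of design, consider the back-of-envelope Python sketch below. The model size, inference rate, and the 90 percent on-chip hit rate are illustrative assumptions, not Flex Logix figures:

    # Back-of-envelope sketch of DRAM traffic for weight loading.
    # The model size and frame rate are hypothetical assumptions for
    # illustration, not published NMAX figures.

    WEIGHT_BYTES = 25 * 1024 * 1024   # assumed: 25 MB of INT8 weights
    INFERENCES_PER_SEC = 100          # assumed: real-time video workload

    # Worst case: every weight is re-fetched from DRAM on every inference.
    naive_gb_per_sec = WEIGHT_BYTES * INFERENCES_PER_SEC / 1e9
    print(f"Weights streamed from DRAM: {naive_gb_per_sec:.1f} GB/s")

    # If on-chip SRAM serves most weight reads so only ~10% of the traffic
    # reaches DRAM, the external-memory requirement drops accordingly.
    print(f"With ~90% served from SRAM: {naive_gb_per_sec * 0.1:.2f} GB/s")

Under these assumed numbers, streaming every weight from DRAM would demand roughly 2.6 GB/s of external bandwidth, while serving most reads from SRAM cuts that to a fraction a single low-cost DRAM chip can handle.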

“We see the edge inferencing market as the biggest market over the next five years,” said Flex Logix CEO Geoff Tate. Among the early applications for the low-power inferencing approach are smart surveillance cameras and real-time object recognition, Tate added in an interview.

The company said this week its NMAX neural inferencing platform delivers up to 100 TOPS of peak performance using about one-tenth the “typical” DRAM bandwidth. The programmable interconnect technology is designed to address two key challenges for edge inferencing: reducing data movement and energy consumption.
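
To make the data-movement point concrete, the following minimal Python sketch compares the power spent re-reading weights from off-chip DRAM versus on-chip SRAM. The per-byte energy values, model size, and inference rate are commonly cited ballpark assumptions for illustration, not measured NMAX numbers:

    # Rough data-movement energy comparison: re-fetching weights from
    # off-chip DRAM versus serving them from on-chip SRAM. The per-byte
    # energy figures are ballpark assumptions, not NMAX specifications.

    DRAM_PJ_PER_BYTE = 160.0  # assumed off-chip DRAM access energy (pJ/byte)
    SRAM_PJ_PER_BYTE = 1.25   # assumed on-chip SRAM access energy (pJ/byte)

    def weight_fetch_power_mw(weight_bytes, inferences_per_sec, pj_per_byte):
        """Average power (mW) spent re-reading all weights every inference."""
        return weight_bytes * inferences_per_sec * pj_per_byte * 1e-9

    weights = 8 * 1024 * 1024  # assumed 8 MB model
    rate = 30                  # assumed 30 inferences/sec

    print(f"DRAM: {weight_fetch_power_mw(weights, rate, DRAM_PJ_PER_BYTE):.1f} mW")
    print(f"SRAM: {weight_fetch_power_mw(weights, rate, SRAM_PJ_PER_BYTE):.2f} mW")

With these assumptions, weight traffic alone costs about 40 mW from DRAM versus well under 1 mW from SRAM, a gap that illustrates why keeping weights on-chip matters for power-constrained edge devices.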

Read the full story here at sister web site Datanami.

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).
