Advanced Computing in the Age of AI | Tuesday, June 25, 2024

Inference Chip Vendor Untether AI Brings in $125M In Oversubscribed Round B Funding 

Untether AI, which specializes in AI inferencing chips for neural networks, cloud infrastructure and more, recently sought $100 million in new Round B funding for its operations and growth. Investor interest was strong enough that the round was oversubscribed by 25 percent, bringing in $125 million, the company said in a July 20 announcement.

The $125 million in new funding will be used to continue developing its next generation of chips and products, hire additional engineering talent and expand the company’s customer experience team, Arun Iyengar, Untether AI’s CEO, told EnterpriseAI. The company plans to expand its employee count from 70 today to about 150 in the next six to nine months.


“It is a good validation,” Iyengar said of the oversubscribed funding round. “We have had a lot of good success, and this is a good way for the market to look at it and say, ‘you know what, this is a technology that is here to stay.’ Having people come in and say they will give you money and give you a little bit more than what you are looking for is a positive way of saying that.”

Untether AI’s main products are its 16nm runAI200 inferencing chip and its tsunAImi PCI Express accelerator card, which is powered by four runAI200 chips and provides 2 PetaOps of INT8 performance in a single card. This compute power translates into more than 80,000 frames per second of ResNet-50 v1.5 throughput at batch=1, which is three times the throughput of other products on the market, according to Untether AI. For natural language processing, tsunAImi accelerator cards can process more than 12,000 queries per second of BERT-base.

The company bills its products as “at-memory compute chips,” which place the memory and the processing elements right next to each other on the chip, said Iyengar. That makes communication between memory and compute faster, with lower latency and lower power consumption.

The runAI200 chips, including those on the tsunAImi cards, are made by TSMC on a 16nm process. Both the chips and the cards are being sold to customers, according to the company.

The chips and cards can be used in a variety of industries and applications, including banking and financial services; natural language processing; autonomous vehicles; smart city and retail; and other applications that require high-throughput and low-latency AI acceleration.

The latest Round B funding for Toronto, Canada-based Untether AI was led by an affiliate of Tracker Capital Management, LLC and co-led by Intel Capital, Intel Corporation’s global investment organization. The round also included participation from new investor Canada Pension Plan Investment Board and existing investor Radical Ventures. The company has raised about $150 million since it was founded in 2018.

Iyengar said his company’s focus in the inference market is the infrastructure portion of it, including edge and cloud infrastructure. With AI’s heavy computational requirements driving up power consumption in data centers, Untether AI’s at-memory compute architecture is aimed at breaking through computational bottlenecks to increase compute efficiency, according to the company.

At the heart of the company’s runAI200 inferencing chip is its memory bank, which pairs 385KB of SRAM with a 2D array of 512 processing elements. With 511 banks per chip, each device offers roughly 200MB of memory, enough to run many networks in a single chip. Customers can use the multi-chip partitioning capability in Untether AI’s imAIgine Software Development Kit to split larger networks apart so they can run on multiple devices, or even across multiple tsunAImi accelerator cards.
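
The memory figures above can be checked with simple arithmetic. The following Python sketch is illustrative only: it multiplies the per-bank SRAM by the bank count and estimates how many chips and cards a network of a given size would span. The function names and the 400MB example network are assumptions made for illustration, not part of the imAIgine SDK, and the estimate counts raw capacity only, ignoring how a real partitioner would place weights and activations.

```python
import math

BANK_SRAM_KB = 385     # SRAM per memory bank, from the article
BANKS_PER_CHIP = 511   # memory banks per runAI200 chip, from the article
CHIPS_PER_CARD = 4     # runAI200 chips per tsunAImi card, from the article


def chip_sram_mb(kb_per_mb=1000):
    """Total on-chip SRAM in MB (decimal units by default)."""
    return BANK_SRAM_KB * BANKS_PER_CHIP / kb_per_mb


def chips_needed(model_mb):
    """Naive capacity-only estimate of chips needed for a network (hypothetical helper)."""
    return math.ceil(model_mb / chip_sram_mb())


def cards_needed(model_mb):
    """Cards needed to host that many chips, at four chips per card."""
    return math.ceil(chips_needed(model_mb) / CHIPS_PER_CARD)


print(round(chip_sram_mb(), 1))   # per-chip SRAM in MB
print(chips_needed(400))          # chips for a hypothetical 400MB network
print(cards_needed(400))          # cards for the same network
```

In decimal units, 385KB across 511 banks comes to about 196.7MB per chip, which is consistent with the 200MB figure the vendor quotes.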

For inference acceleration, the runAI200 devices operate using integer data types and a batch size of 1. The chips operate at up to 502 TOPS in their “sport” mode. They may also be configured for maximum efficiency, offering 8 TOPS per watt in “eco” mode, according to the vendor.
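
As a rough consistency check on these figures, the Python sketch below multiplies the per-chip peak by the four chips on a tsunAImi card and converts an efficiency figure into power draw. It is a back-of-the-envelope illustration, not vendor tooling: the function names are hypothetical, and the 100 TOPS workload is an assumed value, since the article does not state the chip's throughput in eco mode.

```python
SPORT_TOPS_PER_CHIP = 502   # peak per chip in "sport" mode, from the article
ECO_TOPS_PER_WATT = 8       # efficiency in "eco" mode, from the article
CHIPS_PER_CARD = 4          # runAI200 chips per tsunAImi card, from the article


def card_peak_petaops(chips=CHIPS_PER_CARD, tops_per_chip=SPORT_TOPS_PER_CHIP):
    """Peak card throughput in PetaOps (1 PetaOp = 1,000 TOPS)."""
    return chips * tops_per_chip / 1000


def watts_at_efficiency(tops, tops_per_watt=ECO_TOPS_PER_WATT):
    """Power needed to sustain a given throughput at a given efficiency."""
    return tops / tops_per_watt


print(card_peak_petaops())        # four chips at peak, in PetaOps
print(watts_at_efficiency(100))   # assumed 100 TOPS workload at eco-mode efficiency
```

Four chips at 502 TOPS each give 2,008 TOPS, which matches the 2 PetaOps quoted for the card.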

Analysts on Untether AI in the Marketplace


Linley Gwennap, principal analyst with The Linley Group, told EnterpriseAI that Untether AI’s products “are showing advantages in terms of performance per watt and overall performance against Nvidia and against some of the other startups in the market.”

In the last few months, investors have been piling money into some of the larger names in the segment, including Graphcore and SambaNova Systems, but they are also “turning to some of these smaller, newer players like Untether AI and Tenstorrent that seem to be delivering what some of these other guys are just promising,” said Gwennap.

On April 13, SambaNova Systems announced that it is getting another $676 million in new funding, which it says it will use to directly take on AI market leader Nvidia. With the latest large cash infusion, SambaNova says it now has total funding of more than $1 billion and a valuation above $5 billion.

AI and ML accelerator startup Groq Inc. announced on April 14 the closing of a $300 million Series C fundraising round that had been rumored for more than a month. Co-led by Tiger Global Management and D1 Capital, with participation from The Spruce House Partnership and Addition, the cash brings Groq's total funding to $367 million, of which $300 million has been raised since the second half of 2020, according to the company.

“To go into the data center, which is where Untether AI is going, you do need a lot of money, so this funding round is very important for them to be able to compete with some of these bigger, well-funded startups, as well as companies like Nvidia,” said Gwennap. “On the technology side, they do have a different twist on things by moving the AI computation next to the memory. They talked about at-memory computing, so their compute is right next to the memory, instead of having to shift data across the chip and across the board from the memory to some separate GPU or other compute device. They can get both enormous bandwidth and reduced power consumption by not having to send the data so far.”

Other companies have worked to solve these problems by throwing a bunch of compute units onto their chips, he said, but then they have found that the bottleneck just moved from the compute units to the memory side of the data.

“This is something that Untether AI has been working on for a few years now, so I think they have kind of an advantage getting this at-memory architecture to market, versus some of these other companies that are just realizing now that, hey, you know, we forgot to do the memory.”


Another analyst, Karl Freund, principal at Cambrian AI Research, emphasized that the company's inference platform has excellent 500 TOPS performance. “That is incredible and should position them well if their software stack can deliver on the promise,” he said.

“It is becoming the norm that chip startups can develop a fast chip,” said Freund. “That is necessary, but insufficient for success. Development environments with a rich and open set of libraries that can squeeze out the performance for different models is hard. I look forward to learning more about Untether AI’s software. This company is one to watch.”