Advanced Computing in the Age of AI | Wednesday, June 19, 2024

Graphcore Introduces Next-Gen Intelligence Processing Unit for AI Workloads 

British hardware designer Graphcore, which emerged from stealth in 2016 to launch its first-generation Intelligence Processing Unit (IPU), has announced its next-generation IPU platform: the IPU-Machine M2000. With the new M2000, Graphcore promises “greater processing power, more memory and built-in scalability for handling extremely large machine intelligence workloads.” The platform – which is available to preorder today – will begin production shipment at the end of 2020.

The M2000 compute blade (which Graphcore describes as “plug-and-play”) delivers a petaflops of “machine intelligence” compute power thanks to four of Graphcore’s new Colossus Mk2 GC200 IPU processors – each containing 1,472 separate IPU-cores and more than 59.4 billion transistors in an architecture Graphcore is calling “the most complex processor ever made.” The GC200 also contains an “unprecedented” 900 MB of high-speed SRAM inside the processor, a three-fold increase over Graphcore’s first-generation IPU.

The second-generation IPU machine. Image courtesy of Graphcore.

The system is supported by Graphcore’s Poplar software stack, allowing users to apply their preferred AI framework while Poplar assembles the compute graph and the necessary runtime programs. The second-generation system offers full backwards compatibility with Graphcore’s first-generation Mk1 IPU products – at an eight-fold speedup, of course.

Graphcore emphasized the scalability of the M2000, saying that the “slim” blade will allow customers to scale up datacenters to include up to 64,000 IPUs for a whopping 16 exaflops of AI-Float machine intelligence compute power. Configurations scaled beyond eight of the M2000s use Graphcore’s rack-scale IPU-POD64, which contains 16 M2000s built into a 19-inch rack.
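The scaling figures above can be checked with some quick arithmetic. The sketch below uses only the numbers quoted in this article and assumes perfectly linear scaling of peak compute, which real workloads are unlikely to achieve; the function and variable names are illustrative and are not part of any Graphcore software.

```python
# Figures quoted in the article.
PFLOPS_PER_M2000 = 1.0   # "a petaflops" of AI compute per M2000 blade
IPUS_PER_M2000 = 4       # four GC200 processors per blade
M2000S_PER_POD64 = 16    # blades in one IPU-POD64 rack
MAX_IPUS = 64_000        # largest configuration Graphcore quotes


def pflops_for_ipus(ipu_count: int) -> float:
    """Peak petaflops for a given IPU count, assuming linear scaling."""
    return ipu_count * PFLOPS_PER_M2000 / IPUS_PER_M2000


# One IPU-POD64: 16 blades x 4 IPUs each.
ipus_per_pod64 = M2000S_PER_POD64 * IPUS_PER_M2000

# The largest quoted configuration, expressed in blades, racks and exaflops.
m2000s_at_max = MAX_IPUS // IPUS_PER_M2000
pod64s_at_max = MAX_IPUS // ipus_per_pod64
exaflops_at_max = pflops_for_ipus(MAX_IPUS) / 1000

print(f"IPUs per IPU-POD64: {ipus_per_pod64}")
print(f"Peak petaflops per IPU-POD64: {pflops_for_ipus(ipus_per_pod64)}")
print(f"M2000 blades for {MAX_IPUS} IPUs: {m2000s_at_max}")
print(f"IPU-POD64 racks for {MAX_IPUS} IPUs: {pod64s_at_max}")
print(f"Peak exaflops at {MAX_IPUS} IPUs: {exaflops_at_max}")
```

Under these assumptions, each GC200 contributes a quarter of a petaflops, so the quoted 16 exaflops at 64,000 IPUs is consistent with the per-blade figure, and the name IPU-POD64 matches the 64 IPUs in a 16-blade rack.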

Source: Graphcore

For connectivity at this scale, Graphcore is using its new, low-latency IPU-Fabric technology, which it says “keeps communication latency close to constant while scaling from 10s of IPUs to 10s of thousands of IPUs.” Users will be able to choose their preferred mix of CPUs and IPUs (connected via Ethernet), and they will be able to dynamically provision those IPUs using Graphcore’s Virtual-IPU tool.

While full production shipments won’t begin until Q4, Graphcore touts a number of early customers, including Microsoft, the University of Oxford, Lawrence Berkeley National Laboratory, Atos and Simula Research Laboratory. 

“We are partnering with Graphcore to make their Mk2 IPU systems products, including IPU-Machine M2000 and IPU-POD scale out systems, available to our customers, specifically large European labs and institutions,” said Arnaud Bertrand, SVP, head of strategy and R&D for big data systems at Atos. “We are already planning with European early customers to build out an IPU cluster for their AI research projects. The IPU new architecture can enable a more efficient way to run AI workloads which fits to the Atos decarbonization initiative and we are delighted to be working with a European AI semiconductor company to realize this future together.”

With this second salvo, Graphcore is aiming to disrupt Nvidia’s market leadership in the increasingly competitive AI silicon market – and it may have a good shot. “With this new product, Graphcore may now be first in line to challenge Nvidia for datacenter AI,” said Karl Freund, senior analyst for AI at Moor Insights & Strategy, “at least for large-scale training.”