Advanced Computing in the Age of AI | Thursday, May 23, 2024

Graphcore Unveils Its Latest and Largest Commercially Available IPU Pods

Graphcore launched its second-generation intelligence processing units (IPUs) in 2020, four years after emerging from stealth. The company is now extending its product line with its largest commercially available IPU-based systems: the IPU-POD128 and the IPU-POD256.

The newly announced IPU-POD128 has 128 Graphcore GC200 IPUs across 32 Graphcore M2000 compute blades and includes 8.2TB of memory. The truly massive IPU-POD256, which looks to take up four full-size datacenter cabinets, contains 256 GC200s across 64 M2000s and 16TB of memory. The two systems deliver 8 and 16 petaflops of FP32 compute performance, respectively.

The GC200 IPU. Image courtesy of Graphcore.

Each Graphcore GC200 IPU, which the company calls “the most complex processor ever made,” contains more than 59.4 billion transistors and 900MB of high-speed SRAM. Graphcore said in a statement that it intends for the new models to help make its IPUs “the worldwide standard for machine intelligence compute.”

The GC200s are generally loaded into M2000 compute blades, also from Graphcore, with four chips and about a petaflop of “machine intelligence” compute power per blade. These blades, in turn, are loaded into Graphcore’s pods. The previously available IPU-POD16, for instance, contains 16 GC200 IPUs across four M2000 blades, alongside a terabyte of memory. The IPU-POD64, meanwhile, contains 64 GC200s across 16 M2000s and 4.1TB of memory.
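The pod configurations above follow from simple multiplication, which the short Python sketch below reproduces. The blade counts, the four-chips-per-blade figure and the 900MB of SRAM per IPU come from the article. The per-blade FP32 figure is an assumption, derived by dividing the IPU-POD128's quoted 8 petaflops by its 32 blades, and the names `PODS` and `pod_summary` are purely illustrative.

```python
IPUS_PER_BLADE = 4       # four GC200 chips per M2000 blade (from the article)
SRAM_MB_PER_IPU = 900    # in-processor SRAM per GC200 (from the article)

# Assumption: 8 FP32 petaflops across the IPU-POD128's 32 blades
# implies roughly 0.25 FP32 petaflops per blade.
FP32_PFLOPS_PER_BLADE = 8 / 32

# M2000 blade counts per pod, as quoted in the article.
PODS = {"IPU-POD16": 4, "IPU-POD64": 16, "IPU-POD128": 32, "IPU-POD256": 64}


def pod_summary(blades):
    """Derive IPU count, FP32 petaflops and total SRAM from a blade count."""
    ipus = blades * IPUS_PER_BLADE
    return {
        "ipus": ipus,
        "fp32_pflops": blades * FP32_PFLOPS_PER_BLADE,
        "sram_gb": ipus * SRAM_MB_PER_IPU / 1000,
    }


if __name__ == "__main__":
    for name, blades in PODS.items():
        print(name, pod_summary(blades))
```

Under that per-blade assumption, the sketch recovers the quoted 128 and 256 IPUs and the 8 and 16 FP32 petaflops for the two new pods. The SRAM total is the on-chip memory only, not the larger per-pod memory figures quoted in the article.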

Graphcore says that these systems are targeted at cloud hyperscalers, national scientific computing labs and companies with large AI teams in markets like financial services or pharmaceuticals. The company cites the systems' agile handling of taxing tasks like training large Transformer-based language models or conducting commercial-scale AI inferencing. Graphcore highlighted strong scaling results across its pods on BERT, with 88 percent scaling efficiency when moving from the IPU-POD16 to the IPU-POD64 and 97 percent scaling efficiency when moving from the IPU-POD64 to the IPU-POD128. On ResNet50, moving from the IPU-POD128 to the IPU-POD256 showed 95 percent scaling efficiency.
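To put the scaling percentages in context, the Python sketch below assumes the common definition of scaling efficiency: the measured speedup divided by the ideal speedup, where the ideal speedup is the ratio of IPU counts. The article does not say how Graphcore computed its figures, so this definition and both function names are assumptions for illustration only.

```python
def scaling_efficiency(ipus_small, ipus_large, throughput_small, throughput_large):
    """Measured speedup divided by the ideal (linear) speedup.

    Assumed definition; the article does not state Graphcore's method.
    """
    ideal_speedup = ipus_large / ipus_small
    measured_speedup = throughput_large / throughput_small
    return measured_speedup / ideal_speedup


def implied_speedup(ipus_small, ipus_large, efficiency):
    """Speedup implied by a quoted efficiency, under the same definition."""
    return (ipus_large / ipus_small) * efficiency


if __name__ == "__main__":
    # Efficiencies quoted in the article: (smaller pod, larger pod, efficiency).
    quoted = [
        ("BERT", 16, 64, 0.88),
        ("BERT", 64, 128, 0.97),
        ("ResNet50", 128, 256, 0.95),
    ]
    for model, small, large, eff in quoted:
        speedup = implied_speedup(small, large, eff)
        print(f"{model}: {small} -> {large} IPUs, about {speedup:.2f}x faster")
```

Read this way, 88 percent efficiency on a fourfold increase in IPUs corresponds to roughly a 3.5x speedup, while 97 and 95 percent on a doubling correspond to roughly 1.9x.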

The IPU-POD256. Image courtesy of Graphcore. (An IPU-POD128 takes up two cabinets, rather than four.)

The IPU-POD128 and IPU-POD256 are shipping to customers now from Atos and other Graphcore partners, and can also be accessed through Graphcore’s “Graphcloud” cloud service.

“We are enthusiastic to add IPU-POD128 and IPU-POD256 systems from Graphcore into our Atos ThinkAI portfolio,” said Agnès Boudot, senior vice president and head of HPC and quantum at Atos, adding that the systems would help Atos to “accelerate [its] customers’ capabilities to explore and deploy larger and more innovative AI models across many sectors, including academic research, finance, healthcare, telecoms and consumer internet.”

This story was originally published on sister website HPCwire.