Advanced Computing in the Age of AI | Monday, June 24, 2024

Core42 Is Building Its 172 Million-core AI Supercomputer in Texas 

UAE-based Core42 is building a 172 million-core AI supercomputer that is expected to become operational later this year.

The system, Condor Galaxy 3, was announced earlier this year and will have 192 nodes with Cerebras WSE-3 chips. The WSE-3 megachip is over 50 times larger than Nvidia’s GPUs and packs significantly more horsepower.

Cerebras WSE-2 vs. Nvidia GPU (Source: Cerebras)

“Deployment of the system, also called CG-3, in Texas will start next month and be completed by September or October,” said Niall Ó Broin, senior director of product management for HPC at Core42, during a presentation at the ISC 2024 supercomputing conference in Hamburg, Germany.

Core42 is emerging as a significant player in datacenters and AI. Last month, Microsoft invested $1.5 billion in G42, Core42’s parent company, to expand the reach of its AI offerings and Azure cloud.

The CG-3 will have 192 nodes of CS-3 servers, which will host the megachips. Each WSE-3 chip has 900,000 cores and 4 trillion transistors, and it is made using the 5-nm process.

“The system will pack a tremendous amount of horsepower. It will have nine square meters of silicon once fully deployed,” Ó Broin said.

Each WSE-3 chip can reach 125 petaflops of peak AI performance. Across 192 nodes, that adds up to 24 exaflops of peak system performance for the CG-3.
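The headline numbers can be cross-checked with simple arithmetic. The short Python sketch below uses only the figures quoted in this article (192 nodes, 900,000 cores and 125 petaflops per WSE-3 chip, nine square meters of silicon); the variable names are illustrative, and the per-chip silicon area is merely what those quoted figures imply, not an official Cerebras specification.

```python
# Back-of-the-envelope check of the CG-3 figures quoted in the article.
# All constants come from the article; names are illustrative only.
NODES = 192                 # CS-3 nodes, one WSE-3 chip each
CORES_PER_WSE3 = 900_000    # cores per WSE-3 chip
PFLOPS_PER_WSE3 = 125       # peak AI petaflops per WSE-3 chip
TOTAL_SILICON_M2 = 9        # "nine square meters of silicon"

# Total core count across the system.
total_cores = NODES * CORES_PER_WSE3

# Peak AI performance; 1 exaflop = 1,000 petaflops.
total_exaflops = NODES * PFLOPS_PER_WSE3 / 1000

# Silicon area per chip implied by the quoted total (1 m^2 = 1,000,000 mm^2).
silicon_per_chip_mm2 = TOTAL_SILICON_M2 / NODES * 1_000_000

print(f"Total cores: {total_cores:,}")
print(f"Peak AI performance: {total_exaflops:g} exaflops")
print(f"Implied silicon per chip: {silicon_per_chip_mm2:,.0f} mm^2")
```

The 172.8 million result matches the "172 million cores" in the headline, and 192 chips at 125 petaflops each give the 24 exaflops cited above.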

“As part of this chip, there will be four terabytes of on-chip memory. That’s not DRAM or HBM. It’s on-chip SRAM,” Ó Broin continued.

Talking about storage, Ó Broin mentioned, “The system will have 12 petabytes of mass storage. We have VAST across our HPC storage infrastructure.”

Last year, Core42 acquired Cerebras systems with 64 CS-2s, which it used to develop JAIS, a bilingual Arabic and English language model.

The CS-2 had the WSE-2 chip, which was made using the 7-nm process and had 850,000 cores.

“One thing we really like about Cerebras in these systems is we’ll be able to use our CS-2 code and bring that on to the CS-3 using PyTorch,” Ó Broin said.

G42 has one system on the Top500 list. Artemis, which has Intel’s 24-core Skylake server CPUs and Nvidia V100 GPUs, entered the Top500 list in November 2020 at No. 26 and is currently at No. 129.

Another system, POD3, made by Huawei with Intel chips, exited the list in 2022.

Core42 was formed in 2023 from the merger of G42 Cloud and G42 Inception AI. The parent company, G42, was founded in 2018.

Core42 is also working with hardware from Nvidia and AMD, and with OpenAI and other companies on AI models.

The U.S. government had been scrutinizing G42 as a possible conduit through which China could gain access to the latest GPUs from Nvidia and other AI hardware from U.S. companies.

Bloomberg reported last month that G42 reached a secret deal with the U.S. government to divest from China so that the company could continue to have access to Nvidia GPUs.

G42 is a heavy user of Nvidia’s H100 GPUs. Microsoft invested $1.5 billion in G42 around the same time as the secret pact with the U.S. government.