Advanced Computing in the Age of AI | Sunday, September 24, 2023

IBM Unveils Power9-based AI Supercomputer 

AiMOS AI Supercomputer - source: RPI

Designed to push the frontiers of chip and systems performance for AI workloads, an 8-petaflop IBM Power9-based supercomputer has been unveiled in upstate New York. It will be used by IBM data and computer scientists, by academic researchers and by industrial and commercial end users.

Installed at the Rensselaer Polytechnic Institute Center for Computational Innovations (CCI), the system, called AiMOS (Artificial Intelligence Multiprocessing Optimized System), was the most powerful to debut on last month’s Top500 supercomputer ranking. It is listed as the world’s 24th most powerful computer, the most powerful housed at a private university and, according to the Green500 listing, the third most energy efficient. It was built using the same IBM Power Systems technology as the Top500’s Nos. 1 and 2 systems, the US Dept. of Energy’s IBM Summit and Sierra supercomputers, which are based on IBM Power9 CPUs and Nvidia GPUs.

AiMOS is the result of a collaboration between IBM, RPI and two New York state programs, Empire State Development (ESD) and NY CREATES. Named for Rensselaer co-founder Amos Eaton, AiMOS will serve as a test bed for computation, modeling and simulation of hardware “designed to push the boundaries of AI performance,” IBM said.

Prof. Chris Carothers of RPI

The machine will serve as a test bed for the IBM Research AI Hardware Center, which opened earlier this year on the SUNY Polytechnic Institute (SUNY Poly) campus in Albany. It is the third in a series of increasingly powerful IBM supercomputers at RPI's CCI: the first was a 100-teraflop IBM Blue Gene installed in 2007, and the second an IBM Blue Gene/Q petascale system installed six years ago, according to Christopher D. Carothers, director of the CCI and a professor in the institute’s Department of Computer Science. He told us the new system is 12x faster than the Blue Gene/Q.

He said AiMOS comprises 252 compute nodes with a total of 504 IBM Power9 processors and 1,512 Nvidia Volta GPUs. The system has 126 terabytes of system memory, more than 400 terabytes of high-speed, local, solid-state storage and a Mellanox network with 6 TB/second of network bandwidth.
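
The per-node configuration implied by these totals can be worked out with simple division. The short Python sketch below is illustrative only: it assumes the processors, GPUs and memory are spread evenly across all 252 nodes and that a terabyte means 1,000 gigabytes, neither of which the article states. The variable and function names are made up for this example.

```python
# Totals as reported in the article.
NODES = 252
CPUS = 504
GPUS = 1512
SYSTEM_MEMORY_TB = 126


def per_node(total, nodes=NODES):
    """Average amount per node, ASSUMING an even spread across nodes."""
    return total / nodes


cpus_per_node = per_node(CPUS)
gpus_per_node = per_node(GPUS)
# ASSUMPTION: decimal units, 1 TB = 1,000 GB.
memory_per_node_gb = per_node(SYSTEM_MEMORY_TB) * 1000

print(f"CPUs per node:   {cpus_per_node:g}")
print(f"GPUs per node:   {gpus_per_node:g}")
print(f"Memory per node: {memory_per_node_gb:g} GB")
```

Under those assumptions, each node would hold two Power9 processors, six Volta GPUs and about 500 GB of system memory.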

On the software side, Carothers said the new system offers an important advantage resulting from recent IBM M&A activity.

“One of the problems in the past with IBM systems is not all the software would run on them,” he said. “But (AiMOS) is going to run the world's most ubiquitous open source Linux based operating system, Red Hat, which is now owned by IBM. And so when we bring the data-centric architecture together with Red Hat, it's going to enable essentially the widest possible range of AI, machine learning and data analytics applications that are currently available. So essentially, very little open source software will not be able to execute, which is sort of a problem, I'd say with past supercomputer systems.”

The result, he said, is that AiMOS will complete neural network training jobs in minutes or hours that formerly required weeks or months. An anticipated impact of this capability, Carothers said, is to “move away from thinking about what we can do at a single focused ‘hero run,’ but instead think about … the whole ensemble of computations that work together in an integrated, cohesive manner. And this is going to enable even a much higher level of solving problems.” This, he said, directly relates to exploration of new AI, machine learning and accelerator hardware design.

“So we want to really think about what are the algorithms doing? And are there pieces to these AI algorithms that we can really think about putting into hardware? Where AiMOS comes into play is we don't just have to make the hardware, we can begin to simulate it and emulate it on the test bed directly in advance of it actually being fabricated. And oh, by the way, once it's fab, it could be actually installed at our facility and we can then begin to test it on real research as well as other partner workloads that will be executing within the center.”

IBM said corporate members of its AI Hardware Center include Samsung, Applied Materials and Synopsys, as well as public entities such as RPI, SUNY Poly and other members of the SUNY family.

“Computer artificial intelligence, or more appropriately, human augmented intelligence (AI), will help solve pressing problems, from healthcare to security to climate change,” said Dr. John E. Kelly III, IBM EVP. “In order to realize AI's full potential, special-purpose computing hardware is emerging as the next big opportunity. IBM is proud to have built the most powerful and smartest computers in the world today… Our collective goal is to make AI systems 1,000 times more efficient within the next decade.”