Advanced Computing in the Age of AI | Saturday, September 23, 2023

Cray+Azure: Can Cloud Propel Supercomputing? 

Cray, which has been seeking new points of entry for its supercomputing technology into the cloud enterprise arena, has struck a partnership with Microsoft and its enormous Azure customer base, a deal that Cray believes has the potential to unlock broader growth for the company. The pair announced on Oct. 23 that they are offering dedicated Cray XC and CS-Storm supercomputers inside the Azure platform, allowing customers to run their HPC and AI applications alongside their other cloud workloads.

It’s “cloud without compromise,” said Barry Bolding, Cray’s senior vice president and chief strategy officer. “We’ve seen a bifurcation in the marketplace. There are users who need the flexibility and elasticity of the cloud in order to meet some of their computational needs and data needs. They need those large repositories that the cloud can offer but there are a large class of users and applications for which you need a dedicated, tightly coupled architecture that doesn’t compromise on the architecture in order to get the best performance which gets you the best TCO for the application.”

The Cray supercomputers integrate with the Azure services used to execute HPC workloads, such as Azure virtual machines, data lake storage, the Microsoft AI platform and Azure machine learning services, according to Microsoft, with the intent of promoting rich workflows and collaboration. Customers can also employ CycleCloud for hybrid HPC management.

The arrangement broadens Cray’s market vistas by making supercomputing accessible to customers who don’t have their own datacenter and have been running HPC, AI, and advanced analytics workloads in the public cloud. The cloud has long been an on-ramp to HPC for users that were too small or otherwise not in a position to own or manage an on-prem system. Now there is a generation of users who, over the past five to 10 years, came of age in the cloud and grew their requirements for scalable performance, so that the public cloud is positioned as an on-ramp to dedicated supercomputing. This thesis is at the heart of the Microsoft-Cray partnership, and the companies say it’s borne out by their market studies.

“Utilization rates have been growing on the Azure side to the point where customers need a dedicated resource and with this partnership, [Microsoft] can point their customers to Cray and we can do this all within the Azure datacenter for them,” said Bolding.

While there is some interest in this offering from the public sector, most is coming from the private sector, where companies don’t necessarily have the floor space and expertise to operate supercomputers themselves. “Those are a lot of commercial customers. We’ve seen them in the geospatial arena, we’ve seen them in the automotive space, we’ve seen them in financial services, in the life sciences. These are all segments that have been having these discussions with Microsoft,” Bolding noted.

Cray has announced some other cloudy plays in the recent past, notably with Deloitte and the Markley Group. The company says those partnerships are driving revenue, but Microsoft with its enormous Azure customer base has the potential to unlock far broader growth for Cray.

“They have a set of services that complements us,” said Bolding. “They’ve been doing HPC they just haven’t necessarily been doing the type of HPC we do, the supercomputing. They are the public cloud provider that has the most experience building services around HPC, which is why we chose to work with them, so I am really optimistic that we’ll be able to drive a whole new level of growth over what some of our other engagements are doing.”

Azure’s investment in high-performance computing and “big compute” includes its 2012 rollout of InfiniBand-backed instances, its purchase of GreenButton in 2014, its commitment to offering high-end GPUs and its recent acquisition of Cycle Computing.

Cray says it’s a pretty easy push to start delivering to customers. Of course, since these are build-to-order systems, the usual three-to-four-month fulfillment cycles apply. It’s not on-demand, project-based or pay-by-the-sip cloud, at least not yet.

“We want to focus on the best engagement model for both of us and that’s where we want to start,” said Bolding. “We want to provide Cray systems without compromise of performance. To do that at scale, we don’t want to go the full virtualization route today because that does limit some of the scalability of the storage and the compute, so I think we will be working with Microsoft to determine how we could do multi-tenant type projects with them, but we want to start with this in order to get the focus and really make sure we are providing the best experience to the customers, day one.”

The Cray XC with the Aries interconnect and the CS-Storm are the first two systems that are being offered under this partnership, along with the accompanying Cray ClusterStor storage systems. Customers can also leverage the Cray Urika-XC analytics software suite. The Cray systems will be available for customer-specific provisioning in select Microsoft Azure datacenters. Cray will contract directly with customers to provide support and maintenance.

About the author: Tiffany Trader

With over a decade’s experience covering the HPC space, Tiffany Trader is one of the preeminent voices reporting on advanced scale computing today.