
Nvidia Expands Its Certified Server Models, Unveils DGX SuperPod Subscriptions at Computex 2021 

Nvidia is busy this week at the virtual Computex 2021 technology show in Taipei, announcing an expansion of its nascent Nvidia-certified server program, a range of new server models equipped with Nvidia BlueField DPUs, and the coming availability of its Base Command Platform, which will include a subscription option so customers can try out its DGX SuperPods.

Under the expanded certified server program, which was initially unveiled in April at Nvidia’s own GTC21 conference, dozens of new servers are being certified to run the full Nvidia AI Enterprise software suite, giving customers more options for demanding workloads in traditional data centers or in hybrid cloud infrastructures.

Nvidia also announced new partner servers equipped with the company’s latest BlueField-2 data processing units (DPUs), including models from ASUS, Dell Technologies, GIGABYTE, QCT and Supermicro.

The announcements also included news that the Nvidia Base Command Platform, which was unveiled at GTC21 in April and is currently available only to early-access customers, will be offered jointly with NetApp as a premium monthly subscription pairing Nvidia DGX SuperPod AI supercomputers with NetApp data management services.

Manuvir Das of Nvidia

The new products are part of the company’s ongoing effort to democratize AI, Manuvir Das, Nvidia’s head of enterprise computing, said during a May 27 press briefing.

“The work we are doing with the ecosystem is really to get it ready now to fully participate in this coming wave of the democratization of AI, where AI is utilized by every company on the planet rather than just the early adopters,” said Das. “That's really the theme of what we've talked about at Computex.”

That democratization includes taking the software tools, libraries, frameworks and other components Nvidia has built and packaging them into what the company calls the Nvidia AI Enterprise software suite, said Das.

Servers Certified to Run Nvidia AI Enterprise Software

That strategy is behind the company’s news that it is certifying its enterprise AI software suite on the latest wave of servers from partners including Advantech, Altos, ASRock Rack, ASUS, Dell Technologies, GIGABYTE, Hewlett Packard Enterprise, Lenovo, QCT and Supermicro. More than 50 servers are now certified. The program is aimed at helping customers in industries such as healthcare, manufacturing, retail and financial services find the mainstream servers they require, according to the company.

The certified systems include support for running VMware vSphere, Nvidia Omniverse Enterprise for design collaboration and advanced simulation, and Red Hat OpenShift for AI development, as well as Cloudera data engineering and machine learning.

The systems are available across a wide range of price and performance levels and can be configured with various Nvidia hardware, including A100, A40, A30 or A10 Tensor Core GPUs as well as BlueField-2 DPUs or ConnectX-6 adapters.

An earlier group of Nvidia-certified servers was unveiled in April at GTC21.

Nvidia further said it would facilitate expanded access to Arm CPUs in 2022 through partnerships with GIGABYTE and Wiwynn. These companies plan to offer new servers featuring Arm Neoverse-based CPUs as well as Nvidia Ampere architecture GPUs or BlueField DPUs (or both), according to Nvidia. These systems will be submitted for Nvidia certification when they come to market.

New BlueField-2 DPU-Equipped Servers

With this new round of BlueField-2 DPU-equipped servers, Nvidia is expanding the line to give customers more options to find the right systems for their needs. The servers target workloads including software-defined networking, software-defined storage and traditional enterprise applications, which can benefit from the DPU’s ability to accelerate, offload and isolate infrastructure tasks for networking, security and storage, according to Nvidia. The DPU-equipped servers can also benefit systems running VMware vSphere, Windows or hyperconverged infrastructure solutions for AI and machine learning applications, graphics-intensive workloads or traditional business applications.

Nvidia BlueField-2 DPU. Image courtesy: Nvidia

Nvidia’s BlueField DPUs – which essentially function as advanced SmartNICs – are designed to shift infrastructure tasks from the CPU to the DPU, which makes more server CPU cores available to run applications and increases server and data center efficiency, the company states.

The BlueField-2 DPU-accelerated servers are expected this year.

Nvidia Base Command and SuperPod Subscriptions

The idea behind Nvidia’s Base Command Platform and its related DGX SuperPod subscription option is to help companies move their AI projects more quickly from prototype to production.

The Base Command software platform, designed for large-scale, multi-user and multi-team AI development workflows hosted on premises or in the cloud, lets researchers and data scientists work simultaneously on shared accelerated computing resources, according to Nvidia.

The cloud-hosted Base Command Platform will be offered in conjunction with NetApp, including an option to try out a DGX SuperPod on a subscription basis, said Das. The offering also includes NetApp all-flash storage. More information about these options will be released later this week, according to Nvidia.

Nvidia Base Command Platform management screens. Image courtesy: Nvidia

The Base Command Platform works with DGX systems and other Nvidia accelerated computing platforms, such as those offered by the company’s cloud service provider partners. Many of its features were unveiled at GTC21. Base Command Manager manages resources on an on-premises DGX SuperPod, while the Base Command Platform provides a wide range of controls for managing workflows from anywhere and makes the hosted subscription service with NetApp possible.

Das said the upcoming subscriptions mark the first time DGX SuperPods have been offered this way, a move that came in response to customer requests. “All of the gear is hosted by Nvidia in Equinix data centers,” he said. “And customers can come into this environment and rent access to a SuperPod or to a smaller part of the SuperPod, and they can rent it for just months at a time.”

For customers, this new option can provide a simple, easy-to-use AI experience, said Das.

“What we're doing here is we're really lowering the barrier to entry to experience this best of breed system and equipment, and democratizing in that way,” he said. The expectation is that once customers try out the SuperPods, they will buy their own and use them more widely, he added.

Nvidia also announced plans for the Google Cloud Marketplace to add support for the Base Command Platform later this year, giving Google Cloud customers access to the additional services.

“This hybrid AI offering will allow enterprises to write once and run anywhere with flexible access to multiple Nvidia A100 Tensor Core GPUs, speeding AI development for enterprises that leverage on-demand accelerated computing,” Manish Sainani, director of product management for machine learning infrastructure at Google Cloud, said in a statement.

Amazon Web Services (AWS) also plans to integrate its services with the Base Command Platform, enabling Nvidia customers to deploy their workloads from Base Command directly to Amazon SageMaker on GPU cloud instances.
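
Nvidia and AWS have not yet shared technical details of that integration. As a rough, hypothetical sketch of the kind of workflow it implies, the Python snippet below uses the Amazon SageMaker SDK to launch a containerized training job on an A100-based GPU instance; the container image, IAM role and S3 paths are placeholders and are not part of any announced Base Command integration.

```python
# Hypothetical illustration only: launching a containerized training job on an
# A100-based GPU instance with the Amazon SageMaker Python SDK. The image URI,
# IAM role and S3 paths below are placeholders, not part of Nvidia's announced
# Base Command integration.
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()

estimator = Estimator(
    image_uri="<account-id>.dkr.ecr.us-east-1.amazonaws.com/ai-training:latest",  # placeholder container image
    role="arn:aws:iam::<account-id>:role/SageMakerExecutionRole",  # placeholder IAM role
    instance_count=1,
    instance_type="ml.p4d.24xlarge",  # SageMaker instance type backed by Nvidia A100 GPUs
    output_path="s3://<bucket>/training-output",  # placeholder output location
    sagemaker_session=session,
)

# Start training against data staged in S3 (placeholder path); SageMaker runs the
# container's training entry point on the GPU instance and writes artifacts to S3.
estimator.fit({"training": "s3://<bucket>/training-data"})
```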

For now, the Nvidia Base Command Platform with NetApp is available only to early-access customers. Monthly subscription pricing starts at $90,000.

Analysts on Nvidia’s Latest News

So, what do industry analysts think about Nvidia’s Computex announcements?

Karl Freund, analyst

“Nvidia is clearly climbing up the value chain, from chips to systems to software and eventually data centers,” Karl Freund, founder and principal analyst of Cambrian AI Research, told EnterpriseAI. “The announcements will appeal to enterprises that are starting out on their AI journeys, with a pretty vast array of software to develop, manage, and collaborate on AI applications.”

And while starting out on a cloud instance of a DGX SuperPod at $90,000 a month may seem rich, it does provide an easy on-ramp for customers, with no hardware to buy and install and no additional software needed, he said.

“Taking out the hassles will help enterprises get started in AI,” said Freund. “When ready for production, these Base Command clients can buy DGX systems, systems from their server vendor, or deploy on public clouds, all with the same software.”

Another analyst, James Kobielus, senior research director for data communications and management at TDWI, a research, training and data analytics consultancy, said he is impressed by Nvidia’s focus on helping customers productionize the full range of its AI software.

James Kobielus, analyst

“Most noteworthy is the Base Command Platform, which offers cloud-based access for AI development teams to Nvidia's most powerful DGX SuperPod AI supercomputer, along with NetApp's data management suite,” said Kobielus. “Once this offering is available in the Google Cloud marketplace later in the year, I expect that many enterprises will shortlist Nvidia Base Command Platform for their development of machine learning apps to be deployed into hybrid cloud environments and run on various Nvidia-certified systems from Nvidia partners in support of high-performance enterprise apps.”

Bob Sorensen, an analyst with Hyperion Research, told EnterpriseAI that Nvidia’s DPU-equipped servers give HPC server suppliers opportunities to deliver intelligent, targeted compute capabilities right where customers need them.

Bob Sorensen, analyst

“The added benefit is that these devices can help offload data management responsibilities from the CPUs, freeing them up for more CPU-relevant tasks,” said Sorensen. “Indeed, one could argue that DPUs such as these could be the harbinger of a new form of HPC design based on composable computing, which seeks to break down and distribute discrete server functions across specific smart devices scattered throughout a traditional HPC architecture.”

Rob Enderle, analyst

Rob Enderle, principal analyst with Enderle Group, said that Nvidia appears to be setting up to make a significant push into enterprise servers. “Their DPU technology is mind-bending,” said Enderle. “It frees up significant CPU resources, which can then be applied to other projects. That is particularly ideal for cloud solutions where you need a massive amount of flexibility.”

The importance of this technology is notable, he said.

“This is just the beginning of what is expected to be the most significant effort to displace x86 server technology in over a decade,” said Enderle. “This initiative is only the start and coupled with their Arm HPC Developer Kit with Gigabyte, it anticipates an endgame where x86 becomes obsolete.”
