Advanced Computing in the Age of AI | Monday, April 29, 2024

Oracle Taps Nvidia to Grow Cloud Stack for AI and Data Science 

Oracle is bringing Nvidia’s AI Enterprise software suite alongside thousands of its latest GPUs to its cloud infrastructure, which could fuel the chipmaker’s plans to make billions from subscription services.

The partnership, which builds on earlier deployments, gives Nvidia the kind of infrastructure it needs to pursue its long-term goal of becoming a software powerhouse. It also gives Oracle’s cloud service the plug-and-play hardware capacity and software framework to easily deploy AI software.

Oracle and Nvidia have common expertise in areas that include healthcare, manufacturing, communications and financial services, and there’s a lot of opportunity to collaborate there, said Leo Leung, vice president at Oracle, during a press briefing.

Nvidia CEO Jensen Huang and Oracle CEO Safra A. Catz on stage at Oracle CloudWorld (Oct. 18, 2022)

The companies are “looking at the full stack so not just the GPUs and infrastructure but getting into the software layer, getting into the service layer,” Leung said.

Nvidia is known as a graphics chip company, but it is betting its future on generating more revenue from software and services. The company is looking at a Netflix-style subscription business model and at charging customers when its software and hardware are used to create products.

Nvidia already sells AI services to vertical industries including automotive, healthcare and manufacturing. Nvidia’s software offerings include Clara for healthcare and Drive for autonomous cars, which are built on the CUDA parallel programming framework. The software is packaged into an offering called AI Enterprise.

The AI Enterprise offerings from Nvidia have so far been limited to a handful of virtual machine interfaces on Google Cloud, Microsoft Azure and Amazon Web Services, which have their own AI software offerings that are largely based on open-source tools. But Nvidia has found a full-stack partner in Oracle, which is willing to take on the graphics chip maker’s proprietary software stack for its cloud service.

“We see great opportunity for bringing the rich software to take advantage of the accelerated computing infrastructure and GPUs because it’s the full stack required to deliver what customers need and looking at AI,” said Pat Lee, head of strategic partnerships at Nvidia.

The Oracle Cloud offerings will include Nvidia’s Clara, which is a medical software framework that supports applications such as imaging and robotic surgery.

Oracle customers can currently get clusters of 512 GPUs, and the company is adding tens of thousands of GPUs in capacity, Leung said. The GPUs and AI Enterprise software stack will sit on top of the core Oracle Cloud infrastructure, which includes bare metal compute, storage and networking hardware.

Oracle is working with Nvidia around data flow to take advantage of Nvidia’s RAPIDS accelerator for GPUs to speed up analytics and database transactions. The Oracle cloud infrastructure has an Apache Spark service, which provides the acceleration for data integration, cataloging and asset tracking. RAPIDS fits in seamlessly without any code change, the executives said.
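The article does not say how Oracle's Spark service is wired up, but the "no code change" claim matches how the open-source RAPIDS Accelerator for Apache Spark is normally switched on: through Spark configuration rather than edits to the job itself. The sketch below illustrates that idea under stated assumptions. The helper `build_submit_command`, the job name `etl_job.py` and the jar path are hypothetical placeholders; the configuration keys are those documented for the open-source plugin, and whether Oracle's service exposes them in this form is an assumption.

```python
import shlex

# Configuration keys documented for the open-source RAPIDS Accelerator
# for Apache Spark. Whether Oracle's managed service exposes them this
# way is an assumption made for illustration.
RAPIDS_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
    "spark.executor.resource.gpu.amount": "1",
    "spark.task.resource.gpu.amount": "0.25",
}


def build_submit_command(app_path, plugin_jar=None, extra_conf=None):
    """Hypothetical helper: assemble a spark-submit argument list."""
    cmd = ["spark-submit"]
    if plugin_jar:
        cmd += ["--jars", plugin_jar]
    conf = dict(extra_conf or {})
    for key in sorted(conf):
        cmd += ["--conf", f"{key}={conf[key]}"]
    cmd.append(app_path)
    return cmd


# The same unmodified job script, submitted with and without the plugin.
cpu_cmd = build_submit_command("etl_job.py")
gpu_cmd = build_submit_command(
    "etl_job.py",
    plugin_jar="/path/to/rapids-4-spark.jar",  # placeholder path
    extra_conf=RAPIDS_CONF,
)

print(shlex.join(cpu_cmd))
print(shlex.join(gpu_cmd))
```

The two commands differ only in their flags. The application script is identical in both, which is the sense in which a Spark job can pick up GPU acceleration without being rewritten.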

The Oracle infrastructure provides the ability to support a heterogeneous computing environment, with many storage, networking and memory options to scale out AI and high-performance computing.

“We’re able to provide bare metal instances, and so no virtualization and nothing getting in the way between the customer and what you’re trying to do with the infrastructure,” Leung said.

The announcement was the latest of many about H100 deployments in the cloud. Nvidia has previously said the H100 will appear in cloud services next year.

But the press briefing raised more questions than it answered about Nvidia’s H100 strategy, delays and cloud plans. Nvidia has previously said its H100 servers were in full production, but it dodged many questions during the briefing on licensing, configurations and pricing.

“We did announce that with H100 and (associated) OEM servers that it would include an Nvidia [AI Enterprise] license. But with regards to Oracle Cloud and our cloud offerings, we have a different pricing structure,” Lee said, without providing further details.

Nvidia aims to sell its software services on a subscription model, and the chipmaker declined to comment on whether it will get a cut from GPU instances on the Oracle Cloud.

Nvidia also declined to comment on whether the H100 installations on Oracle Cloud would be compatible with on-premises clouds and third-party cloud services.

“We’re not talking about specifics on a hybrid cloud offering but if they do leverage the power of the Nvidia AI tools, they can use both on premise and cloud resources to leverage the same hardware in terms of hybrid cloud management. That’s something we’re not discussing today,” Lee said.

EnterpriseAI