Advanced Computing in the Age of AI | Friday, March 29, 2024

Nvidia, Oracle Expand Cloud GPU Ties for AI, HPC 

Oracle is collaborating with Nvidia to bring the GPU leader’s unified AI and HPC platform to the public cloud for accelerating analytics and machine learning workloads.

The move makes Oracle the first public cloud vendor to support Nvidia’s HGX-2 platform, the partners said this week. The Oracle cloud will support both bare-metal and virtual machine instances of the HGX-2, which is designed for “multi-precision computing.”

That’s a reference to the ability to throttle the GPU for either precision or speed depending on the application. For example, the GPU accelerator offers greater precision for HPC workloads and speed for AI jobs. The Nvidia GPU also implements NVSwitch interconnect technology designed to reduce the data bottlenecks often encountered with HPC workloads such as simulations.
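As a rough illustration of the precision side of that trade-off, the Python sketch below stores the same value in 16-, 32- and 64-bit IEEE 754 formats and measures the rounding error of each. It runs on the CPU using only the standard library and does not touch the GPU or any Nvidia API; the helper names (round_trip, precision_report) are hypothetical and chosen for illustration.

```python
import struct

# IEEE 754 formats as understood by the standard-library struct module.
# The names FP16/FP32/FP64 are labels chosen for this sketch.
FORMATS = {"FP16": "<e", "FP32": "<f", "FP64": "<d"}


def round_trip(value: float, fmt: str) -> float:
    """Store value in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def precision_report(value: float) -> dict:
    """Return the absolute rounding error of value in each format."""
    return {
        name: abs(round_trip(value, fmt) - value)
        for name, fmt in FORMATS.items()
    }


if __name__ == "__main__":
    for name, err in precision_report(0.1).items():
        size = struct.calcsize(FORMATS[name])
        print(f"{name} ({size} bytes): absolute error {err:.3e}")
```

Narrower formats take less memory and bandwidth per value, which is broadly why they suit speed-sensitive AI jobs, while wider formats keep the rounding error small for simulations.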

The partners said Wednesday (Oct. 10) their collaboration addresses growing demand for cloud-based GPU acceleration in data analytics as well as deep and machine learning applications. “Apart from enabling HPC and AI workloads, we’re targeting data science and analytics as a major area of investment,” Karan Batta, an Oracle product manager, noted in a blog post.

The expanded collaboration with Nvidia (NASDAQ: NVDA) for cloud-based GPU acceleration underscores Oracle’s strategy of moving from a database vendor to a player in the cutthroat public cloud market. Much of the emphasis has been on new platform and infrastructure services along with automated capabilities such as a “self-driving” cloud database service unveiled in August. Batta noted that these and other services target data science teams, enabling them to work collaboratively on big data projects.

Oracle (NYSE: ORCL) previously supported cloud instances based on Nvidia’s Pascal GPU architecture and, more recently, offered bare-metal instances of Nvidia’s Tesla V100 Tensor Core GPUs. Those chips were used to accelerate deep learning workloads.

Along with support for the HGX-2 platform, Oracle said its new cloud instances also would be backed by up to 48 cores of Intel Xeon processors. The combined instances will be available in early 2019, the cloud vendor said.

The collaboration is the latest effort by Oracle to differentiate its cloud offerings from those of public cloud leader Amazon Web Services (NASDAQ: AMZN) as well as Microsoft Azure (NASDAQ: MSFT) and Google Cloud (NASDAQ: GOOGL). The partnership also reflects Nvidia’s push into high-end data analytics via its just-announced RAPIDS platform, which Oracle is also supporting.

RAPIDS is a suite of software libraries for running data science and analytics pipelines entirely on GPUs. The open source libraries are intended to leverage GPU parallel processing and high-bandwidth memory via the Python programming language. Among the uses is accelerating machine learning training while improving model accuracy.

RAPIDS will be available this week on Oracle’s cloud infrastructure via Nvidia’s GPU cloud service.

Oracle also announced support for Nvidia’s GPU cloud container registry that allows cloud users to deploy container applications and frameworks on Oracle’s GPU-accelerated cloud instances. Those instances are available in multiple regions in the U.S. and Europe.

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).

EnterpriseAI