Advanced Computing in the Age of AI | Tuesday, September 26, 2023

VMware Unveils New Generative AI Tools, Expands Nvidia Partnership 

VMware kicked off its Explore event in Las Vegas with a series of announcements geared toward enabling enterprise generative AI development.

VMware and Nvidia extended their partnership to unveil VMware Private AI Foundation with Nvidia, an offering that promises to provide enterprises with the software and compute to fine-tune large language models and run AI-enabled applications using proprietary data in VMware’s cloud infrastructure.

Building applications using public AI models is a no-go for many enterprises due to the risks of data exposure and unknown training data. In fact, a new survey released by AI engineering platform Predibase found that more than 75% of enterprises do not plan to use commercial LLMs in production due to data privacy concerns.

The answer lies in custom models trained with company data using a secure architecture. Companies need flexibility when developing applications using their own training data, and VMware touts its multi-cloud approach as a secure and resilient option for building customized AI models.

VMware Private AI Foundation with Nvidia is a set of integrated AI tools that allow enterprises to deploy AI models trained on private data in datacenters, public clouds, or the edge. VMware’s Private AI architecture is built on VMware’s Cloud Foundation and is integrated with Nvidia’s AI Enterprise software and compute infrastructure.

VMware CEO Raghu Raghuram says the potential of generative AI cannot be unlocked unless enterprises can maintain the privacy of their data and minimize IP risk while training, customizing, and serving their AI models: “With VMware Private AI, we are empowering our customers to tap into their trusted data so they can build and run AI models quickly and more securely in their multi-cloud environment.”

Nvidia CEO Jensen Huang and VMware CEO Raghu Raghuram announced the expanded partnership at VMware Explore.

Enterprises can choose where to build and run their models using a data-secure architecture. VMware and Nvidia claim AI workloads can scale across up to 16 GPUs in a single virtual machine and across multiple nodes, leading to lower overall costs and greater efficiency. Additionally, VMware says its vSAN Express Storage Architecture will provide performance-optimized NVMe storage and support GPUDirect Storage over RDMA, allowing direct I/O transfer from storage to GPUs without CPU involvement.
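
As a rough illustration of the 16-GPU-per-VM ceiling quoted above, the short Python sketch below estimates the minimum number of virtual machines a job would need for a given GPU count. The function name and its parameters are hypothetical and not part of any VMware or Nvidia tooling, and the sketch assumes every VM can actually be assigned the stated number of GPUs.

```python
import math

# Ceiling quoted by VMware and Nvidia in the article; everything else here
# is an illustrative assumption, not a vendor API.
MAX_GPUS_PER_VM = 16


def min_vms_needed(total_gpus: int, gpus_per_vm: int = MAX_GPUS_PER_VM) -> int:
    """Return the fewest VMs that can hold `total_gpus` GPUs."""
    if total_gpus < 0:
        raise ValueError("total_gpus must be non-negative")
    if not 1 <= gpus_per_vm <= MAX_GPUS_PER_VM:
        raise ValueError(f"gpus_per_vm must be between 1 and {MAX_GPUS_PER_VM}")
    # Round up: a partially filled VM still counts as a whole VM.
    return math.ceil(total_gpus / gpus_per_vm)
```

Under these assumptions, a 40-GPU fine-tuning job would need at least three fully populated VMs, or five if each VM were limited to eight GPUs.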

The new platform with VMware will feature Nvidia NeMo, the company’s AI framework (included in Nvidia AI Enterprise, the operating system of its AI platform) that combines customization frameworks, guardrail toolkits, data curation tools, and pretrained models. NeMo uses TensorRT for Large Language Models, software that optimizes inference performance on Nvidia GPUs. VMware and Nvidia say enterprises can use the new Nvidia AI Workbench to pull community models, like Llama 2, available on Hugging Face, customize them remotely, and deploy production-grade generative AI in VMware environments.

“Enterprises everywhere are racing to integrate generative AI into their businesses,” said Jensen Huang, founder and CEO of Nvidia. “Our expanded collaboration with VMware will offer hundreds of thousands of customers – across financial services, healthcare, manufacturing and more – the full-stack software and computing they need to unlock the potential of generative AI using custom applications built with their own data.”

Nvidia is not the only AI development game in town: many enterprises are turning to open source solutions because they want the freedom to combine multiple open source tools and frameworks. For these open source AI projects, VMware also unveiled VMware Private AI Reference Architecture for Open Source, which integrates OSS technologies from VMware partners to deliver an open reference architecture for building and serving OSS models on top of VMware Cloud Foundation.

One such technology partnership is with Anyscale, developer of the widely adopted open source unified compute framework Ray. Data scientists and ML engineers can scale AI and Python workloads using Ray on VMware’s Cloud Foundation, using their current compute footprints for ML workloads instead of defaulting to the public cloud, VMware says.
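
To give a sense of what scaling a Python workload with Ray looks like, here is a minimal sketch using Ray's core task API (`ray.init`, `ray.remote`, `ray.get`). The token-counting task and both function names are hypothetical stand-ins for a real workload, the sketch assumes Ray is installed, and nothing in it is specific to VMware Cloud Foundation.

```python
from typing import List


def tokenize_length(text: str) -> int:
    """Toy CPU-bound task: count whitespace-separated tokens in one document."""
    return len(text.split())


def count_tokens_with_ray(documents: List[str]) -> List[int]:
    """Fan the toy task out as Ray tasks and gather the results in order."""
    import ray  # imported lazily; assumes `pip install ray`

    # With no arguments, ray.init() typically starts a local Ray instance;
    # pointing it at an existing cluster is a deployment detail left out here.
    ray.init(ignore_reinit_error=True)

    # Wrapping the plain function is equivalent to decorating it with @ray.remote.
    remote_task = ray.remote(tokenize_length)

    # Each .remote() call schedules one task and immediately returns a future.
    futures = [remote_task.remote(doc) for doc in documents]

    # ray.get blocks until all tasks finish and preserves the input order.
    return ray.get(futures)
```

Calling `count_tokens_with_ray` with a list of documents returns one count per document. The task function itself contains no Ray-specific code, which is part of the "run anywhere" appeal: the same function could be scheduled on a laptop or on a multi-node cluster.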

Anyscale CEO Robert Nishihara commented in a release that companies are struggling to stay at the forefront of AI while scaling, productizing, and iterating quickly.

“Because Ray can run anywhere – on any cloud provider, on-premises, on your laptop – and VMware’s customers run everywhere, it’s a natural collaboration to make it easier for companies to accelerate their business using generative AI,” he said.

“AI has traditionally been built and designed by data scientists, for data scientists,” said Chris Wolf, vice president of VMware AI Labs. “With the introduction of these new VMware Private AI offerings, VMware is making the future of AI serve everyone in the enterprise by bringing the choice of compute and AI models closer to the data. Our Private AI approach benefits enterprise use cases ranging from software development and marketing content generation to customer service tasks and pulling insights from legal documents.”

In addition to the new Private AI offerings, VMware also announced Intelligent Assist, a family of generative AI-based solutions trained on VMware data that will automate aspects of enterprise IT in multi-cloud environments. Intelligent Assist will be integrated into several VMware products. In VMware Tanzu, it will address the challenges of multi-cloud visibility and configuration by allowing users to conversationally request and refine changes to their enterprise’s cloud infrastructure, the company says. In Workspace ONE, it will allow users to create high-quality scripts using natural language prompts. NSX+ will also gain generative AI capabilities that help security analysts determine the relevance of security alerts and remediate threats more effectively.