Advanced Computing in the Age of AI | Tuesday, April 23, 2024

Bitfusion Acquisition Extends VMware Virtualization of GPUs 

VMware has championed the virtues of virtualization of the x86 architecture for more than 20 years, and now the company plans to extend its commitment to virtualizing new architectures used for training AI models. VMware has announced its intended acquisition of Bitfusion, provider of a multi-cloud AI infrastructure “disaggregation platform” for GPUs and FPGAs – i.e., hybrid virtualization of accelerated compute.

VMware, the “digital workspace” company majority owned by Dell Technologies, said it will integrate Bitfusion into its vSphere platform and “deliver a cloud operational model to an emerging part of the data center as well as bridge the gap” between CPU-only and accelerated computing infrastructures. The goal, said Alex Wang, VMware’s VP of strategy and corporate development, is “helping customers to efficiently share GPU resources powering their AI-enabled apps — on-premises and in the cloud.”

The acquisition builds on Bitfusion’s GPU virtualization partnership with VMware announced in March 2018.

In a blog announcing the planned acquisition, Krish Prasad, VMware SVP/GM, Cloud Platform Business Unit, said that because hardware accelerators on-prem are typically deployed bare-metal, this “force(s) poor utilization, poor efficiencies and limit organizations from sharing, abstracting and automating the infrastructure.” The break in the traditional data center architecture brought on by GPUs exacerbates organizational silos and lack of agility, according to VMware. “The root cause is that GPU accelerated servers became siloed, stand-alone assets,” the company said. “GPU servers reduce the agility gained by VMware vSphere, as they are operated in separate IT ‘islands.’”

Bitfusion’s software platform is designed to decouple specific physical resources from the servers they are attached to, said Prasad, enabling virtualization for sharing accelerated compute “among isolated GPU compute workloads — even allowing sharing to happen across the network.”

“For example, the platform can share GPUs in a virtualized infrastructure, as a pool of network-accessible resources, rather than isolated resources per server,” he said. “Additionally, the platform can be extended to support other accelerators like FPGAs and ASICs.” Bitfusion also supports VMware’s “any cloud, any app, any device” strategy, he said, “with its ability to work across AI frameworks, clouds, networks, and formats such as virtual machines and containers.”

The Bitfusion client runs as a userspace application within a VM instance. On a GPU-accelerated server, Bitfusion runs as a software layer that presents the individual physical GPUs as a pooled resource for VMs to consume. Bitfusion allocates GPU resources and attaches them over the network. When the AI job finishes running, Bitfusion releases the shared GPU resources back into the resource pool.
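The allocate, attach and release cycle described above can be illustrated with a small toy model. This is only a sketch of the pooling idea, not Bitfusion's actual API: the `GpuPool` class, its methods, and the server and VM names are all hypothetical, and the network attachment step is reduced to handing out identifiers.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class GpuPool:
    """Toy model of a shared GPU pool (hypothetical, not Bitfusion's API)."""
    free: set = field(default_factory=set)       # GPU ids available to any client
    leased: dict = field(default_factory=dict)   # GPU id -> client holding it

    def allocate(self, client, count):
        # Refuse the request if the pool cannot satisfy it in full.
        if count > len(self.free):
            raise RuntimeError("not enough free GPUs in the pool")
        granted = sorted(self.free)[:count]
        for gpu in granted:
            self.free.remove(gpu)
            self.leased[gpu] = client
        return granted

    def release(self, gpus):
        # Return GPUs to the pool so other clients can use them.
        for gpu in gpus:
            del self.leased[gpu]
            self.free.add(gpu)

    @contextmanager
    def lease(self, client, count):
        # Hold GPUs only for the duration of the job, then give them back,
        # even if the job raises an exception.
        gpus = self.allocate(client, count)
        try:
            yield gpus
        finally:
            self.release(gpus)


# GPUs from two physical servers are presented as one pool.
pool = GpuPool(free={"server-a/gpu0", "server-a/gpu1", "server-b/gpu0"})
with pool.lease("vm-training-job", 2) as gpus:
    print("attached:", gpus)
    print("free during job:", sorted(pool.free))
print("free after job:", sorted(pool.free))
```

The point of the sketch is that a GPU belongs to a client only while its job runs, which is what lets several isolated workloads share hardware that would otherwise sit idle on a single server.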

“Multi-vendor hardware accelerators and the ecosystem around them are key components for delivering modern applications,” said Prasad. “These accelerators can be used regardless of location in the environment – on-premises and/or in the cloud.”

The planned Bitfusion acquisition follows VMware’s announcement last November of its intent to acquire Heptio, which helps organizations deploy Kubernetes, and Bitnami, a provider of application packaging solutions that helps developers deploy open and closed source software.

EnterpriseAI