Advanced Computing in the Age of AI | Sunday, May 26, 2024

Users Line Up For VMware Virtual SAN Storage 

VMware was the first mover on virtualization for X86 systems and has benefitted mightily from the wave it started a decade ago to push up the utilization on servers and make systems more resilient and malleable. Storage and networking need to be virtualized, too, and VMware wants to do both. Sometime this week VMware will roll out the production version of its Virtual SAN, or vSAN for short, converged storage, and in doing so it will take on some upstarts in converged systems and also compete against parent company EMC and its rivals who sell real storage area networks based on hardware.

VMware has offered a somewhat limited Virtual Storage Array, or VSA, aimed at small installations for the past two years. This virtual storage array, which converted local disks on multiple hosts into something that looked like a storage area network to the hypervisors and virtual machines running on those machines, was limited to spanning three server nodes and did not pack the I/O punch that enterprise-class customers needed.

With vSAN, which went into beta testing last fall, VMware is pushing the scalability quite a bit harder and has created a virtual SAN that can replace actual SANs in many instances. This is part of a storage pooling effort for its hypervisor that also includes cloud-based storage and on-site NAS and SAN arrays.

That vSAN is more scalable than the VSA software is one reason why there were over 12,000 beta testers for vSAN. That is a very large beta program to manage, and it just goes to show the pent-up demand that VMware is looking to fulfill to generate its next several billions of dollars in sales.

One of those early beta testers that is working feverishly to get vSAN into production by April 1 after extensive testing is Itrica, a software development company and service provider that has written a set of applications for managing clinical trials. The company is eight years old and is headquartered in Boston, Massachusetts, with offices in Bedford, New Hampshire and Columbus, Ohio. The Clinical Trial Management System created by Itrica is used to link researchers at academic and corporate labs that are running trials to the pharmaceutical companies that make the drugs being tested and to the US Food and Drug Administration that monitors such trials and approves drugs for sale.

Itrica operates its service from the SuperNAP datacenter outside of Las Vegas and the NAP of the Americas datacenter in Miami, Florida. (The latter is the flagship datacenter of Verizon's Terremark cloud division.) Itrica has just opened another datacenter in Boston as it expands its business. The company has a variety of physical SANs installed at the moment, with about 1 PB of capacity in each facility. One of the challenges with clinical trials is that Federal regulations have stringent requirements for isolation between test, production, staging, and user environments, and this is one of the reasons why Itrica built a cloud to run its applications. The company chose the KVM hypervisor as its virtualization engine several years ago, but made a decision to move to VMware's ESXi hypervisor and its vCloud cloud controller last year. vSAN was just icing on the cake when VMware announced it last summer.

"We have wanted for a long time to move to converged infrastructure," says David Sampson, CTO at Itrica. "We have had all of this local disk capacity in our host machines that has gone largely unutilized in our environments for years. Using some of the more traditional SAN infrastructure creates some bottlenecks for us with the lack of redundancy and performance challenges. In a multi-tenant environment, we need to be constantly rightsized for the workloads that customers are bringing to us. That means we need the most amount of storage with the highest I/O rates possible. So getting to a converged model is a bit of a holy grail for us. We are serving customers storage from local hosts; with vSAN there is very little overhead from a systems standpoint. And it is redundant across the architecture, so if a single host machine fails, the data is already replicated and from a process standpoint, now with storage we react the same way as when a host machine fails. We fix the machine as time allows because there is no impact to our customers."

Sampson is not thinking of this as some kind of entry-level SAN. "We will push the limits," he says. Like many other customers, Itrica was testing vSAN on an eight-node cluster of servers equipped with flash and disk storage. To get the most I/O operations per second for the online clinical trial applications, the host nodes running applications will be equipped with flash-based SSDs, while clusters dedicated to backup and recovery will be equipped with a mix of disks (for cheaper storage) and flash (for caching).

It is clear why VMware is excited about delivering the Virtual SAN product, but you might be wondering why its parent, EMC, which is the dominant supplier of actual SANs, is eager to see VMware succeed at becoming a server vendor.

The reason is simple: EMC can't stop storage virtualization, and as the rise of Nutanix, Scale Computing, and SimpliVity demonstrates, it cannot stop upstarts from outside of the EMC-VMware collective from taking on EMC's real SAN business. And so, in many situations, VMware's vSAN will be taking business away from the traditional SAN suppliers, including EMC. This is a lot better for EMC, which owns 85 percent of VMware, than losing deals to Nutanix, Scale Computing, and SimpliVity, which sell converged and virtualized server-storage clusters, or losing deals to a host of other virtual storage array appliance makers.

As Pat Gelsinger, VMware's CEO and the former head of Intel's enterprise chip business as well as its first CTO, explains it, VMware has some advantages when it comes to virtualizing storage and managing data.

VMware CEO Pat Gelsinger

"The hypervisor gives us a unique position to think about storage," explained Gelsinger at the launch event for vSAN. The virtual machine container riding atop the ESXi hypervisor, said Gelsinger, allows VMware to peer into applications and see what is going on in there as well as looking down below the hypervisor to see what is going on in the underlying servers, storage, and networking infrastructure and – here is the important point – sit in the I/O path between storage devices and the virtual machine containers. "Nobody else in the stack has the potential to enable storage in this way."

Well, yes and no. Nobody else who is selling virtual storage arrays that run inside of virtual machines can do it in this way, but Microsoft, Citrix Systems, and Red Hat, which control the Hyper-V, Xen, and KVM hypervisors, can – and probably will – get some sort of virtual SAN software running inside of their hypervisors at some point as VMware has done. But once again, VMware has a huge lead over its rivals and, thanks to its ownership by EMC, has a deep pool of storage technology and technical expertise that it can pull from.

The vSAN software is built right into the ESXi hypervisor kernel rather than running in a virtual machine on top of a hypervisor, as solutions from Nutanix, Scale Computing, and SimpliVity do. Technically speaking, the vSAN software is a distributed object store, explained Ben Fathi, CTO for storage at VMware, and you put VMDK file systems right atop the disk and flash drives. The flash drives can be used to boost the performance of an application using the Flash Read Cache feature, or flash can be used to speed up the replication of data across multiple hosts in the cluster of servers that implements the vSAN. Data is replicated across two hosts automatically and can be set up to replicate across three hosts if you want an extra measure of protection against failure. Within a host, if you want more performance out of the local disks and don't want to go with flash, you can stripe data across two or three drives in a system to get parallel access to files.
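To make the replication idea concrete, here is a minimal Python sketch of keeping two or three copies of an object on distinct hosts, and of what that costs in usable capacity. It is not VMware's placement algorithm, which the company did not describe at this level. The function names, the hash-based host selection, and the example host and file names are all assumptions made for illustration.

```python
from __future__ import annotations

import hashlib


def place_replicas(object_id: str, hosts: list[str], copies: int = 2) -> list[str]:
    """Pick `copies` distinct hosts for one object.

    Illustrative only: a simple hash-and-wrap scheme, not VMware's method.
    """
    if copies > len(hosts):
        raise ValueError("need at least as many hosts as copies")
    # Hash the object ID to a stable starting position in the host list.
    digest = hashlib.sha256(object_id.encode()).hexdigest()
    start = int(digest, 16) % len(hosts)
    # Take consecutive hosts from that position, wrapping around,
    # so no two copies of the same object land on the same host.
    return [hosts[(start + i) % len(hosts)] for i in range(copies)]


def usable_capacity_tb(raw_tb: float, copies: int = 2) -> float:
    """Raw capacity divided by the number of full copies kept."""
    return raw_tb / copies


# Hypothetical eight-node cluster, like the test clusters described above.
hosts = [f"esx-{n:02d}" for n in range(8)]
two_way = place_replicas("clinical-db.vmdk", hosts, copies=2)
three_way = place_replicas("clinical-db.vmdk", hosts, copies=3)
```

With two copies, the loss of any single host still leaves one copy of every object intact, at the price of halving usable capacity; a third copy trades more capacity for tolerance of a second failure.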


VMware was expected to deliver the production-grade vSAN software a few months from now, and only spanning a maximum of 16 nodes. But the software will be made available sometime this week and will be able to span up to 32 nodes. The top-end vSAN system will be able to support up to 3,200 virtual machines across those 32 nodes, with up to 4.4 PB of capacity using 4 TB disk drives and delivering up to 2 million I/O operations per second of storage performance.
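Those top-end figures are easier to judge per node. The short Python calculation below divides them evenly across the 32 nodes. The even split and the use of decimal units (1 PB = 1,000 TB) are assumptions, and the drive count per node is only implied by the arithmetic, not stated by VMware.

```python
# Top-end figures quoted for a 32-node vSAN cluster.
NODES = 32
MAX_VMS = 3_200
MAX_CAPACITY_TB = 4_400   # 4.4 PB, assuming decimal units (1 PB = 1,000 TB)
DRIVE_TB = 4
MAX_IOPS = 2_000_000


def per_node(total: float, nodes: int = NODES) -> float:
    """Spread a cluster-wide figure evenly across nodes (an assumption)."""
    return total / nodes


vms_per_node = per_node(MAX_VMS)                  # 100.0 virtual machines
capacity_per_node_tb = per_node(MAX_CAPACITY_TB)  # 137.5 TB
drives_per_node = capacity_per_node_tb / DRIVE_TB # 34.375 drives of 4 TB
iops_per_node = per_node(MAX_IOPS)                # 62500.0 IOPS

print(f"{vms_per_node:.0f} VMs, {capacity_per_node_tb} TB, "
      f"~{drives_per_node:.1f} drives, {iops_per_node:.0f} IOPS per node")
```

On these assumptions, each node carries about 100 virtual machines, 137.5 TB of raw disk (roughly 34 or 35 drives of 4 TB each), and 62,500 I/O operations per second.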

VMware is being cagey about the packaging and pricing for vSAN at the moment, but details will be available later this week when the software comes out. To help speed up adoption, VMware has worked with Cisco Systems, Dell, Fujitsu, IBM, and Supermicro to cook up 13 different server configurations specifically designed for running vSAN for particular workloads, including generic server virtualization, virtual desktops, and others. You can also build your own converged clusters using any items on the VMware hardware compatibility list.

We will track down the packaging and pricing for vSAN when it becomes available and report back.