Advanced Computing in the Age of AI | Friday, December 9, 2022

Distributed Systems and the Economic Power of Commoditization 

IDC recently reported that the number of data centers is declining and that, at the same time, the total square footage of raised flooring in data centers is shrinking. The market research firm attributed this to the consolidation of data centers and the renting of server and storage resources from the public cloud.

While no doubt true, a deeper trend underlies this shift: the move to a new computer platform architecture. It goes by different names depending on the operating “stack” running on it: Hyperconverged Infrastructure, Software-Defined Storage, or Scale-Out. Despite the varying nomenclature, what these all have in common is the use of very dense, very cheap servers with direct-attached storage (disk or flash).

Of course, it wasn’t that long ago that the opposite was true: state-of-the-art data centers were full of systems designed to pack as much computing into a single system as possible, supported by sophisticated shared storage built on Storage Area Networks (SAN) or Network Attached Storage (NAS). These large systems were designed to run the largest multi-threaded business applications: ERP, personnel, sales management, supply chain management. This gear was made by Cisco, HP, EMC and NetApp, among others. These systems also used the highest-end Intel processors in gangs of four, and as much memory as could be put into a single server.

Then modern workloads broke the model. Even the largest scale-up computer systems couldn’t handle the needs of Google, Facebook, Apple, eBay, and others. To solve this problem, these companies used distributed computing techniques to develop new application platforms that would “distribute” the storage across many machines and move the processing (compute) to where the data resided. The building block for this new approach was the commodity server with Direct Attached Storage (DAS) drives built right into the server. This approach is now the standard platform for distributed applications like Hadoop, Spark, Cassandra, OpenStack, etc.

This platform also happens to be the target compute and storage platform used in the public cloud by Amazon Web Services, Google Cloud Platform and Microsoft Azure, among others. Disk-heavy commodity servers, for instance, are the basis for AWS’s S3 storage and they provide NAS services to commodity compute nodes in AWS’s EC2.

Since storage and compute are now scalable, bounded only by the size of the physical data center, there is no need to maximize the compute, memory or storage in a single server or small cluster of servers. Instead, the user wants the lowest cost per core in their processors, the lowest cost per gigabyte in their DRAM, and the lowest cost per terabyte in their storage.

At the current state of the art, this is a server with dual eight- or 12-core Xeon E5 processors, single layer DIMMs yielding 64 or 128 gigabytes of DRAM per system, and twelve 6-terabyte 3.5-inch disk drives. Of course, such a system is far from the most powerful server or the largest storage capacity available. But when you string 500 or 1,000 or 10,000 of them together running Hadoop or Cassandra or S3, you get massive storage and processing capacity at the lowest possible cost.
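To make that arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The per-node figures (dual 12-core processors, 128 gigabytes of DRAM, twelve 6-terabyte drives) come from the configuration described above; the three-way replication factor is an assumption added for illustration (it is the default in HDFS), and the function and constant names are hypothetical. Decimal units are used throughout (1 PB = 1,000 TB).

```python
# Back-of-the-envelope sizing for a cluster of commodity servers.
# Per-node figures are the upper configuration quoted in the article.
CORES_PER_NODE = 2 * 12      # dual 12-core processors
DRAM_GB_PER_NODE = 128       # gigabytes of DRAM per server
DRIVES_PER_NODE = 12         # 3.5-inch disk drives per server
TB_PER_DRIVE = 6             # terabytes per drive

# ASSUMPTION (not from the article): data is stored three times,
# as with the default HDFS replication factor.
REPLICATION = 3


def cluster_totals(nodes, replication=REPLICATION):
    """Return aggregate cores, DRAM and disk capacity for `nodes` servers."""
    raw_tb = nodes * DRIVES_PER_NODE * TB_PER_DRIVE
    return {
        "nodes": nodes,
        "cores": nodes * CORES_PER_NODE,
        "dram_tb": nodes * DRAM_GB_PER_NODE / 1000,
        "raw_pb": raw_tb / 1000,
        "usable_pb": raw_tb / replication / 1000,
    }


if __name__ == "__main__":
    for n in (500, 1000, 10_000):
        t = cluster_totals(n)
        print(
            f"{t['nodes']:>6} nodes: {t['cores']:>7} cores, "
            f"{t['dram_tb']:>7.1f} TB DRAM, {t['raw_pb']:>6.1f} PB raw disk, "
            f"{t['usable_pb']:>6.1f} PB after {REPLICATION}x replication"
        )
```

Under these assumptions, no single node is remarkable (24 cores and 72 terabytes of raw disk), but the totals grow linearly with node count, which is the point of the scale-out approach.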

In recent years, these servers have become a commodity offered by every computer systems company, and they are virtually interchangeable. Not only that, they have become the lowest-margin products in the vendors’ product lines, which improves their cost/performance even more, to the benefit of customers.

Seeking to squeeze the cost structure even more, AWS, Azure, Google, Facebook and other hyperscale users of this technology have cut computer system houses like Dell and HPE out of the supply chain by buying directly from contract manufacturers such as Foxconn, Quanta and Huawei. After all, the design of these systems is standardized.

This has created a temporary cost gradient between 1) running apps on VMs in-house on traditional scale-up systems from HPE or Dell with storage from EMC or NetApp, and 2) running them on VMs at AWS on low-cost commodity servers. As you might expect, we’re seeing a huge move to the public cloud as a result.

To date it has been complex, and therefore difficult and expensive, to deploy a large scale-out environment like AWS’s or Google’s, though of course they have the scale and resources to pull it off. Using this technology on a smaller scale was prohibitively complex until just the past few years, when new technologies made it affordable and practicable for enterprise IT shops.

As commodity distributed compute and storage technology becomes easier to deploy and use, enterprises will gain access to it, and we will see the cost gradient between public and private cloud deployments equalize. Once that happens, the tremendous growth of the public cloud will moderate and we’ll see a concomitant resurgence in enterprise-based, on-premises IT.

This is a phase change not only in cost/performance but also in compute and storage density, because these commodity servers and drives pack more compute and storage into a much smaller space than traditional architectures. Thus there will be a reduction in total data center floor space, as highlighted by IDC. Once this transition is complete, we will see on-premises growth once again.

This transition will put tremendous pressure on the systems houses and the public cloud service providers as the economics push first one way and then the other, and margins are squeezed. The winners will be the users of computing and storage resources, for whom this disruptive phase change will dramatically lower the cost of compute and storage.

Gene Banman is CEO of DriveScale.
