Advanced Computing in the Age of AI | Tuesday, October 3, 2023

OpenStack Havana Makes Clouds Scale Globally 

The OpenStack community that is developing the open source cloud controller of the same name has rolled up the "Havana" release of the tool and made it available for others to commercialize or use as they see fit. The update has a number of scalability enhancements that will allow companies to build and manage very large clouds, often on a global scale.

The more important scalability improvement with the Havana release is for storage. The "Swift" object storage controller, which has been part of OpenStack from its inception in July 2010 and is formally known as OpenStack Object Storage, now has replication services that allow it to replicate files across multiple, geographically distributed data centers for either high availability or scalability.

Jonathan Bryce, executive director at the OpenStack Foundation and one of the key people at Rackspace Hosting who helped launch OpenStack along with NASA, tells EnterpriseTech that the replication services for Swift object storage that were available with the "Grizzly" release from April have been extended.

Swift is what is called an "eventually consistent" object storage system, which means that it is not appropriate for transaction processing but is well suited to storing all kinds of files that are part of applications. With the Grizzly release, the replication software inside of Swift worked offline and would eventually get two different Swift storage clusters in sync. With the Havana updates to Swift, you can designate distinct physical storage clusters running in different geographical regions and connected by wide area networks as a single Swift cluster. In the event that one data center or one region gets knocked out, applications will be able to go through the wide area network to retrieve objects from another.

"This is something that the community has been working towards for the past year and a half," explains Bryce. "All of the components and the replication services have come together so you can merge those geographically dispersed environments into what looks and acts like a single Swift storage cluster."

Hewlett-Packard, IBM, Intel, Rackspace, Red Hat, and SwiftStack (which is providing commercial support for Swift as a standalone product) all contributed to the Object Storage enhancements over the past year and a half. Swift 1.8 allowed for object storage availability zones to be grouped into regions and for more flexible replication to provide better access to data across those regions. With Swift 1.9, the replication could be done on its own network, and code was added to control the read and write affinity to regional Swift clusters so customers would read from and write to the data centers closest to them and thereby avoid longer latencies. With Swift 1.10, which is included in the Havana release, the regional Swift clusters look like one giant cluster to applications and users, masking the replication and the geographical distribution of those clusters.
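To make the read affinity idea concrete, here is a minimal sketch in Python. It is a toy model, not Swift's code: the `Replica` class, the `sort_by_read_affinity` function, the region names, and the priority numbers are all assumptions made up for illustration. It only shows the general principle that a proxy in one region can try nearby copies of an object before reaching across the wide area network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical record of where one copy of an object lives."""
    node: str
    region: str


def sort_by_read_affinity(replicas, region_priority, default_priority=1000):
    """Order replicas so that regions with a lower priority number come first.

    Regions missing from region_priority fall back to default_priority.
    sorted() is stable, so replicas in the same region keep their order.
    """
    return sorted(
        replicas,
        key=lambda r: region_priority.get(r.region, default_priority),
    )


# Three copies of one object, spread over three made-up regions.
replicas = [
    Replica("node-a", "us-east"),
    Replica("node-b", "eu-west"),
    Replica("node-c", "ap-south"),
]

# Assumed preferences for a proxy sitting in eu-west:
# local copies first, then us-east, then anything else.
priority = {"eu-west": 100, "us-east": 200}

ordered = sort_by_read_affinity(replicas, priority)
print([r.node for r in ordered])  # ['node-b', 'node-a', 'node-c']
```

If the eu-west copy is unreachable, the caller simply moves on to the next entry in the list, which is the failover behavior described above in miniature.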


On the compute front, OpenStack continues to scale well in production environments. Remember that one of the reasons why NASA and Rackspace decided to do OpenStack in the first place was because the scalability of the open source Eucalyptus cloud controller being used by NASA Ames for its Nebula cloud was somewhat limited. Rackspace had its own cloud controller, but knew that if it wanted to grow its public cloud business, it would need something better, too. The design goals for OpenStack were lofty: It would need to eventually scale to over 1 million physical servers and over 60 million virtual machines.

Mark Collier, chief operating officer at the OpenStack Foundation and also a former Racker, says that OpenStack is not at those limits yet, and that the issue is not only the technology but also the difficulty of testing at that scale.

"We don't have hundreds of thousands of servers to test on," says Collier. "But we do know from talking to people in the real world that it scales well. BlueHost, a hosting provider, is running OpenStack across 20,000 physical machines. And we see examples where tens of thousands of servers are being run in the real world under OpenStack. I talked to another user yesterday in China that is running OpenStack across twenty different data centers. The big numbers are always exciting, but we sort of know what the limits are as we reach them and then we move forward to architect around any bottlenecks. Our user base continues to grow and we continue to see clouds with many thousands of servers being built."

This scalability is enabled through a feature of OpenStack's "Nova" compute controller called Cells, which is an abstraction layer that allows a collection of OpenStack clouds to be managed as a single cloud. Rackspace was the first major cloud provider to push OpenStack to its scalability limits, starting with the "Essex" release from April 2012, which Rackspace put into production in August 2012, and continuing with the "Folsom" release in September 2012 and the Grizzly release in April 2013. And Bryce says that a lot of the work on Cells was done by Rackspace as well as by others.

The Nova compute controller portion of OpenStack has also been updated to include the Docker variant of Linux containers as a "hypervisor" option. (We put that in quotes for a reason that will be obvious in a second.) Linux containers, just like Solaris containers or zones before them, are a lighter-weight form of server virtualization than a traditional hypervisor and its virtual machines. With a hypervisor, you typically run it on bare metal and it creates full virtual machines that can in turn run full instances of operating systems. With a container, the idea is that you already plan to run the same operating system in all of the virtual machines, so instead of loading up the heavy hypervisor, you run virtual machine sandboxes on top of the operating system kernel and you have them share one copy of the file system. The containers look and smell and act like virtual machines as far as application software is concerned, but they are not running on a hypervisor but on a Linux operating system. (Hence the quotes.)

As you can see from the Havana release notes, two modules that were incubating during the Grizzly cycle have made it to production grade.

The first is the "Heat" template-based orchestration engine, which is now called OpenStack Orchestration. Heat is written in Python, and it is used to create templates that describe an entire software stack and how the different pieces of the stack are interrelated. If you modify a template, you update the deployment, thus getting admins out of the job of moving bits of software around to do patches and upgrades. Heat is used to manage the infrastructure portion of the stack and integrates with open source tools such as Puppet and Chef to manage the applications running on top of that infrastructure. Heat has a command line interface that can execute the APIs of CloudFormation, which is the analogous infrastructure templating system on Amazon Web Services.
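The notion of a template that records how the pieces of a stack relate to one another can be sketched in a few lines of Python. This is not Heat's template format or its engine; the resource names and the `creation_order` function are hypothetical, and the sketch relies on the standard library's `graphlib` module (Python 3.9 or later). It shows only why declaring dependencies is useful: a tool can work out a safe order in which to build things.

```python
from graphlib import TopologicalSorter

# Hypothetical stack description: each resource maps to the list of
# resources that must exist before it can be created.
template = {
    "network": [],
    "database": ["network"],
    "app_server": ["network", "database"],
    "load_balancer": ["app_server"],
}


def creation_order(template):
    """Return the resources in an order that respects every dependency."""
    return list(TopologicalSorter(template).static_order())


order = creation_order(template)
print(order)  # ['network', 'database', 'app_server', 'load_balancer']
```

Editing the dictionary, for example by adding a cache that the app server depends on, changes the computed order without anyone having to rewrite a deployment script by hand.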

The second new module is called "Ceilometer," and it collects all manner of usage data on an OpenStack cluster so it can be pumped into monitoring and metering systems. The idea is to have one consistent data collection program that can in turn hook into monitoring tools and billing systems for chargeback. This feature is now known as OpenStack Metering.
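As a rough illustration of what one consistent collection point buys you, the sketch below totals usage samples per tenant and then prices them for chargeback. Nothing here is Ceilometer's data model or API: the sample tuples, meter names, rates, and the `aggregate` and `bill` functions are invented for the example.

```python
from collections import defaultdict

# Hypothetical usage samples: (tenant, meter, amount).
samples = [
    ("tenant-a", "instance_hours", 3.0),
    ("tenant-b", "instance_hours", 2.0),
    ("tenant-a", "storage_gb", 10.0),
    ("tenant-a", "instance_hours", 1.5),
]

# Assumed price per unit for each meter.
rates = {"instance_hours": 0.5, "storage_gb": 0.25}


def aggregate(samples):
    """Sum the samples into a total per (tenant, meter) pair."""
    totals = defaultdict(float)
    for tenant, meter, amount in samples:
        totals[(tenant, meter)] += amount
    return dict(totals)


def bill(totals, rates):
    """Turn per-meter totals into one charge per tenant."""
    charges = defaultdict(float)
    for (tenant, meter), amount in totals.items():
        charges[tenant] += amount * rates[meter]
    return dict(charges)


totals = aggregate(samples)
charges = bill(totals, rates)
print(charges)  # {'tenant-a': 4.75, 'tenant-b': 1.0}
```

The same `totals` dictionary could just as easily feed a monitoring dashboard as a billing run, which is the point of collecting the data once.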

More than 910 contributors from 145 different organizations contributed code to the Havana release. It includes 392 new features and more than 20,000 patches since Grizzly was launched in April. Interestingly, more enterprise users who are not trying to directly sell OpenStack support or services are contributing to the project, with CERN, Comcast, Intel IT, NeCTAR, PayPal, Shutterstock, and Workday all singled out as big contributors this time around. The top contributors in the Havana release are Canonical, Dreamhost, eNovance, HP, IBM, Intel, Mirantis, the OpenStack Foundation, Rackspace, Red Hat, SUSE Linux, and Yahoo.

The OpenStack Foundation will host its next development summit in early November in Hong Kong, where it will hammer out the features for the "Icehouse" release due in April 2014. Development is expected to focus on speeding up and automating deployment of OpenStack, bare metal provisioning of servers for workloads on OpenStack clusters that are allergic to virtualization, Hadoop integration, and improved messaging. We will look into these efforts after the summit is over and the plans are made.