Using Flash To Scale In As Well As Up And Out
When people talk about scale in the IT industry these days, more often than not they mean one of two things. The first is called scale up, and this is the very old-fashioned way of making a computer get more work done. The second is called scale out, which is a slightly newer technique. And with the advent of flash and in-memory processing as well as all kinds of accelerators, EnterpriseTech would suggest that there is a third category, which has been dubbed scale in.
Scale up means taking a single computer and ramping up its processing, memory, and I/O capacity so it can do more work. We tend to think in terms of processors when we talk about scale, and symmetric multiprocessing (SMP) and non-uniform memory access (NUMA) are the two dominant ways of scaling up a system these days. With scale out, you break the job into pieces and more loosely couple the workloads running on the machines. The resulting system is far less efficient for certain kinds of processing, but for workloads that can be parallelized or distributed, it works just fine.
With scale in, a term coined last summer at the Flash Memory Summit by Ed Doller, chief memory systems architect at Micron Technology, the idea is to get a single box to do more work by getting its compute, storage, and I/O properly aligned. That usually means putting flash storage inside a system to accelerate its performance, and if you do it right, then the number of systems you need to host a particular workload can go way down. You get rid of the extreme scale problem by getting each node to do the proper amount of work and waste as little compute, memory, and I/O as possible. Then you can shard databases or use other load balancing techniques (depending on the workload) to spread a workload across multiple machines.
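To make the sharding step concrete, here is a minimal sketch of hash-based sharding, one common way to spread database keys across a small number of nodes. The function names (shard_for, assign_keys) and the sample keys are hypothetical illustrations, not anything Micron or Fusion-io describes, and a production system would more likely use consistent hashing or a directory service so that adding a node does not reshuffle every key.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    # hashlib gives the same result in every process, unlike the built-in
    # hash(), which is salted per interpreter run for strings.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def assign_keys(keys, num_shards: int) -> dict:
    """Group keys by the shard that would store them."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


if __name__ == "__main__":
    # Assumed example: 1,000 user IDs spread over two consolidated nodes.
    sample_keys = [f"user{i}" for i in range(1000)]
    for shard, members in assign_keys(sample_keys, 2).items():
        print(f"shard {shard}: {len(members)} keys")
```

The fewer nodes there are after consolidation, the more each shard has to absorb, which is why the per-node I/O capacity matters so much in the scale-in approach.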
Over at Fusion-io, the dominant maker of PCI-Express flash memory cards for servers, this scale-in concept is called "consolidation," and with good reason, as Gary Orenstein, senior vice president of products, explains to EnterpriseTech.
"It is not uncommon to have a ten to one ratio when going from a disk-based system to flash-based infrastructure," says Orenstein. "Not only do they cut the costs of having all of the nodes, but all of the associated stuff, such as needing less rack space, less cooling, and so forth further cuts costs."
In one example, an unnamed customer who runs an online service for text-based voting, which is resold to popular television shows and theaters to do online polling of viewers, was running its setup on 32 MySQL server instances for each workload. By ditching disk drives and moving to Fusion-io flash memory cards, this company was able to cut its server node count down to two per workload.
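The arithmetic behind these consolidation claims is simple enough to sketch. The snippet below uses only the figures quoted above (32 nodes cut to two, and Orenstein's ten-to-one rule of thumb); the function names are hypothetical, and the sketch deliberately says nothing about cost, power, or cooling, since the article gives no numbers for those.

```python
import math


def consolidation_ratio(nodes_before: int, nodes_after: int) -> float:
    """Ratio of disk-based nodes to flash-based nodes for one workload."""
    return nodes_before / nodes_after


def nodes_needed(disk_nodes: int, ratio: float) -> int:
    """Flash-based nodes needed at a given ratio, rounded up to whole servers."""
    return math.ceil(disk_nodes / ratio)


if __name__ == "__main__":
    # The text-voting customer: 32 MySQL instances down to 2 per workload.
    print(consolidation_ratio(32, 2))   # 16.0
    # The same 32 nodes at the more typical ten-to-one ratio.
    print(nodes_needed(32, 10))         # 4
```

By this measure the voting service, at 16 to one, did better than the ten-to-one ratio Orenstein describes as not uncommon.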
Facebook and Apple, which are online companies that rely heavily on user experience to keep people engaged with their electronic products, take flash acceleration very seriously. They have been Fusion-io's largest customers from the get-go, as you can see here in the revenue from the company in fiscal years 2009 through 2013:
Facebook and Apple are both also pretty secretive about what they do with Fusion-io storage, Apple much more so than Facebook. Both are using PCI-Express flash from Fusion-io to accelerate the databases behind their applications.
Facebook is a little more open about it, and has five basic server types – for web and chat, database, Hadoop, photo storage, and feeds, search, and ads. The database servers run MySQL, which Facebook knows like the back of its hand. (And Facebook invented the Cassandra NoSQL database because it knew some of MySQL's limitations, too.) Back in 2010, the Facebook MySQL database servers had data compression and could store 1 TB of compressed data on a hard drive. By 2011, the company added some flash to the systems as well as its Flashcache software, which it has open sourced, to keep hot data on the flash capacity. By 2012, the database servers at Facebook got rid of disk drives entirely and stored everything on flash. (This was good news for Fusion-io, as you can see from the revenue stream above.) The machines have two of Intel's Xeon E5-2660 processors and 144 GB of main memory, but they also have 3.2 TB of Fusion-io flash memory plugged into their PCI-Express slots.
The feed servers at Facebook also include flash solid state disks mixed in with regular disks to do indexing and search functions. A rack of machines has 80 Xeon processors, 5.8 TB of memory, 80 TB of disk, and 30 TB of flash, and a couple of weeks' worth of News Feed indexes can be stored on the flash for faster access. The Fusion-io flash cards used in the database servers have lots more I/O capacity than the SSDs, which are limited to a certain extent by the disk controllers in the servers.
And this, says Orenstein, is precisely why Fusion-io didn't create SSDs but rather PCI-Express cards in the first place.
"Flash has always been about performance, but in our most recent customer survey, the number one reason was the consistency of performance," says Orenstein. "It is like the old adage in the stock market, that consistency of performance will trump all other factors over time. Getting rid of jitter is key, and that is why with Fusion-io, one of our guiding principles was to expose flash memory natively on the PCI-Express bus rather than on a controller that was made for disk drives because that native presentation gives us the ability to deliver low latency and consistent performance in the presence of variable workloads."
(This consistency theme keeps coming up in extreme computing. Consistency of performance over low latency is precisely what the designers of the Lucera financial services cloud have said they are trying to deliver, and it is also what IBM believes it can do for high frequency traders who switch from X86 to Power servers to run their trading algorithms, to name just two examples that EnterpriseTech has discussed in recent weeks.)
Fusion-io had nearly 6,000 customers as it exited 2013, and this scale-in computing boost is a key theme among them, including small businesses trying to goose the performance of a single server running a database or a virtual desktop broker. Fusion-io has done some dicing and slicing on storage and cloud market data from Gartner and IDC for enterprises and SMBs and mixed it up with its own data to categorize the market for application acceleration through storage this way:
Fusion-io does not yet own any of these markets because flash is still a relatively new technology, but clearly the company believes it can grow in the enterprise and SMB space and not be so dependent on Facebook and Apple for its revenues. (The launch of ioControl hybrid flash-disk arrays for SMBs as well as the ioVDI cards specifically tuned for virtual desktops are part of this expansion effort.)
The four key workloads that Fusion-io is trying to accelerate are server virtualization, relational databases, big data workloads based on NoSQL data stores, and virtual desktop infrastructure. Server virtualization among Fusion-io's customers is mainly for VMware's ESXi/vSphere combo, with Hyper-V from Microsoft on the rise and a smattering of XenServer from Citrix Systems in the mix. For VDI, VMware's Horizon View is the dominant desktop broker, with some Citrix XenDesktop. The key databases that get accelerated are Microsoft SQL Server, Oracle 11g and 12c, and MySQL; SAP's HANA in-memory database doesn't run on flash, but the majority of HANA appliances use Fusion-io cards for the database logging. Big data is dominated by the Cassandra and MongoDB NoSQL databases. (The Spotify music service runs Cassandra on servers equipped with Fusion-io cards to get zippy and consistent performance; the ObjectRocket MongoDB service at Rackspace Hosting runs on machines with Fusion-io cards as well.)
In that survey of 800 customers that Orenstein referenced above (which is a very representative slice of the Fusion-io customer base), you can get a sense of who the company's current customers are and what they are doing with Fusion-io's flash products. Companies could pick more than one workload to accelerate, so the totals add up to more than 100 percent. A little more than half of those surveyed said they were accelerating SQL Server, with 25 percent goosing Oracle databases and 20 percent doing so for MySQL databases. Server virtualization was being accelerated by about a third of those polled, and big data workloads like Cassandra and MongoDB as well as other analytics workloads were being accelerated by only about 6 percent of customers.