Advanced Computing in the Age of AI | Wednesday, July 24, 2024

NetApp Says Flash Is All About Predictable Performance 

Disk and flash array maker NetApp showed off a refresh of its E-Series machines at the SC13 supercomputing conference last week. The E-Series are not the biggest, baddest arrays that NetApp sells, but they pack plenty of performance, particularly after the new processor and flash upgrades. More importantly, says the company, the boxes have predictable performance on a lot of different workloads.

Given this predictability, plenty of enterprise shops use the E-Series arrays as their primary storage, and in a number of cases companies are loading up the Lustre file system on top of E-Series arrays rather than on servers with lots of disk drives. This stands to reason, of course, because the difference between a compute server and a storage server is not always obvious, even after you squint a little. It all comes down to tuning the underlying server to run the storage software for very specific workloads.

The box that everyone was talking about is the EF550 flash array, which is the second generation of flash arrays from NetApp. Dave Mooney, vice president of E-Series sales at NetApp, came over to the company when it bought the Engenio disk array business from LSI Logic two years ago for $480 million. Engenio and NetApp and their reseller partners have shipped over 650,000 arrays into the field over the years.

The prior generation EF540 flash array has been out for a little more than a year, and the company has seen good uptake for it. The machine is particularly popular among customers who want to accelerate databases, says Mooney. Prior to using flash, such customers have had to overprovision their database storage (in terms of raw capacity) because they need 300 or 400 disk drive arms flinging data to get the performance levels their applications require.

The NetApp flash arrays are also popular as burst buffers between systems and slower disk-based arrays as well as for metadata servers for storage clusters and other kinds of clusters. NetApp is also seeing an uptick in the use of the E-Series flash boxes for virtual desktop infrastructure (VDI) workloads, where disks are too slow to give decent response time over the network and too fat and hot as well to be a good way to store virtual desktops. Another interesting use case for the flash arrays is as a repository for code created by agile programming environments. Programmers all want to upload their code at the end of the day, and this can put a huge strain on disk-based arrays.

"The EF540 and EF550 do not have the very high IOPS that some of these startup arrays have," Mooney tell EnterpriseTech. "But when you look at the amount of IOPS in a 2U controller, we may be only 400,000 IOPS but we are doing it in that small space and they are doing it in maybe 5U or 6U. And if you want more IOPS, our attitude is you just lay more units down and you are still saving on rack space."

Cramming the most IOPS into a space is a bit of a game, but what customers really want, says Mooney, is a flash storage system that provides consistent low latency across a wide variety of workloads.
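Mooney's density argument comes down to simple division. The sketch below is only an illustration: it uses the 400,000 IOPS and 2U figures from his quote, the function name is made up for this example, and the 5U and 6U sizes are the rough chassis heights he attributes to competitors, whose actual IOPS ratings are not given here.

```python
def iops_per_rack_unit(iops: int, rack_units: int) -> float:
    """Return IOPS delivered per rack unit (U) of chassis height."""
    return iops / rack_units


# Figures quoted by Mooney for the NetApp flash controller.
EF_SUSTAINED_IOPS = 400_000
EF_RACK_UNITS = 2

ef_density = iops_per_rack_unit(EF_SUSTAINED_IOPS, EF_RACK_UNITS)
print(f"NetApp 2U controller: {ef_density:,.0f} IOPS per U")

# How many IOPS a taller array would need just to match that density.
for competitor_units in (5, 6):
    breakeven = ef_density * competitor_units
    print(f"A {competitor_units}U array needs {breakeven:,.0f} IOPS to match")
```

By this measure, a 5U or 6U array has to deliver 1 million or 1.2 million IOPS, respectively, before it matches the rack density of the 2U box.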

"Where we actually win over the competition is the very predictable latency," Mooney continues. "What they have found with other flash arrays is that when the flash gets a little bit older or when the arrays start to do garbage collecting, because we have had so many years of experience in managing fault and misbehaving drives, and running a very thin, real-time operating system, we can move the data very efficiently through the array. Some flash array builders are really good at corner cases. But when we say you are going to get sub-millisecond or several millisecond response, you know you are going to get it consistently."

Like other E-Series arrays, the EF550 flash array runs the SANtricity operating system, which Mooney says has about 20 percent of the code depth of the high-end Data ONTAP storage operating system that is used on NetApp's higher-end FAS SAN arrays. This software provides snapshotting, thin provisioning, remote mirroring, and LUN setup, among other things, but does not have deduplication or compression as other NetApp products do. The E-Series arrays are designed to support analytics and HPC workloads in particular, where low cost and high throughput are the most important metrics. They are also sold on an OEM basis to several server makers for their own storage lines.

The EF550 has a two-socket controller that is powered by Intel's new "Ivy Bridge-EP" Xeon E5-2600 v2 processors, which offer a performance boost compared to the prior generation Xeon E5 chips used in the EF540. This controller card has 24 GB of its own memory. The eight Fibre Channel ports on the EF550 have been goosed to 16 Gb/sec speeds, double that of the prior machine; the array is also available with eight 6 Gb/sec SAS ports, eight 10 Gb/sec iSCSI ports, or four 40 Gb/sec InfiniBand ports.

The EF550 array uses 800 GB Lightning solid state drives from SanDisk, which come in a 2.5-inch form factor and are based on enterprise-grade SLC flash. The plan, says Mooney, is to move to eMLC flash in the future, probably next year, for the EF550 array. The base EF550 array is a 2U unit that has room for 24 2.5-inch SSDs, and with a maximum of four expansion enclosures that hang off of the base machine you can attach up to 120 drives to a single controller for 96 TB of flash capacity. The base unit has a burst I/O rate of 900,000 IOPS and a sustained rate of 400,000 IOPS and delivers up to 12 GB/sec of sustained throughput. A base EF550 loaded up with 19.2 TB of flash will cost on the order of $250,000.
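The capacity figures for the EF550 follow directly from the drive counts. Here is a rough sketch of the arithmetic, assuming decimal units (1 TB = 1,000 GB) and assuming each expansion enclosure holds 24 drives like the base unit, which is what the 120-drive maximum implies. The price is the approximate figure quoted above, so the cost per gigabyte is a ballpark number only.

```python
# Figures from the article; names are illustrative.
DRIVE_GB = 800                 # SanDisk Lightning SSD capacity
DRIVES_PER_ENCLOSURE = 24      # base unit; assumed the same for expansion shelves
MAX_EXPANSION_ENCLOSURES = 4
BASE_PRICE_USD = 250_000       # "on the order of", so approximate


def flash_capacity_gb(enclosures: int) -> int:
    """Raw flash capacity in GB for a given number of fully loaded enclosures."""
    return enclosures * DRIVES_PER_ENCLOSURE * DRIVE_GB


base_gb = flash_capacity_gb(1)
max_enclosures = 1 + MAX_EXPANSION_ENCLOSURES
max_drives = max_enclosures * DRIVES_PER_ENCLOSURE
max_gb = flash_capacity_gb(max_enclosures)
price_per_gb = BASE_PRICE_USD / base_gb

print(f"Base unit: {base_gb / 1000:.1f} TB raw")
print(f"Fully expanded: {max_drives} drives, {max_gb / 1000:.0f} TB raw")
print(f"Rough cost of base unit: ${price_per_gb:.2f} per GB")
```

At the quoted price, the base configuration works out to roughly $13 per raw gigabyte.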

The E5500 is the disk drive companion to the EF550 flash array. The E5500 comes in three different enclosures with various mixes of disk capacities, counts, and form factors and sports the same dual-socket Ivy Bridge Xeon E5 controller as the EF550. Here's what the three array controllers and their expansion enclosures look like:


As you can see, NetApp is supporting a mix of disks, and customers can also use the 800 GB SanDisk flash drives for tiered storage in the arrays if they so choose. The densest E5560 machine packs 60 disk drives in a 4U chassis for 240 TB using 4 TB disks, and with one controller and five expansion enclosures you can push it up to 1.54 PB of total capacity in six total enclosures. The E5524 is essentially the same chassis as the EF550, but with disks instead of SSDs. Because of the lower IOPS coming off disks, you can hang up to 30 expansion enclosures off this single controller enclosure. With two dozen of the 1.2 TB 2.5-inch disks in the controller array and a dozen 4 TB 3.5-inch drives in each expansion enclosure, you can have a total of 384 disks and a maximum of 1.47 PB. The E5512 uses 3.5-inch drives and can have as many as 192 drives using the twelve-drive enclosures, for a total of 768 TB. Depending on the configuration, the E5500 costs between $50,000 and $150,000, with a configuration with three dozen drives running around $125,000.
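The disk capacities quoted for the E5500 family can be checked the same way. This is a minimal sketch under stated assumptions: decimal units (1 PB = 1,000,000 GB), fully populated enclosures, and 4 TB drives in the E5512, which is inferred from 768 TB across 192 drives rather than stated outright. The `DriveGroup` type and the configuration names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveGroup:
    """A set of identical drives: how many, and the size of each in GB."""
    count: int
    drive_gb: int


def total_gb(groups: list[DriveGroup]) -> int:
    """Sum raw capacity in GB across all drive groups in a configuration."""
    return sum(g.count * g.drive_gb for g in groups)


configs = {
    # One 60-drive 4U chassis with 4 TB disks.
    "E5560 base": [DriveGroup(60, 4000)],
    # Controller chassis plus five expansion enclosures, 60 drives each.
    "E5560 six enclosures": [DriveGroup(6 * 60, 4000)],
    # 24 x 1.2 TB in the controller chassis, 30 shelves of 12 x 4 TB.
    "E5524 max": [DriveGroup(24, 1200), DriveGroup(30 * 12, 4000)],
    # 192 drives; 4 TB per drive is an inference, see above.
    "E5512 max": [DriveGroup(192, 4000)],
}

for name, groups in configs.items():
    drives = sum(g.count for g in groups)
    print(f"{name}: {drives} drives, {total_gb(groups) / 1_000_000:.3f} PB")
```

The E5524 and E5512 totals match the figures above. The six-enclosure E5560 product comes out at 1.44 PB rather than the quoted 1.54 PB; 384 drives at 4 TB each would give 1.536 PB, so the quoted figure may reflect a different drive count.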

Finally, there is a new entry-level NetApp array, the E2700. This machine has a controller that is based on a dual-core Power processor that is made by LSI Logic. This Power chip supports RAID data protection and runs the SANtricity software as well. The same expansion shelves that hang off the E5500 above can hang off the E2700. Here is what the three different models look like:


Mooney says that the E2700 will be popular as object storage as well as for backup and archiving of data and for embedded products like surveillance systems. The sweet spot for these arrays is somewhere between $12,000 and $50,000.

The new NetApp E-Series arrays support Microsoft Windows Server, SUSE Linux Enterprise Server, Red Hat Enterprise Linux, IBM AIX, Oracle Solaris, Hewlett-Packard HP-UX, and VMware ESXi.

All three families of machines can be ordered now and will start shipping in January.