Advanced Computing in the Age of AI | Friday, May 20, 2022

High Performance Data Storage: Keeping Up with Runaway Data Growth 
Sponsored Content by Dell EMC

To keep pace with rapid and unrelenting data growth, high performance computing needs high performance data storage.

Across a broad spectrum of the high performance computing world, data storage challenges are growing larger by the day. From life sciences and financial services to manufacturing and telecommunications, organizations are finding they need not just more storage, but ever-faster storage to meet the demands of today’s data-intensive workloads.

Industry observers agree with that point. When they talk about the storage challenges that lie ahead, they toss out words like “exponential growth,” “explosive growth” and “overwhelming data storage challenges.” That last phrase comes from Lawrence Berkeley National Laboratory’s National Energy Research Scientific Computing Center (NERSC) and the introduction to its “NERSC Storage 2020” report.

How overwhelming? Over the past 10 years, the total volume of data stored at NERSC has grown at an annual rate of 30 percent. NERSC warns that the future will bring much more of the same, fueled by “massive data rate increases from various detectors and sensors and the need for analysis, management, archiving and curation capabilities beyond what is common today.”[1]
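To put that growth rate in perspective, here is a minimal sketch of the compounding arithmetic. It assumes the 30 percent figure compounds once per year over the full 10 years. The function names are illustrative and do not come from the NERSC report.

```python
import math


def growth_factor(annual_rate: float, years: int) -> float:
    """Cumulative multiplier after compounding annual_rate once per year."""
    return (1.0 + annual_rate) ** years


def doubling_time(annual_rate: float) -> float:
    """Years needed for the stored volume to double at a steady annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


# Assumed inputs taken from the paragraph above: 30 percent a year for 10 years.
factor = growth_factor(0.30, 10)
years_to_double = doubling_time(0.30)

print(f"Growth over 10 years: about {factor:.1f}x")
print(f"Doubling time: about {years_to_double:.1f} years")
```

Under those assumptions, stored data grows to nearly 14 times its starting volume over the decade and doubles roughly every two and a half years.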

The ever-growing HPC data storage challenge is evident in numbers from the technology marketplace. Global HPC external storage revenues will grow at a compound annual rate of 7.8 percent over the 2016–2021 timeframe, to hit $6.3 billion in 2021, according to a Hyperion Research forecast. HPC server sales, in turn, will grow at a more modest annual rate of 5.8 percent over the same period.[2]

New technologies

The global reality of runaway data growth is fueling demand for new, faster storage technologies that can be collectively referred to as high performance data storage, a hot topic in IT circles. And there’s good news on this front: Storage technologies and architectures are evolving rapidly to meet that demand.

Let’s consider some examples:

  • Fast solid-state drives, such as those in the Intel® Solid State Drive family, are emerging as a new tier of high performance storage in HPC systems.
  • Non-volatile memory offerings, such as NVMe, NVMe over Fabrics and Intel® Optane™ technology, enable memory-like latency and performance at storage-like capacity and cost.
  • Hyper-converged infrastructure solutions with software-defined storage, like those in the Dell EMC VxRail Appliance family, tightly integrate compute, storage and networking to reduce latency and accelerate throughput.
  • Emerging flash solutions promise to reduce latency and accelerate storage performance for unstructured file data.

And here’s even more good news: In many cases, prices are falling for NVMe, SSD, flash and other high performance storage media. This trend is putting high performance data storage within the reach of more organizations that want to leverage HPC solutions for applications like machine learning and large-scale data analytics.

Lustre keeps its shine

The open source parallel distributed file system known as Lustre, which is used widely in HPC systems, is alive and well today. In April 2018, the Open Scalable File Systems (OpenSFS) organization announced the release of Lustre 2.11.0, calling it “the fastest and most scalable parallel file system.”[3]

That’s good news for organizations running HPC clusters, because Lustre is renowned for its ability to scale to meet ever-larger storage workloads — including the biggest file system jobs around. It’s in many of the world’s largest supercomputers.

Capitalizing on Lustre

Among the users of the Lustre parallel file system is the Swinburne University of Technology in Australia. It is using Lustre in its OzSTAR supercomputer, one of the most powerful computers in the country. The massive HPC system from Dell EMC will enable the Swinburne-based Australian Research Council Centre of Excellence for Gravitational Wave Discovery (OzGrav) to search for gravitational waves and study the extreme physics of black holes and warped space-time.

OzGrav Director Professor Matthew Bailes says OzSTAR will be used to sift through reams of data and will be powerful enough to search for coalescing black holes and neutron stars in real time. “In one second, OzSTAR can perform 10,000 calculations for every one of the 100 billion stars in our galaxy,” Bailes says.[4]

Calculations of that magnitude require not just amazing processor performance, but also amazing storage performance. And that’s the goal of high performance data storage — storage performance that stays in lockstep with processor performance. This is clearly one of the keys to driving ongoing advances in any industry that relies heavily on HPC — which is now just about all industries.

For a closer look at the technologies and products for high performance data storage, visit the Dell EMC data storage site.

[1] NERSC, “New 'Storage 2020' Report Outlines Vision for Future HPC Storage,” Dec. 1, 2017.

[2] HPCwire, “Hyperion: Storage to Lead HPC Growth in 2016-2021,” July 27, 2017.

[3] OpenSFS, “Lustre 2.11.0 Released,” April 3, 2018.

[4] Swinburne University of Technology, “Swinburne supercomputer to be one of the most powerful in Australia,” March 7, 2018.

