
‘Implicit HPC’: Using HPC for AI, Advanced Analytics – Without Knowing It 

With greater capability and improved ease of access, HPC cycles are more commonly being put in the hands of engineers and scientists who aren’t computer experts.  The infrastructure under development allows users to leverage the power of HPC without knowing that’s what they’re doing – it’s called “implicit HPC,” and it has major implications for commercial/industrial organizations pursuing increasingly advanced AI and big data analytics workloads.

When supercomputers came on the scene decades ago, their domain was almost exclusively scientific simulation. Over time, commoditization of hardware and the evolution of the software stacks that run on it have made HPC more broadly understood and accessible. Although “supercomputing” and “HPC” are often used interchangeably, “supercomputer” refers to the largest, most powerful HPC machines, which tend to be used by computer science or computational science experts. As HPC has matured and democratized, multiple tiers of usage have emerged.

Here’s a breakdown:

Supercomputing – Often found in national labs and government facilities doing science on a grand scale (e.g., atomic or astronomic simulation). The focus is on performance and scalability, on achieving breakthroughs in science, engineering and the advancement of computing itself. A significant learning curve is usually involved. These systems run at the highest rates of floating point operations per second – tens to hundreds of petaflops (1 Pflop/s = 10^15 flop/s) or greater (Summit, currently the world’s fastest supercomputer, delivers roughly 200 Pflop/s).

Mid-level or Enterprise HPC – Found in Fortune 100 companies in sectors such as oil and gas, aeronautics, automotive, medical and financial services. The focus is on using HPC for complex, large-scale workloads with real- or near-real-time delivery of results. System performance spans from teraflop/s to low-petaflop/s speeds (roughly 0.87 to 6.7 Pflop/s). Those systems are still sufficiently fast to deliver more than adequate compute power for most enterprises.

“Implicit” HPC – Found in a variety of companies, including those dealing with big data, analytics and machine learning. The focus is on end-user transparency and on results rather than on throughput or the computing process. Interestingly, some supercomputing centers are starting to provide implicit HPC portals. These can scale to 895 Tflop/s (0.89 Pflop/s) with systems such as Bridges at the Pittsburgh Supercomputing Center (PSC).
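For readers less familiar with the units quoted in these tiers, here is a minimal Python sketch (using only the figures already cited above) that converts between teraflop/s, petaflop/s and raw flop/s:

```python
# Unit conversions for the flop/s figures quoted in the tiers above.
TERA = 10**12   # 1 Tflop/s = 10^12 floating point operations per second
PETA = 10**15   # 1 Pflop/s = 10^15 floating point operations per second

summit_pflops = 200    # Summit, cited above at roughly 200 Pflop/s
bridges_tflops = 895   # Bridges at PSC, cited above at 895 Tflop/s

print(f"Summit:  {summit_pflops * PETA:.3e} flop/s")           # 2.000e+17 flop/s
print(f"Bridges: {bridges_tflops * TERA / PETA:.3f} Pflop/s")  # 0.895 Pflop/s
```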

With implicit HPC, researchers, designers, marketers and analysts with little expertise in computational science can access vast computing resources. It is an environment free of the need to configure an application, build it for a particular machine, or even think about whether computing cycles are available.

Implicit HPC is a relatively new phenomenon. As recently as five to seven years ago, when a business bought an HPC system, boxes of expensive equipment would arrive, technicians would show up to connect it – for a price – and then the business would be left to figure out what users needed, what software to run, and where to find that software.

While the desire for HPC proliferated, the cost, frustration and expertise involved in installing and maintaining HPC systems remained a roadblock. Since then, vendors, researchers, manufacturers and end users have begun discussing these issues and working on recipes for what an HPC machine should look like – first with one another, and then through global communities such as OpenHPC, which integrates commonly used HPC components into a cohesive, comprehensive software stack that is freely available as an open source distribution.

These and other efforts are making HPC more broadly accessible. Market demand has commoditized hardware and software. Education has kept pace, and career paths have been established. Today, it is easier to put together an HPC cluster, and many people are active in the global effort to create computing ecosystem recipes for HPC systems. The integrated software stack that OpenHPC released to the public is being embraced and downloaded hundreds of times per month.

Businesses interested in HPC can readily find specifications and recipes to stand up an HPC machine. With a baseline capability in place, businesses have plenty of latitude for customization.

Data has become the center of computation

Big data is another significant driver of implicit HPC. Because of the size of datasets, engineers and scientists are interested in technologies in which the computation can be brought to the data, instead of the other way around. To complement this, HPC centers are providing tools that allow users with little or no computer or computational science expertise to leverage both the data and the HPC resources. Enterprise users are now better able to dive into customer behavior and sales trends, design better products, and optimize their supply and delivery.

Portal-style HPC models, such as the gateways at the Pittsburgh Supercomputing Center and the San Diego Supercomputer Center, can take requests as intuitively as, “This is my dataset, and this is the kind of analysis that I want to run on it.” With the HPC system and its operations hidden from the user, the mechanisms behind the portal allocate the computing resources to perform the analysis.
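As a rough illustration of what that looks like from the user’s side (a hypothetical Python sketch, not the actual PSC or SDSC gateway interface), the request names a dataset and an analysis, and everything about nodes, queues and schedulers stays behind the portal:

```python
from dataclasses import dataclass

# Hypothetical sketch of a portal-style "implicit HPC" request.
# The names below (AnalysisRequest, submit) are illustrative only;
# they are not the real PSC or SDSC gateway interfaces.

@dataclass
class AnalysisRequest:
    dataset: str       # "This is my dataset..."
    analysis: str      # "...and this is the analysis I want to run on it."
    parameters: dict

def submit(request: AnalysisRequest) -> str:
    """Stand-in for the portal back end: in a real gateway, this is where
    resources are allocated and a job is handed to the scheduler, entirely
    hidden from the user."""
    print(f"Running '{request.analysis}' on '{request.dataset}' ...")
    return "job-0001"  # an opaque handle; the user never sees the cluster

job_id = submit(AnalysisRequest(
    dataset="q3_customer_transactions.csv",
    analysis="sales-trend clustering",
    parameters={"clusters": 8},
))
print("Submitted:", job_id)
```

The point of the sketch is the shape of the interaction rather than any implementation detail: the user supplies intent, and the portal owns resource allocation.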

Another example is MIT’s Lincoln Laboratory Supercomputing Center (LLSC), where analysts, engineers and scientists are tapping into New England's most powerful supercomputer from their desktops. LLSC engineers created software and tools that make interacting with supercomputers as user-friendly and accessible as computing on a desktop system. The software does the work; researchers simply submit their problems using techniques they are familiar with and get answers fast.

Where to go from here?

Related to implicit HPC is the convergence of HPC and machine learning. Pointing ML algorithms at the massive data sets used and generated by HPC applications allows greater understanding to be gleaned from the data, often with less expertise required of the end user. For greater efficiency, scientists are looking to colocate machine learning frameworks and HPC applications on the same hardware.
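As a minimal, hedged sketch of that pattern, with toy data standing in for real simulation output and a NumPy least-squares fit standing in for a full machine learning framework, the learning step runs directly against the data the simulation just produced, on the same system, rather than exporting it elsewhere first:

```python
import numpy as np

# Toy stand-in for an HPC application: a "simulation" that produces a large
# array of (input parameters, observed quantity) samples.
rng = np.random.default_rng(0)
params = rng.uniform(0.0, 1.0, size=(100_000, 3))        # simulation inputs
observed = (params @ np.array([2.0, -1.0, 0.5])
            + 0.01 * rng.standard_normal(100_000))        # simulated outputs

# "Colocated" learning step: fit a simple surrogate model directly against
# the in-memory simulation output instead of shipping it to another system.
design = np.column_stack([params, np.ones(len(params))])  # add an intercept term
coeffs, *_ = np.linalg.lstsq(design, observed, rcond=None)

print("Recovered coefficients:", np.round(coeffs[:3], 3))  # close to [2.0, -1.0, 0.5]
print("Intercept:", round(float(coeffs[3]), 3))            # close to 0.0
```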

For example, disease epidemiology offers a tantalizing glimpse of a future in which HPC and machine learning work in concert, still in the background, with major impact. After marshalling massive amounts of data on a wide array of factors, including weather, consumer behavior, travel patterns, and work attendance, along with simulation of the spread of germs, it may be possible to predict where flu outbreaks are likely to occur.[1] If HPC clusters performing machine learning were accessed by virtual assistants, such as Apple’s Siri or Samsung’s Bixby, study results of that kind would extend far beyond the scientific community into homes, schools, government, and businesses.

For businesses with large-scale enterprise analytics needs, combining rich data sets with implicit HPC could expand existing markets or point the way to new ones.

Dr. Robert Wisniewski is chief software architect, Extreme Scale Computing, Intel Corporation.

[1] Two examples in this space include “That Tweet You Just Sent Could Help Predict a Flu Outbreak” and “Can Machine Learning and Crowdsourcing Fight the Flu?”
