Advanced Computing in the Age of AI | Wednesday, December 6, 2023

Making Sense of When to Use FPGAs 


While improvements in processors and software solutions have catalyzed major advances in high performance computing, the need for customizable performance is an ever-present challenge for organizations that rely on HPC systems.

The latest generation of HPC-optimized processors has delivered major advances in supercomputing performance. And although Moore’s Law continues to define how each new generation of processors exceeds its predecessors, engineers are exploring ways to accelerate workloads through non-x86 processors, such as FPGAs, which have more than doubled the speed of HPC systems on targeted applications compared to an Intel Xeon or Intel Xeon Phi processor alone.

With roughly 50 billion devices in use globally, each of them pumping out data, this capability is increasingly important. In parallel, data centres must support a diverse array of workloads and networked components with the speed and agility to interpret all the resulting data. For organizations reliant on mission-critical information, capturing and evaluating it represents an enormous challenge. FPGAs are an important tool allowing them to gain immediate insights from large volumes of incoming data, filter it down to something analysts can interpret and take appropriate actions dictated by those insights.

Where do FPGAs fit in? The idea of offloading some of the workload from the main processor to a co-processor has long been established. However, the downside of a standard co-processor is its limited adaptability. A co-processor can be designed as an accelerator for a particular workload, but if that same co-processor is dropped into a different workload scenario – one it is not specifically designed to accelerate – it may not deliver the goods.

In contrast, the FPGA is chameleon-like. It has the innate ability to be configured in unique ways to address particular workloads. It’s like a child’s Lego building set, in which bricks, windows, wheels and flat plates can be combined to create a miniature city, then re-combined next week to create a toy spaceship. FPGAs are like this, offering extreme flexibility, customizability and functionality.

An individual FPGA packs onto its surface millions of configurable components, up to 100 times more computational units than conventional GPUs and CPUs, including logic elements, DSP blocks, high speed transceivers, memory blocks, memory controllers and more.

A “blank slate” FPGA can be programmed to utilize any or all of those configurable elements optimized for a specific workload. An FPGA can be programmed to serve as something relatively simple, such as additional high-speed memory, or software can define the FPGA for such high speed capabilities as a storage controller, network adapter, a pre- or post-processor, or to run single precision floating point operations for intensive parallel computing applications.

FPGA offload is also energy efficient in two ways. First, FPGAs can take on tasks from the GPU or CPU to optimize the workload for system efficiency, reducing energy consumption. Second, because the FPGA’s role is defined by software, only the needed onboard elements of the FPGA are activated, leaving the remaining configurable elements dormant.

FPGA programming can be done by qualified onsite experts with skills in such languages as OpenCL, and customers can acquire additional tools to make the process easier.

Once the appropriate optimizations are coded, an FPGA can be configured permanently for a specific role or reprogrammed repeatedly to serve multiple purposes. Take a hypothetical research lab: one department might require an HPC system to process enormous amounts of weather data to model weather patterns, while another uses the same system for genome mapping. In milliseconds, software can be uploaded to the FPGA, optimizing it for one research department and then, later, for another. The FPGA’s on-the-fly flexibility simplifies HPC configurations, leaving more time for research and less time spent on system tuning. FPGAs can also save money, since hardware components do not need to be changed out and there’s no need for a redundant HPC system configured for another department’s unique requirements.

The security industry can also benefit from FPGAs. That business, with its constantly changing security information, ensuing threat analysis, encryption, decryption and security mechanisms, demands mission-critical speed and customization options. Milliseconds count. FPGAs enable security experts to manage day-to-day security needs, as well as potential emergent threats, more quickly.

When Are FPGAs the Right Choice?

While FPGAs can generate improvements for many HPC and data centre applications, they are not a one-stop solution for every scenario. According to Mark Kachmarek, HPC platform marketing manager at Intel, workloads for which single precision and low-wattage performance acceleration are important are likely candidates for FPGA benefits.

Source: Intel

In contrast, a CPU with a many-integrated core architecture shows excellent performance for highly parallel computing workloads, such as simulation, molecular dynamics and weather, to name a few. And there are other workloads, such as life sciences, financial trading and particle physics data processing, where a hybrid FPGA and mainstream CPU package may provide the best approach.

How can an organization evaluate whether FPGAs will offer their data centre significant benefits? “While it is challenging to make a blanket determination without knowing the specifics around an organization’s workload,” notes Ian Land, formerly with Altera and now an Intel senior FPGA marketing manager, “there are some important criteria which can help determine if an FPGA’s contribution can better support an organization’s supercomputers. These factors include system-level, development-level and workload-level decisions.”

On a system level, FPGAs serve as a “flow engine,” providing real-time parallel processing of data flows with low latency. Multipurpose processors in high-bandwidth networking scenarios take data stored in memory, process it, and store the results in memory again for further evaluation. By contrast, FPGAs can process that same data in real time, before it is stored in memory. The benefits are faster insights and lower memory requirements for data analysis, which reduce HPC system cost of ownership.
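The difference between the two data paths can be sketched in software. The Python below is only an analogy, not FPGA code: the function names, the threshold and the sample readings are assumptions made for illustration. It contrasts a store-then-process path with a path that filters each record in flight, and counts how many records each path has to hold in memory.

```python
from typing import Iterable, Iterator, List, Tuple


def store_then_process(records: Iterable[float], threshold: float) -> Tuple[List[float], int]:
    """General-purpose path: buffer every record, then filter the buffer."""
    buffer = list(records)                      # everything lands in memory first
    kept = [r for r in buffer if r >= threshold]
    return kept, len(buffer)                    # results, and how many records were stored


def filter_in_flight(records: Iterable[float], threshold: float) -> Iterator[float]:
    """Flow-engine path: decide on each record as it arrives, before storage."""
    for r in records:
        if r >= threshold:
            yield r


def stream_then_store(records: Iterable[float], threshold: float) -> Tuple[List[float], int]:
    """Store only what survived the in-flight filter."""
    stored = list(filter_in_flight(records, threshold))
    return stored, len(stored)


# Hypothetical sensor readings; only values at or above 1.0 are of interest.
readings = [0.2, 3.5, 0.1, 7.8, 0.4, 5.1]
batch_kept, batch_stored = store_then_process(readings, 1.0)
flow_kept, flow_stored = stream_then_store(readings, 1.0)
print(batch_kept == flow_kept, batch_stored, flow_stored)
```

Both paths keep the same three readings, but the first stores all six records before filtering while the second stores only the three that pass. On an FPGA the filtering step would run in hardware at line rate; the sketch only shows why filtering before storage lowers memory requirements.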

Additional workload-level decisions also impact potential FPGA benefits. FPGAs can process data for a target set of workloads more quickly than generalized processors. Common FPGA data centre workloads include filtering, compression, cryptography, data analytics and AI inference. An example is the filtering of parallel data streams for financial applications, which provide information to high frequency trading systems that make trades in fractions of a second.
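As a rough software illustration of the financial filtering example, the Python sketch below merges several time-ordered feeds and passes along only the ticks for watched symbols. The `Tick` record, the `filter_feeds` function, the symbols and the prices are all hypothetical, and the code runs sequentially; an FPGA would filter each feed in parallel in hardware.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple, Set


class Tick(NamedTuple):
    timestamp: int    # hypothetical time unit, for example microseconds
    symbol: str
    price: float


def filter_feeds(feeds: Iterable[Iterable[Tick]], watched: Set[str]) -> Iterator[Tick]:
    """Merge feeds that are each sorted by timestamp, keeping only watched symbols."""
    merged = heapq.merge(*feeds, key=lambda t: t.timestamp)
    return (t for t in merged if t.symbol in watched)


# Two made-up market data feeds, each already in timestamp order.
feed_a = [Tick(1, "AAA", 10.0), Tick(4, "BBB", 20.5)]
feed_b = [Tick(2, "CCC", 5.0), Tick(3, "AAA", 10.1)]

for tick in filter_feeds([feed_a, feed_b], {"AAA"}):
    print(tick.timestamp, tick.symbol, tick.price)
```

The downstream trading system sees a single, time-ordered stream that contains only the symbols it cares about, which is the role the filtering stage plays in front of a high frequency trading system.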

An organization’s development capabilities are another criterion. Land reflects on the early days of FPGA programming, noting the expertise required. “FPGAs proved challenging for developers in the old days. Without specialized knowledge, few programmers could use them. Today, things are very different. We have early access programs so customers familiar with the Intel Architecture can put their expertise to work on systems with custom FPGA acceleration.”

Rob Johnson is the owner of Fine Tuning, LLC, a Portland, OR, marketing agency, and a former consultant to a Fortune 25 technology company.