Advanced Computing in the Age of AI | Thursday, March 28, 2024

Xilinx Aims Its Newest Versal HBM Accelerators, Coming in 2022, at Big Data Compute Workloads

Just a month after unveiling its latest Versal AI Edge accelerators, FPGA maker Xilinx has announced its newest device family, the Versal HBM series, which includes fast memory, secure connectivity and adaptable compute in a single platform.

Built for data centers, networking, aerospace, defense and other markets where high performance is required, the Versal HBM (high bandwidth memory) series is the company’s latest adaptive compute acceleration platform (ACAP) for enterprises.

The Versal HBM series, which includes two models that come in five memory configurations, integrates a heterogeneous accelerator with memory that is adjacent to the compute, alleviating some typical and debilitating compute bottlenecks, Mike Thompson, the senior product line manager for Xilinx Versal Premium and HBM ACAPs and for Virtex UltraScale+ FPGAs, told EnterpriseAI.

Mike Thompson, Xilinx

“And with the rest of the platform here, we have scalar engines with integrated Arm Cortex-A72 application processors, real-time processors with the Arm Cortex-R5F, a platform management controller that does configurations, secure boot, power management, this sort of thing,” he said. “There’s a lot of adaptable hardware, which is programmable logic from FPGAs, which does really fast, highly-parallel, high-performance acceleration that can adapt much more than ASICs or ASSPs that are more fixed-function.” Also included are DSP accelerators, hardened engines for high-speed connectivity and more.

Architected for the higher memory needs of the most compute-intensive, memory-bound applications in data center, wired networking, test and measurement, and aerospace and defense, the Versal HBM series ACAPs include HBM2e DRAM. According to Xilinx, the memory provides 820GB/s of throughput and 32GB of capacity, which the company says amounts to 8X more memory bandwidth and 63 percent lower power consumption than DDR5 implementations.
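To put those memory figures in perspective, the short Python sketch below works through the arithmetic they imply. It uses only the numbers Xilinx quoted (820GB/s, 32GB, 8X). The function names are hypothetical, and the derived DDR5 baseline is an inference from the 8X claim, not a figure Xilinx published in this announcement.

```python
# Back-of-the-envelope arithmetic from the vendor-quoted figures.
# Derived values are estimates, not Xilinx specifications.

HBM_BANDWIDTH_GB_PER_S = 820.0   # quoted HBM2e throughput
HBM_CAPACITY_GB = 32.0           # quoted HBM2e capacity
BANDWIDTH_RATIO_VS_DDR5 = 8.0    # quoted "8X more memory bandwidth"


def implied_ddr5_bandwidth(hbm_gb_per_s=HBM_BANDWIDTH_GB_PER_S,
                           ratio=BANDWIDTH_RATIO_VS_DDR5):
    """DDR5 bandwidth implied by the 8X claim (an inference, in GB/s)."""
    return hbm_gb_per_s / ratio


def full_sweep_seconds(capacity_gb, bandwidth_gb_per_s):
    """Ideal time to stream the whole memory once, ignoring all overheads."""
    return capacity_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    ddr5 = implied_ddr5_bandwidth()
    print(f"Implied DDR5 baseline: {ddr5:.1f} GB/s")
    hbm_ms = full_sweep_seconds(HBM_CAPACITY_GB, HBM_BANDWIDTH_GB_PER_S) * 1e3
    ddr5_ms = full_sweep_seconds(HBM_CAPACITY_GB, ddr5) * 1e3
    print(f"One pass over 32GB at HBM rate:  {hbm_ms:.1f} ms")
    print(f"One pass over 32GB at DDR5 rate: {ddr5_ms:.1f} ms")
```

These are ideal-case numbers. Real workloads rarely sustain peak bandwidth, so the sketch is best read as a way to see what the 8X ratio means for a memory-bound pass over a large data set.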

Built on the foundation of the earlier Versal Premium series, the Versal HBM devices offer 5.6Tb/s of serial bandwidth with 112Gb/s PAM4 transceivers, 2.4Tb/s of scalable Ethernet bandwidth, 1.2Tb/s of line rate encryption throughput, 600Gb/s of Interlaken connectivity, and 1.5Tb/s of PCIe Gen5 bandwidth with built-in DMA, supporting both CCIX and CXL. Combined, the capabilities provide off-the-shelf compatibility with a wide range of protocols, data rates and optical standards needed by developers looking to use the HBM devices in production.
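The connectivity figures are quoted in bits per second while the memory figure is quoted in bytes per second, which makes them hard to compare at a glance. The Python sketch below converts the quoted link rates to GB/s and expresses each as a share of the 820GB/s HBM bandwidth. The dictionary and function names are hypothetical, and the conversion is a plain divide-by-eight that ignores encoding and protocol overhead.

```python
# Unit conversion of the vendor-quoted I/O rates (Gb/s) to GB/s.
# Names are illustrative; overheads such as line encoding are ignored.

QUOTED_LINK_RATES_GBIT_PER_S = {
    "serial transceivers": 5600,   # 5.6Tb/s
    "Ethernet": 2400,              # 2.4Tb/s
    "line-rate encryption": 1200,  # 1.2Tb/s
    "Interlaken": 600,             # 600Gb/s
    "PCIe Gen5": 1500,             # 1.5Tb/s
}
HBM_BANDWIDTH_GB_PER_S = 820.0     # quoted HBM2e throughput


def gbit_to_gbyte_per_s(gbit_per_s):
    """Convert gigabits per second to gigabytes per second."""
    return gbit_per_s / 8.0


def share_of_hbm(gbit_per_s, hbm_gb_per_s=HBM_BANDWIDTH_GB_PER_S):
    """Link rate as a fraction of the quoted HBM bandwidth."""
    return gbit_to_gbyte_per_s(gbit_per_s) / hbm_gb_per_s


if __name__ == "__main__":
    for name, rate in QUOTED_LINK_RATES_GBIT_PER_S.items():
        gbytes = gbit_to_gbyte_per_s(rate)
        print(f"{name:22s} {gbytes:7.1f} GB/s  "
              f"({share_of_hbm(rate):.0%} of HBM bandwidth)")
```

Read this way, even the largest quoted I/O figure sits below the quoted memory bandwidth, which is consistent with the article's point that the memory is sized to keep up with data arriving over the network.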

One of the most important features of the Versal HBM series is its adaptive, heterogeneous compute platform, which is built to accelerate a wide range of workloads that use large data sets. Versal HBM series devices can dynamically reconfigure hardware in milliseconds to adapt to evolving algorithms and emerging protocols, eliminating the need for hardware redesign and re-deployment, according to the company.

Hardware developers can use Xilinx’s Vivado Design Suite to work with the Versal HBM devices, while software developers can use the Xilinx Vitis unified software platform. Data scientists who want to use Versal HBM can design and build using the Vitis AI platform and take advantage of its domain-specific frameworks and acceleration libraries.

Because the Versal HBM series is built on the same 7nm TSMC process as other Versal series devices, developers can start prototyping on Versal Premium series devices and evaluation boards and migrate later to the Versal HBM series when it becomes available, said Thompson. The Versal HBM series is expected to begin sampling in the first half of 2022. Documentation is available now and tools will be available in the second half of 2021 via an early access program, according to the company.

Why Versal HBM Was Created

“There are three major trends that led us to architect this product,” said Thompson: rising network traffic, exponential growth in the data to be processed, and data security that could not keep up with those demands.

“The challenge with white boxes, especially in servers, is that they are really good at being general purpose, but they are not very high performance,” he said. “They are really good at doing generic, specific things. For a lot of the applications that are really driving the market – recommendation engines, fraud detection, intrusion detection, database acceleration – they do not offer nearly the performance that the algorithm deployers, the people who are running these applications and algorithms, really need.”

Xilinx Versal HBM block diagram.

The Versal HBM ACAPs, like the other ACAPs in Xilinx’s Versal family, include FPGAs as components, but they add other technologies that carry the extra load and deliver the additional acceleration and performance of the company’s ACAP families, Thompson said.

“There are FPGAs contained within them, but we believe that through the level of heterogeneous integration, it really is a new class … that are far and away beyond what traditional FPGAs have done,” he said.

The 7nm Versal HBM chips use FinFET (fin field-effect transistor) process technology, giving them higher performance based on the latest process manufacturing techniques, said Thompson.

Jim McGregor, analyst

Jim McGregor, an analyst with TIRIAS Research, told EnterpriseAI that the latest Xilinx Versal HBM devices are built for today’s growing demands for accelerated computing.

“Typically, when we think of accelerated computing, we think of the workloads that we have to run on a server or a system or device,” said McGregor. “But there are a lot of workloads within a server that can be accelerated through offloading them – the network management, the security, some of the system overhead and management overhead – all this stuff can be offloaded” from the system’s CPU to a Versal HBM to speed up the work that is being done.

“Absolutely, especially if you are providing data center services like an Amazon or Microsoft Azure or somebody like that,” said McGregor. “The more you can offload these functions and/or process data coming through the network faster, the more you are going to be able to leverage the other parts of the system, the host processors to GPUs or FPGAs or other types of accelerators that are actually generating money. The more you can offload those, the higher the return on your investment. It is basically like a network processor on steroids.”

Asked if Xilinx was announcing the new HBM series a bit early, McGregor said the early unveiling is to the company’s credit because it is one of the few vendors that provides a full-year rolling calendar of when and how it will release its products.

“It is not like they are doing things at the last minute,” he said. “They are trying to build up to a cadence” that can help developers and customers plan their implementation and deployment strategies for the hardware and software.

Some Versal Series History

The Versal ACAP architecture, which debuted in 2018, brought needed technology to the company’s earlier Versal chip families and is now being miniaturized to deliver performance at low power in a wide range of edge environments.

ACAPs are adaptive SoCs which combine scalar engines for embedded compute, adaptable engines for sensor fusion and hardware adaptability, and intelligent engines for AI inference. Also included are fast memory and interfaces that help deliver acceleration for applications.

Xilinx’s Versal AI Core and Prime series were announced in 2018 and are in full production, while the Versal Premium series was unveiled in early 2020 and is shipping to early access customers. The Versal AI Edge family, which is aimed at edge workloads, was announced in June 2021. One more Versal family, the AI RF series, is planned for after the new HBM series is released.

Things got even more interesting for Xilinx in the marketplace in October of 2020, when it announced that it is being acquired by chipmaker AMD for $35 billion in an all-stock transaction. The acquisition helps AMD keep pace during a time of consolidation in the semiconductor industry. GPU rival and market leader Nvidia acquired Mellanox (interconnect) and continues to make its way through a deal to buy Arm (processor IP).
