Advanced Computing in the Age of AI | Wednesday, April 24, 2024

Nvidia Touts DPU Efficiency in Datacenter Use 

As Moore’s law slows, chips and datacenters are devouring more power than ever to maintain performance gains – and so chipmakers, programmers and system operators look increasingly to specialization for efficiency. Nvidia pitches its BlueField data processing units (DPUs) as one such category of devices, promising the same benefits for networking, storage and security (tasks that typically fall to CPUs) that GPUs provide for graphics. With BlueField-2 in its second year and BlueField-3 on the way, Nvidia today highlighted the efficiency benefits of using its DPUs in place of CPUs for specific tasks.

The measurements were made using real-world tests with several major tech firms. Nvidia worked with Red Hat to test BlueField-2 DPUs for virtualization, encryption and networking tasks on Red Hat’s widely used OpenShift containerization software. The companies found that the BlueField-2 reduced networking demands on CPUs by 70% and accelerated those jobs by 54×. Nvidia told HPCwire that the company is continuing to run DPU tests with Red Hat and is hoping to measure specific server power consumption differences soon.

Red Hat OpenShift, running on Red Hat Enterprise Linux CoreOS, offloads both the networking data plane and control plane to the BlueField-2 DPU, via OVN and OVS. The DPU is running Red Hat Enterprise Linux on its Arm cores. (Credit: Nvidia)

Similarly, Nvidia and VMware ran tests of the BlueField-2 supporting VMware’s new vSphere 8 virtualization software, with the DPU handling a Redis key-value store. Nvidia reported that, with 36 Redis streams running, the DPU-equipped server was 3.5% faster and used 12 fewer CPU cores (out of 64 previously busy cores). While the DPU does use extra power – Nvidia estimates it at +65W per server – the company also points out that a double-digit percentage reduction in the number of servers needed translates to far more substantial power savings.

“If you look at the BlueField data processing unit and you compare it to a CPU, you’re sort of like, ‘oh, well, that consumes more power wattage than a traditional CPU,’ and you think to yourself: do I want to consume additional power?” said Ami Badani, vice president of marketing at Nvidia, in an interview with HPCwire. “But then when you look at it in totality with the CPU core savings or the reduction in servers, you actually realize you’re consuming less power watts in totality than looking at it as a single, individual unit. That, to us, was the highlight we wanted to make as we did these studies.”

For instance, Nvidia suggested, a 10,000-server Redis-on-VMware deployment that needed 15% fewer servers could save tens of millions of dollars in combined capital expenditure and energy costs. A similar use case was presented in a session by Nvidia and Comcast at Nvidia’s spring GTC event, which covered Comcast’s work to use DPUs to offload DNS encryption and reduce the number of necessary servers.
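The trade-off Nvidia describes (each DPU adds about 65W, but fewer servers are needed) can be checked with simple arithmetic. The sketch below is illustrative only: the 10,000-server fleet, the 15% reduction and the +65W DPU figure come from Nvidia's example, while the 500W per-server baseline is a hypothetical number chosen for illustration, and the function names are made up for this sketch. It also assumes every server draws the same power and ignores cooling and any savings from freed CPU cores.

```python
def fleet_power_kw(servers: int, watts_per_server: float) -> float:
    """Total fleet draw in kilowatts, assuming identical servers."""
    return servers * watts_per_server / 1000.0


def dpu_fleet_comparison(baseline_servers: int, server_watts: float,
                         dpu_watts: float, server_reduction: float):
    """Compare a CPU-only fleet with a smaller, DPU-equipped fleet.

    Returns (dpu_servers, baseline_kw, dpu_kw, saved_kw).
    """
    dpu_servers = round(baseline_servers * (1 - server_reduction))
    baseline_kw = fleet_power_kw(baseline_servers, server_watts)
    # Each remaining server pays the extra DPU power on top of its baseline.
    dpu_kw = fleet_power_kw(dpu_servers, server_watts + dpu_watts)
    return dpu_servers, baseline_kw, dpu_kw, baseline_kw - dpu_kw


def break_even_reduction(server_watts: float, dpu_watts: float) -> float:
    """Fraction of servers that must be eliminated for power to break even."""
    return dpu_watts / (server_watts + dpu_watts)


if __name__ == "__main__":
    # 500 W per server is an assumed figure, not one reported by Nvidia.
    servers, base_kw, dpu_kw, saved_kw = dpu_fleet_comparison(
        baseline_servers=10_000, server_watts=500.0,
        dpu_watts=65.0, server_reduction=0.15)
    print(f"DPU fleet: {servers} servers, {dpu_kw:.1f} kW "
          f"vs {base_kw:.1f} kW baseline ({saved_kw:.1f} kW saved)")
    print(f"Break-even server reduction: "
          f"{break_even_reduction(500.0, 65.0):.1%}")
```

Under these assumptions the DPU-equipped fleet draws less power overall whenever the server reduction exceeds roughly 11.5%, which is consistent with the "double-digit" reduction Nvidia cites; a different baseline wattage would shift that threshold.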

Nvidia also tested the IPsec datacenter encryption protocol on BlueField DPUs, finding a 21% power consumption reduction for servers and a 34% reduction for clients – again, the company says, with implications for millions of dollars in savings over a few years.

Relatedly, Nvidia highlighted work with Ericsson to test offloading of a 5G user plane function to Nvidia’s ConnectX SmartNICs – which serve as components of the full BlueField DPUs. In the Ericsson test, the company reported that server CPUs consumed 24% less energy – with implications for millions in power savings for a large datacenter over several years.

“As energy prices continue to rise, there is a growing sense of urgency among communication service providers to find and implement innovative solutions that reduce network energy consumption,” wrote Ericsson CTO Erik Ekudden in an article discussing the work. “Since datacenters represent a sizable share of overall network energy consumption, any technique that could reduce the amount of energy they consume would have a significant positive impact.”

This sentiment is echoed in the DPU power efficiency white paper released by Nvidia, which points out that datacenter electricity consumption is projected to increase from ~1% of worldwide electricity to 3-13% of worldwide electricity; further, the paper reads, electricity costs are increasing in many locations. “In a world facing rising energy costs and increasing demand for green IT infrastructure, the use of DPUs will become increasingly popular to reduce TCO by decreasing both CapEx and OpEx in the datacenter,” the paper concludes. Nvidia is hoping that these studies will spark more interest in solutions to the looming crisis, and the company says that it is working on similar research regarding GPUs.

While all of these tests (minus the Ericsson case) were run on BlueField-2 DPUs, Nvidia’s BlueField-3 DPU is expected next year – and Nvidia expects even more improvements for datacenter efficiency. “With every next generation of BlueField, I think it’ll get better in terms of compute power, it’ll get better in terms of performance, and all of that will translate into power efficiency,” Badani said. “That’s some of the work that’s underway: how do we benchmark? Now that we have BlueField-2, the next generation of this [study] will show BlueField-3, and I think the expectation is we’ll probably have better power efficiency as a result.”

Nvidia BlueField-3 DPU.

EnterpriseAI