Advanced Computing in the Age of AI | Friday, March 29, 2024

‘Fabric Democratization’ Could Boost Enterprise Interconnects 

New fabric interconnect technology could be “a game changer,” according to market watcher International Data Corp. (IDC), with the potential to break the price/performance mold established by interconnects based on InfiniBand and Ethernet.

IDC Research Vice President Steve Conway fleshed out the emerging interconnect approach during a recent webinar, focusing on Intel’s Omni-Path Architecture (OPA), which has been in development for years and is scheduled to be unveiled during the SC15 conference in mid-November.

Omni-Path has potential implications for the enterprise because advanced fabrics have already been deployed by organizations with challenging big data workloads too complex or too time-critical to be handled by enterprise server technology. Conway said IDC has examined hundreds of large commercial enterprises migrating to extreme scale systems across the retail, healthcare and financial services industries, along with companies that require real-time fraud and anomaly detection.

“High end enterprises are starting to be driven by competitive forces into HPC technology,” Conway said. “The reason for that has a lot to do with the ability to move data around, which is where the interconnects come in.”

Fabric innovation is heating up because systems march to the beat of Moore’s Law. The extreme imbalance between compute power, on one side, and data movement and memory bandwidth, on the other, means it’s not unusual for systems to exploit less than 5 percent of peak performance while processors wait to receive or send data through the I/O bottleneck. Inefficient data transport results in suboptimal system-level reliability, high power costs and low performance relative to system footprint. Tasked with the incredible complexity of handling communications between hundreds or thousands of nodes, interconnects hold out the possibility of bringing some semblance of systems balance.

“Everything in high-end systems, including interconnects, have failed to keep pace with processors,” Conway said. “It’s unsustainable to have processors operating at 5 to 10 percent efficiency forever.”
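
One way to read those efficiency figures is as the share of wall-clock time a processor spends computing rather than waiting on data. The sketch below is a deliberately simplified model, not something IDC or Intel has published; the function name and the sample timings are illustrative assumptions.

```python
def sustained_fraction(compute_seconds: float, wait_seconds: float) -> float:
    """Fraction of wall-clock time spent computing (simplified model).

    Assumes compute and data movement never overlap, which real
    systems only approximate.
    """
    total = compute_seconds + wait_seconds
    if total <= 0:
        raise ValueError("total time must be positive")
    return compute_seconds / total


# Hypothetical timings: 1 second of compute for every 19 seconds stalled on I/O.
baseline = sustained_fraction(compute_seconds=1.0, wait_seconds=19.0)

# Hypothetical faster fabric that halves the time spent waiting.
improved = sustained_fraction(compute_seconds=1.0, wait_seconds=9.5)

print(f"baseline efficiency: {baseline:.1%}")
print(f"with half the wait:  {improved:.1%}")
```

Under these assumed numbers, halving the wait roughly doubles efficiency but still leaves the processor idle most of the time.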

The result is a perfect storm: Data-unfriendly systems have coincided with the rise of what IDC calls high performance data analytics, the growing demand for real-time insight from mammoth datasets.

Best-in-class interconnects generally deliver a 15 percent I/O performance improvement over those based on InfiniBand and Ethernet standards, according to Conway. But they’ve penetrated less than 15 percent of the market due to higher cost, limiting their use to high-budget sites where their capabilities are indispensable.

While Ethernet- and InfiniBand-based vendors are improving fabric performance, system imbalance continues to widen, Conway said. Several best-in-class initiatives are underway around the world at Cray, Fujitsu, NEC, EXTOLL (Germany), Numascale (Norway) and the Atos Bull cloud unit.

Meanwhile, Intel's Omni-Path is aimed not only at HPC but also at the enterprise mid-market. It brings with it the potential to break the price/performance interconnect mold with best-in-class performance at pricing “notably lower” than InfiniBand and Ethernet, according to Conway. Although Intel has yet to provide full details, the company claims Omni-Path will deliver 100 Gbps throughput with a 33 percent latency reduction.

The Omni-Path architecture builds not only on Intel’s True Scale fabric, launched in 2012, but also on interconnect assets acquired from Cray (its Aries IP) and from QLogic in 2012, along with Fulcrum in 2011. Conway noted that 74 Cray interconnect engineers came to Intel at the time of the acquisition.

A major advantage of Omni-Path is integration, expected to become tighter over time, with Intel Xeon processors, the most widely used in the datacenter. CPU-fabric integration means increased scalability with reduced latency, lower power consumption, smaller footprint and lower cost thanks to Intel's manufacturing economies of scale.

Assuming Intel delivers on its promises, Omni-Path still faces market acceptance challenges. They begin with asking InfiniBand- and Ethernet-based customers to rip out their fabric wiring. While vendors consistently reassure customers that “migration will be fairly painless,” Conway added, “that can describe a wide range of pain.”

Conway also noted that Omni-Path was developed in collaboration with the OpenFabrics Alliance, meaning OPA will be compatible with a wide range of InfiniBand applications. Intel has said applications based on the Message Passing Interface (MPI) will port to Omni-Path without changes.

In addition, Intel is recruiting a lineup of server and storage OEMs that will develop versions of their products based on OPA. These include DataDirect Networks (DDN), the most widely used high-end HPC storage vendor, which is seeing healthy growth among financial services, Fortune 100, web and cloud customers. DDN announced an Omni-Path-enabled storage architecture, DDN Omni-Connect, in October.

Another challenge remains for organizations considering adoption of advanced scale computing: motivating OEMs to pass on fabric cost savings to their customers. Whether that happens remains to be seen. The best-case scenario for technology buyers: Omni-Path price points put pricing pressure on other interconnect vendors, said Conway. “What it all amounts to is if Intel can deliver on its claims then this could have a real democratizing impact on the market.”

Currently, an estimated 7 percent of HPC budgets are spent on interconnect technology. Less expensive fabric means budgets would be freed up for more processors. Since high performance interconnects enable better exploitation of processors, overall system performance would improve substantially. Still, system imbalances will persist even with Omni-Path or any other advanced interconnect.
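
The budget argument is simple arithmetic. The sketch below works through it under assumed numbers: only the 7 percent interconnect share comes from IDC's estimate, while the total budget and the size of the price cut are hypothetical placeholders.

```python
def freed_for_compute(total_budget: float,
                      interconnect_share: float,
                      price_cut: float) -> float:
    """Money freed up when interconnect pricing falls by `price_cut`.

    Both `interconnect_share` and `price_cut` are fractions between 0 and 1.
    """
    for name, value in (("interconnect_share", interconnect_share),
                        ("price_cut", price_cut)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    interconnect_spend = total_budget * interconnect_share
    return interconnect_spend * price_cut


TOTAL_BUDGET = 10_000_000    # hypothetical HPC budget, in dollars
INTERCONNECT_SHARE = 0.07    # IDC's estimate cited in the article
PRICE_CUT = 0.25             # hypothetical discount versus current fabrics

freed = freed_for_compute(TOTAL_BUDGET, INTERCONNECT_SHARE, PRICE_CUT)
print(f"freed for processors: ${freed:,.0f} "
      f"({freed / TOTAL_BUDGET:.2%} of the total budget)")
```

Changing `PRICE_CUT` shows how sensitive the freed amount is to the assumed discount.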

“That horse is out of the barn,” Conway noted. “We’re not going to return to a nearly one-to-one balance that we saw in the era of monolithic vector supercomputers. The problem of compute centrism has gotten so extreme, and systems are so out of balance today that the goal with the new interconnects is really to alleviate the problem rather than to eliminate it.”

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).

EnterpriseAI