Advanced Computing in the Age of AI | Saturday, May 18, 2024

IBM Using AI to Monitor Petabytes of Network Traffic 

As networks and applications become more complex, so does the process of unravelling performance problems. And as the complexity grows, even finding common errors in an SD-WAN infrastructure can be a challenge in our virtual-everything world. To help solve this conundrum, IBM launched its SevOne Network Performance Management (NPM) services on Dec. 9 (Thursday).

The new services arrive as network performance management has changed over the years, says Andrew Coward, general manager of software defined networking at IBM.

“There’s a lot more subtlety today than there was historically. Networks have evolved, if you like,” he tells Datanami. “[In the past], if something broke, if a link went down, there was a red light and you knew you had to fix it. Things today are a lot more subtle.”

When your application isn’t working, and there is no red light flashing on the console, what is an enterprising customer to do? If you’re an AWS customer and your application was running (or not running) in the US-East-1 data center on Tuesday, then you probably know why your customers were complaining about stranded Roombas and cat-food dispensers.

Step one on the decision tree is figuring out whether your problem is a network problem or an application problem. However, simply isolating the problem to the network only gets you so far. IBM’s new SevOne NPM offering is designed to get you the rest of the way.

SevOne is largely based on application and network performance management technology that IBM obtained with its acquisition of Turbonomic earlier this year for a reported $1.5 billion to $2 billion. On top of Turbonomic’s core APM and NPM kit, IBM added the capability to enable the software to gather data from other sources, blend it with the network data, and then use Watson AI to automatically spot patterns and anomalies buried within the data.

Spotting network problems was easier before software-defined networking (Image courtesy IBM)

“As we surround applications with our capabilities, we will understand the traffic flow and the performance and what’s normal,” Coward says. “The longer you run the AI within the network, the more you know about what typically happens on a Tuesday afternoon in Seattle.”
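To make the idea of learning "what's normal" concrete, here is a minimal Python sketch of a seasonal baseline. It is an illustration only, not IBM's or Watson's method. The function names, the (weekday, hour) bucketing, and the three-sigma threshold are all assumptions made for this example. A per-site version would add the location to the bucket key.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def bucket(ts: datetime) -> tuple:
    """Hypothetical seasonal slot: (day of week, hour of day)."""
    return (ts.weekday(), ts.hour)


def build_baseline(samples):
    """Build {slot: (mean, stdev)} from (timestamp, value) samples.

    Slots with fewer than two samples are skipped because a
    standard deviation cannot be estimated from one point.
    """
    by_slot = defaultdict(list)
    for ts, value in samples:
        by_slot[bucket(ts)].append(value)
    return {
        slot: (mean(values), stdev(values))
        for slot, values in by_slot.items()
        if len(values) >= 2
    }


def is_anomalous(baseline, ts, value, threshold=3.0):
    """Flag a value that is more than `threshold` standard deviations
    away from the historical mean for the same slot."""
    stats = baseline.get(bucket(ts))
    if stats is None:
        return False  # no history for this slot, so nothing to compare against
    mu, sigma = stats
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold
```

The longer the history, the more samples each slot accumulates, which is the sense in which a baseline of this kind improves the longer it runs.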

A key aspect of SevOne is the ability to take raw network performance data from sources such as SNMP traps, logs in Syslog format, and even packets captured from network taps, combine it in a database, and then generate actionable insights from that blended data.

“The uniqueness of SevOne is really that we put it into a time-series database. So we understand for all those different events, how are they captured [and] we can correlate them,” Coward explains. “That sounds like an extraordinarily simple thing to do. When you’re trying to do that at scale across a wide network where you literally have petabytes of data being created, it creates its own challenge.”
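The correlation step Coward describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not SevOne's implementation. The `Event` record, the `merge_by_time` and `correlate` helpers, and the five-second window are all hypothetical. It assumes each source already delivers its events in time order.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    ts: datetime
    source: str   # e.g. "snmp", "syslog", "packet" (labels assumed here)
    device: str
    message: str


def merge_by_time(*streams):
    """Merge per-source streams, each already sorted by time,
    into a single time-ordered list."""
    return list(heapq.merge(*streams, key=lambda e: e.ts))


def correlate(events, window=timedelta(seconds=5)):
    """Group time-ordered events on the same device when each event
    arrives within `window` of the previous one in its group.
    Only groups that involve more than one source are returned."""
    open_groups = {}   # device -> most recent group for that device
    groups = []
    for event in events:
        group = open_groups.get(event.device)
        if group and event.ts - group[-1].ts <= window:
            group.append(event)
        else:
            group = [event]
            open_groups[event.device] = group
            groups.append(group)
    return [g for g in groups if len({e.source for e in g}) > 1]
```

On a single machine this is simple. The difficulty Coward points to comes from doing the same thing continuously across petabytes of data, where a toy in-memory merge like this one would not be enough.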

The insights generated from SevOne can take the form of dashboards that anyone can view to see if there’s a network problem, thereby eliminating the need to call IT. The AI also helps with providing clarity into management events that can be automated with the software.

The offering is primarily designed for customers with large networks, such as enterprises with significant customer-facing properties on the Web and telecommunication providers, including those rolling out 5G networks, Coward says. Customers will be able to monitor and manage their 5G networks in the same way they monitor and manage their wired Ethernet and WiFi networks, he says.

IBM is targeting 5G deployments of IoT applications with its SevOne NPM (Source: GettyImages)

Network management for general office IT is considered to be a solved problem, Coward says. Bigger problems that demand more rigor could include things like managing the 5G network for intelligent IoT devices, such as smart cameras designed to spot manufacturing flaws. IBM is also working with Boston Dynamics to help monitor and manage the network for its robotic offerings, including a robot dog that can switch between sensing heat in equipment during the day and roving the factory at night to spot fires and possible victims. Those two use cases have different network demands.

“The excitement for us is for 5G to get deployed for enterprises in a meaningful way,” Coward says. “There’s always the debate: Can you do things with compute to save bandwidth, or just throw bandwidth at the problem? With 5G [the conventional wisdom is that] we’re just throwing bandwidth at the problem, so we don’t need any of this technology anymore. That’s not really true.”



About the author: Alex Woodie

Alex Woodie has written about IT as a technology journalist for more than a decade. He brings extensive experience from the IBM midrange marketplace, including topics such as servers, ERP applications, programming, databases, security, high availability, storage, business intelligence, cloud, and mobile enablement. He resides in the San Diego area.