Advanced Computing in the Age of AI | Friday, June 9, 2023

Machine Learning and Online Security in 2017 

(Pasko Maksim/Shutterstock)

As companies increase their digital footprints, ‘identify and diagnose’ capabilities will not defend against the growing array of security threats, according to analysts at Gartner Group. Because the types of data ingested by analytics packages are evolving from structured to hybrid data (containing text, objects and other formats), the market will respond to that transition by offering packaged applications that use more powerful predictive and prescriptive analytics.

Machine learning (ML) and artificial intelligence (AI), terms I use interchangeably here, continue to be hotly debated in security circles. The pessimists believe hackers will always outmaneuver ML, while the believers view AI as an essential companion for finding and displaying threat patterns in a complex, cloud-enhanced IT environment. While both sides have merit, the market itself is moving ahead with real-life ML applications in 2017.

Here are three of the most powerful.

Data Curation

When the International Institute for Analytics (IIA) released its predictions for analytics trends in 2016, data curation was included as an important tool in the management of data throughout its lifecycle. According to the IIA, rather than managing data in a centralized, top-down way, “the new data analytic tools work from the bottom up, leveraging machine learning to curate and clean the data.” As data volumes and sources continue to grow, cleaning data for threat analysis will become a costly and time-consuming process. Data curation not only makes data readily retrievable for future research or reuse, it also ensures enterprise compliance.

Behavioral Biometrics

Every time we turn on our PCs, our keystrokes, mouse movements and web browsing habits act as digital fingerprints. In data-sensitive sectors such as healthcare, law and financial services (which spends approximately $80 billion a year on fraud prevention), behavioral biometrics can be used to analyze end-user fingerprints to identify who’s on the other end and what they’re trying to do.

Combining behavioral biometrics and ML can significantly cut costs. ML algorithms identify critical fraud-related data points while learning the signifiers of good users, reducing false-positive results. Because behavioral biometrics is a collection of data, instead of a single data point, it is extremely difficult for fraudsters to mimic. Behavioral biometrics also makes it easy to distinguish human from non-human behavior. Heavily repeated patterns, such as multiple “users” all acting in exactly the same way, are a conspicuous red flag of a programmed threat.
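
The repeated-pattern red flag can be illustrated with a minimal sketch. It assumes each session has already been reduced to a list of inter-keystroke intervals in milliseconds. The function name `flag_cloned_sessions`, the bucket size and the `min_users` cutoff are hypothetical choices for illustration. This is a simple duplicate-signature check, not a trained ML model.

```python
from collections import defaultdict


def flag_cloned_sessions(sessions, bucket_ms=10, min_users=3):
    """Return sets of users whose timing patterns are near-identical.

    sessions: dict mapping a user id to a list of inter-keystroke
    intervals in milliseconds (an assumed, simplified feature).
    """
    groups = defaultdict(set)
    for user, intervals in sessions.items():
        # Round each interval into a coarse bucket so trivial jitter
        # does not hide an otherwise identical scripted pattern.
        signature = tuple(round(t / bucket_ms) for t in intervals)
        groups[signature].add(user)
    # Many distinct users sharing one signature suggests automation.
    return [users for users in groups.values() if len(users) >= min_users]


# Made-up sample data: two humans and three scripted "users".
sessions = {
    "alice": [112, 340, 95, 210],
    "bob": [180, 75, 400, 130],
    "bot-1": [100, 100, 100, 100],
    "bot-2": [101, 99, 102, 100],
    "bot-3": [100, 101, 100, 98],
}
suspicious = flag_cloned_sessions(sessions)
```

In this toy data only the three scripted accounts share a signature, so they are the only group returned. A real deployment would combine many more signals than keystroke timing.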

Sift, Sort and Secure Sprawled Data

CSOs continually track how files are shared and with whom, auditing what gets used and what to archive. Because content is so sprawled across networks (one department might use Box, another a custom integration of SharePoint, and a third might operate purely in-house), audits take an inordinate amount of effort. In the past, when data was more structured and the scope was limited, it was easy to define the rules. But with today’s massive amounts of unstructured data from multiple sources, rule writing is simply no longer practical or effective. ML can help ‘discover’ the rules by noticing patterns in how data is used.
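
One way to picture this kind of rule discovery is a frequency baseline learned from access logs. The sketch below assumes a log of (department, storage location) pairs. The location strings, the sample history and the 5% rarity cutoff are all hypothetical, and a production system would use a far richer model.

```python
from collections import Counter


def learn_usage_baseline(events):
    """Count how often each department touches each storage location."""
    pair_counts = Counter(events)
    dept_totals = Counter(dept for dept, _ in events)
    return pair_counts, dept_totals


def is_unusual(event, pair_counts, dept_totals, min_share=0.05):
    """Flag an access that is rare for its department in the observed history."""
    dept, _ = event
    total = dept_totals.get(dept, 0)
    if total == 0:
        return True  # no history for this department at all
    return pair_counts.get(event, 0) / total < min_share


# Made-up access history standing in for real audit logs.
history = (
    [("finance", "box:/budgets")] * 40
    + [("finance", "sharepoint:/reports")] * 9
    + [("finance", "inhouse:/hr-records")] * 1
    + [("hr", "inhouse:/hr-records")] * 30
)
pair_counts, dept_totals = learn_usage_baseline(history)
```

Here nobody wrote a rule saying finance should stay out of HR records. The baseline surfaces that access as unusual simply because it is rare in the observed history.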

When dealing with unstructured, structured and semi-structured data inputs, ML is a convenient, scalable and cost-effective solution that eliminates the manual task of repeatedly classifying and tweaking rules. ML provides the ability to measure effectiveness and improve it by scientifically changing algorithms or algorithm parameters, making it easier to iterate and get better predictions than a rules-based system.
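
The measure-and-iterate loop described above can be sketched as a simple parameter sweep. The example assumes some model has already produced a risk score per event and that a small labeled set exists. The scores, labels and candidate thresholds are made up for illustration, and F1 is just one of several metrics a team might choose.

```python
def precision_recall(scores, labels, threshold):
    """Compute precision and recall when flagging scores >= threshold."""
    flagged = [s >= threshold for s in scores]
    tp = sum(f and y for f, y in zip(flagged, labels))
    fp = sum(f and not y for f, y in zip(flagged, labels))
    fn = sum((not f) and y for f, y in zip(flagged, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest F1 score."""
    def f1(threshold):
        p, r = precision_recall(scores, labels, threshold)
        return 2 * p * r / (p + r) if p + r else 0.0
    return max(candidates, key=f1)


# Hypothetical risk scores and ground-truth labels (True = real threat).
scores = [0.05, 0.20, 0.35, 0.55, 0.60, 0.80, 0.90, 0.95]
labels = [False, False, False, True, False, True, True, True]
chosen = best_threshold(scores, labels, [0.3, 0.5, 0.7])
```

The point is that each candidate setting gets a measurable score, so changing a parameter becomes an experiment rather than a guess.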

There are a number of ways to prepare for a successful data-driven security strategy. These include looking at data’s relationship to company problems and opportunities, and working with CIOs to prioritize IT requirements. Another is to utilize hypothesis-led modeling to generate faster outcomes and develop more practical data relationships that are understood by managers. Finally, developing business-relevant analytics that complement existing decision processes allows managers to coordinate their actions with broader company goals.

Kris Lahiri is chief security officer at Egnyte.