Advanced Computing in the Age of AI | Saturday, January 29, 2022

Data Surveillance: Because “Data is Where the Money Is” 

Infamous bank robber Willie Sutton (Source:

Infamous felon Willie Sutton robbed his first bank in 1927 and went on to make illegitimate withdrawals of millions of dollars from dozens of banks over two decades. Sutton preferred deception to firepower, infiltrating security details in a myriad of disguises: dressed as a guard, a police officer, a messenger, or a window washer. He didn't bother with teller-line tills. He went for the big money in the safes. When asked by a reporter why he robbed banks, he reportedly replied, "Because that’s where the money is."

Data intruders are today’s Willie Suttons. Like him, they are not interested in fleecing a single individual; they want to score big and pilfer large organizations' data repositories. In today's economy, data is the new gold, a highly valued and much-sought-after commodity.

But Companies Fail to Protect Their Data

Ironically, many institutions, companies, and organizations still relegate IT to a secondary back-office function, and investment in and attention to data security are relegated along with it. All the while, a high-profile chief financial officer manages the complex fiscal functions and assets. Imagine that. This is like a bank storing its loose change in high-security vaults but entrusting its real assets to the tellers with nothing but a faulty panic button for protection.

Nowhere is this problem more evident than in high-profile breaches at the likes of Target, several Blue Cross insurers, Sony Pictures, and the Internal Revenue Service. And breaches continue to happen even though there are many companies – such as FireEye, Palo Alto Networks and Check Point – dedicated to cybersecurity.

Meanwhile, many enterprises still cannot answer the basic board-level question: Has our data been stolen? And even those who can answer in the affirmative are often stumped by the next question: What are we doing to protect our data?

What Can Be Done?

"Data surveillance" could be the future of mitigating and managing data breaches. This means gaining full visibility into every session to recognize real threats the moment they occur. Think of it like the LoJack stolen vehicle recovery system. Before LoJack, the owner of a car did not know it had been stolen until it was already gone. With LoJack, police can track the vehicle, with a 90 percent recovery rate. Data surveillance does the same for all the data in the sessions.

Data breaches have progressed over the past two decades, evolving from data leaks (unintentional misuse or mistakenly losing track of data) to data floods (intentional, egregious, wide-scale data pilfering and looting). In short, the environment has gone from data compromise to wholesale data theft of 100 terabytes or more.

So why is data theft becoming so pervasive? One reason is that organizations use technology designed only to repair data leaks in an attempt to combat these data floods. One example is digital rights management (DRM), which is clunky, complicated, and expensive. Another is data leak prevention (DLP), a technology introduced in the early 2000s and designed for the much simpler problems of that time. It essentially uses static content analysis to identify potential mishaps (whether intentional or not), such as clicking on the wrong confidential document. In short, DLP was not meant to protect against targeted data theft.
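
To make that limitation concrete, here is a minimal Python sketch of static content analysis. The rule names, the `scan_static` function, and the sample record are hypothetical illustrations, not any vendor's actual DLP rules. The scanner catches sensitive text sent as-is, but misses the same text once it has been Base64-encoded, a trivial step for anyone stealing data on purpose.

```python
import base64
import re

# Hypothetical static rules: fixed patterns matched against raw content.
STATIC_RULES = {
    "confidential_marker": re.compile(r"\bCONFIDENTIAL\b", re.IGNORECASE),
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_static(content: str) -> list[str]:
    """Return the sorted names of every static rule that matches the content."""
    return sorted(
        name for name, pattern in STATIC_RULES.items() if pattern.search(content)
    )


# An accidental leak: the sensitive text goes out unmodified.
accidental = "CONFIDENTIAL payroll record, SSN 123-45-6789"

# A deliberate theft: the same text is encoded before it leaves.
deliberate = base64.b64encode(accidental.encode("utf-8")).decode("ascii")

print(scan_static(accidental))  # both rules fire
print(scan_static(deliberate))  # no rule fires
```

The content is identical in both cases; only its surface form changed. Rules tied to surface form can catch mistakes, but they have little to say about an adversary who controls the encoding.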

Santa Cruz, Calif.-based startup Flying Cloud is taking a different approach. It protects enterprise data through the application of enterprise analytics oversight to all data traversing a network, Founder and CEO Brian Christian told me. "Companies need to be able to recognize what 'normal' looks like in their enterprise environments so they can recognize real threats when they see them," Christian said. "You can't recognize unusual deviant activity until you understand what your environment's established day-to-day activities are. Once you've grasped that, anything different immediately stands out."

Understanding Current Approaches

Let’s step back and examine the current environment of enterprise security. Keeping data secure means keeping pace with the rapid evolution of big data, machine learning, virtualization, elastic computing, and the cloud.

The ideal data security ecosystem essentially consists of the following:

  1. Gateway: Application-based management policy and unified threat defense
  2. Network: Sandboxing and dynamic behavioral analysis
  3. Endpoints: Advanced machine learning for the endpoint
  4. Logging: Event and intelligence automation and collaboration
  5. Data: Advanced machine learning and data surveillance systems

At the outset, an organization needs to know its log files are correct and complete. The very first thing an intruder does when it compromises a remote host is cover its tracks. That is, the intruder modifies log files to remove any entry of its presence and suppresses future notices of itself from the system.

Second, and this is especially true in today’s world of advanced malware, the intrusion may not even be visible to security staff. Advanced malware often scrapes memory segments and uses interprocess communication (IPC) hooks, then performs operations that never show up in log files.

In fact, advanced malware that uses steganography (hiding malicious instructions or data inside a seemingly benign file) may never appear until a key (for example, a file or picture from social media) arrives to trigger it. Each part may seem benign until all the parts are assembled; assembly then activates the malicious software, which is how malware bypasses an organization's sandbox. You cannot simply try to contain malware as if it’s a security vendor versus hacker game; the enterprise will inevitably lose the data security chess match.
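
To show why hidden content is so hard to spot, here is a minimal Python sketch of least-significant-bit (LSB) steganography, one well-known hiding technique. The article does not say which technique any particular malware uses, so treat this purely as an illustration. The `embed` and `extract` functions, the stand-in carrier bytes, and the sample payload are all hypothetical.

```python
def embed(carrier: bytes, payload: bytes) -> bytes:
    """Hide each payload bit in the lowest bit of successive carrier bytes."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for payload")
    stego = bytearray(carrier)
    for index, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[index] = (stego[index] & 0xFE) | bit
    return bytes(stego)


def extract(stego: bytes, payload_length: int) -> bytes:
    """Rebuild payload_length bytes from the lowest bits of the carrier."""
    out = bytearray()
    for start in range(0, payload_length * 8, 8):
        value = 0
        for byte in stego[start:start + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


carrier = bytes(range(256))  # stand-in for raw image pixel values
secret = b"hidden note"

stego = embed(carrier, secret)
recovered = extract(stego, len(secret))

# No carrier byte moves by more than one unit of brightness.
max_change = max(abs(a - b) for a, b in zip(carrier, stego))
print(recovered, max_change)
```

Every carrier byte changes by at most one, so the file looks essentially the same to a viewer and to a scanner that checks only file type and surface content. The payload becomes meaningful only to code that knows where to look.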

Furthermore, there is the inherent 'unreality factor.' Simply put, the real world is full of day-to-day datacenter mishaps and incidents. The logging service could die on the server. A router or switch could go down, preventing IT from receiving logs in a timely manner. The organization may be unaware that it is not getting a complete log file. And how does it know the log file hasn’t been modified, either on the server or in transit to the logging server? Trying to quantify a breach or gather intelligence is extremely difficult when a company cannot be certain it is seeing the whole picture. Again, these are all real-world, real-time datacenter issues that can only be resolved if caught in time.
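
One common way to tell whether a log has been altered is to chain a keyed hash through its entries. The Python sketch below is offered only as an illustration of that general idea; the article does not describe this technique. The `seal` and `verify` functions, the demo key, and the sample entries are hypothetical, and the sketch assumes the key is held somewhere the intruder cannot reach.

```python
import hashlib
import hmac

# Hypothetical key; in practice it must live off the monitored host.
SECRET_KEY = b"demo-key-held-by-the-log-collector"


def seal(entries: list[str], key: bytes = SECRET_KEY) -> list[tuple[int, str, str]]:
    """Attach a sequence number and a chained HMAC tag to each log entry."""
    sealed = []
    previous_tag = ""
    for sequence, message in enumerate(entries):
        material = f"{sequence}|{previous_tag}|{message}".encode("utf-8")
        tag = hmac.new(key, material, hashlib.sha256).hexdigest()
        sealed.append((sequence, message, tag))
        previous_tag = tag
    return sealed


def verify(sealed: list[tuple[int, str, str]], key: bytes = SECRET_KEY) -> bool:
    """Return False if an entry was edited, reordered, or cut from the middle.

    Entries trimmed from the end are NOT detected by this sketch; that needs
    a separately stored entry count or final tag.
    """
    previous_tag = ""
    for expected_sequence, (sequence, message, tag) in enumerate(sealed):
        material = f"{sequence}|{previous_tag}|{message}".encode("utf-8")
        expected_tag = hmac.new(key, material, hashlib.sha256).hexdigest()
        if sequence != expected_sequence or not hmac.compare_digest(tag, expected_tag):
            return False
        previous_tag = tag
    return True


log = seal(["login alice", "read payroll.db", "logout alice"])

# An intruder deletes the entry that records the access.
scrubbed = [log[0], log[2]]

# An intruder rewrites the entry but cannot recompute its tag without the key.
edited = [log[0], (1, "read notes.txt", log[1][2]), log[2]]

print(verify(log), verify(scrubbed), verify(edited))
```

Because each tag depends on the one before it, removing or rewriting one entry breaks every check from that point on. The scheme does nothing about a logging service that silently dies, so a complete picture still requires watching for logs that stop arriving.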

The Solution: Data Surveillance

Data protection is more than just encryption. Encryption is fine when you know what to encrypt. But with massive amounts of data floating around an organization, a company may have a rough idea of what’s important and still not know who is accessing the data or why.

This brings us back to data surveillance. It is not enough to hire guards, build a wall around your data, and attempt to identify and fend off intruders (in other words, malware). The data itself must be tracked and monitored to greatly lower the probability of data theft and potentially eliminate it.

Without data surveillance, an organization is not getting a complete snapshot. How can it identify an anomaly when it has no idea what normal is? In today’s dynamic networks, with enterprises embracing BYOD and mobile devices alongside the Internet of Things, today's 'normal' baseline will be different tomorrow.

Flying Cloud goes a step further with data surveillance, essentially tracking and monitoring the data itself. In addition to looking at logs, the startup applies advanced, unsupervised machine learning to packet payloads, looking for statistical variations in what is coming in and out of your network. Flying Cloud protects an enterprise’s data with full visibility into every session and packet. It provides what it calls a Rolling Baseline of what normal looks like, with big-data-scale storage and analytics performance, so you can recognize real threats when you see them.
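
The general idea of a baseline that moves with the network can be sketched in a few lines of Python. This is not Flying Cloud's algorithm, which the article does not describe; it is a simple rolling z-score stand-in. The `RollingBaseline` class, its parameters, and the traffic figures are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class RollingBaseline:
    """Flag values that sit far outside the recent window of observations."""

    def __init__(self, window: int = 20, threshold: float = 3.0, warmup: int = 5):
        self.history = deque(maxlen=window)  # oldest values fall off automatically
        self.threshold = threshold           # allowed distance, in standard deviations
        self.warmup = warmup                 # observations needed before judging

    def observe(self, value: float) -> bool:
        """Return True if value is anomalous, then fold it into the window."""
        anomalous = False
        if len(self.history) >= self.warmup:
            center = mean(self.history)
            spread = pstdev(self.history)
            # Guard against a zero spread when recent traffic is perfectly flat.
            anomalous = abs(value - center) > self.threshold * max(spread, 1e-9)
        self.history.append(value)
        return anomalous


# Hypothetical outbound megabytes per session; one session is a bulk transfer.
traffic = [100, 104, 98, 101, 97, 103, 99, 102, 5000, 100]

baseline = RollingBaseline()
flagged = [index for index, mb in enumerate(traffic) if baseline.observe(mb)]
print(flagged)  # [8]
```

Because every observation enters the window, the definition of normal drifts as the network changes. The cost of that choice is that a large anomaly temporarily widens the baseline; a real system would need a policy for which observations are allowed to update it.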

"Cyber security is more than just chasing malware," said Ben Woo, managing director of Neuralytix, a global IT market research and consulting firm. "Being able to watch and understand the data is imperative to a holistic and healthy security program."

About the Author

R. Scott Pearson is currently a Business Development executive with Cray and also serves on the program committee for the Workshop on Big Data Benchmarking. He was formerly the Director of Big Data and HPC Solutions at Brocade and has been involved in the open source and HPC communities for 20 years. Follow him on Twitter @BigDataScotty or connect on LinkedIn.

About the author: Alison Diana

Managing editor of Enterprise Technology. I've been covering tech and business for many years, for publications such as InformationWeek, Baseline Magazine, and Florida Today. A native Brit and longtime Yankees fan, I live with my husband, daughter, and two cats on the Space Coast in Florida.

One Response to Data Surveillance: Because “Data is Where the Money Is”

  1. HPCBrad says:

    Ignorance is bliss, right? Knowing what normal looks like is a natural first step.
