Advanced Computing in the Age of AI | Tuesday, May 21, 2024

As Object Storage Booms, Analytics Issues Emerge 


With object-based storage capacity predicted to grow at double-digit annual rates over the next several years, attention is turning to shortcomings such as barriers to data visibility and difficulties in performing analytics using object storage.

A new vendor survey finds that “object storage has gone mainstream,” with 72 percent of respondents using Amazon Web Services’ Simple Storage Service (S3). As AWS object storage explodes, use cases are shifting toward analytics and data lakes.

Nevertheless, Chaos Sumo, an object storage startup focused on cloud-based analytics and log data retention services, found that the transition to object storage creates another set of problems, including inconsistent predictive analytics. Overall, the vendor found that analytics and visibility barriers are critical concerns for those planning to use object storage as a platform for business analytics.

While the survey found that the majority of S3 customers use it as “a cheap alternative to on-premises storage” for backup and archiving data, object storage is also widely used for application and media hosting, and 32 percent of respondents said they use it for business analytics.

Greater adoption of object storage for data lakes and expanding use cases have uncovered shortfalls, such as visibility into stored data, consistent analytics performance and the growing cost of moving large data volumes. Just over one-quarter of respondents said moving data in order to analyze it was their biggest challenge in managing S3 object storage.

The “increasing costs of storing data for real- or near-time analysis is the core impediment to doing more with the growing amount of data stored in object storage,” said Thomas Hazel, founder and CTO of Chaos Sumo, Somerville, Mass.

Hence, the savings provided by object storage platforms like S3 may be offset by rising computing and networking costs, according to Chaos Sumo’s survey of more than 120 data scientists, analytics and DevOps/IT managers released this week.
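The offset described above comes down to simple arithmetic, sketched below in Python. The function names and every price in the example are hypothetical placeholders chosen for illustration; they are not figures from the survey or from any vendor's price list.

```python
def monthly_cost(stored_gb, egress_gb, storage_price, egress_price, compute_cost=0.0):
    """Total monthly bill: storage, plus data moved out for analysis, plus compute.

    Prices are per GB per month (storage) and per GB transferred (egress).
    """
    return stored_gb * storage_price + egress_gb * egress_price + compute_cost


def breakeven_egress_gb(stored_gb, object_price, alternative_price, egress_price):
    """Egress volume at which transfer charges cancel out the storage savings."""
    savings = stored_gb * (alternative_price - object_price)
    return savings / egress_price


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT real pricing.
    stored_gb = 10_000
    object_price = 0.02        # assumed object storage price per GB-month
    alternative_price = 0.05   # assumed price of the storage being replaced
    egress_price = 0.10        # assumed charge per GB moved out for analysis

    limit = breakeven_egress_gb(stored_gb, object_price, alternative_price, egress_price)
    print(f"Storage savings are used up after about {limit:.0f} GB of egress per month")
    print(f"Bill at that point: {monthly_cost(stored_gb, limit, object_price, egress_price):.2f}")
```

Under these assumed numbers, any team that moves more data out for analysis each month than the break-even volume would pay more than it saved on storage, before compute costs are even counted.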

While about half of those surveyed said they are using Amazon Redshift data warehouse management along with S3, 42 percent said they are using home-grown tools to overcome data analytics and visibility issues. “These tools are not only inadequate at addressing the jobs needed to be done, they also take a lot of time to set up and manage,” the survey found. For example, 52 percent of respondents said it took them more than three months to build their current analytics architecture.

The survey, conducted between December 2017 and January 2018, also found that one-third of those polled are using object storage to streamline their data lakes for applications like machine learning and historical trend analysis.

The shift to object storage also has become the latest front in the ongoing public cloud price wars as AWS (NASDAQ: AMZN) seeks to maintain its sizeable market share lead. Meanwhile, chief competitors Microsoft (NASDAQ: MSFT), Google (NASDAQ: GOOGL) and IBM (NYSE: IBM) look to differentiate their services. For example, market tracker 451 Research found last year that leading public cloud vendors were cutting their object storage prices in order to compete with AWS.

Last year, Boston-based Wasabi launched AWS-compatible object storage technology that it says costs 80 percent less than S3 while performing at 6X the speed. Last month, Wasabi took on the data movement issue by announcing a pricing plan with unlimited free egress, eliminating all charges except the basic charge for actual storage, according to the company.

"Egress charges have been one of the biggest inhibitors to enterprises moving their data to the cloud," the company said in a prepared announcement. "Customers dislike egress fees because they make it impossible to accurately predict how much they are going to spend, and inevitably creates vendor lock-in... Wasabi’s vision is to make cloud storage a simple one-size-fits-all utility, like electricity or bandwidth, and that billing should be as simple and transparent as possible."

With cloud-based data analytics applications growing along with adoption of data lakes, market analysts and vendors such as VMware (NYSE: VMW) note that databases have emerged as a top cloud workload. Relational databases are expected to be the “next competitive front” in the ongoing public cloud price wars.

Hence, tools for real-time analytics beyond Redshift and Amazon Athena will be needed, a market that startups such as Chaos Sumo are now targeting.

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).