Advanced Computing in the Age of AI | Thursday, March 23, 2023

IBM Project DataWorks: Joining Multi-Sourced Data for AI-based Analytics 

IBM’s aggressive push into the data analytics market continued today with the announcement of Project DataWorks, a Watson initiative that IBM said is the first cloud-based data and analytics platform to integrate all types of data and enable AI-powered decision-making.

Project DataWorks is designed to make it less complex for business managers and data professionals to collect, organize, govern, secure and generate insight from multi-sourced, multi-format data. The goal: become what IBM calls “a cognitive business.” Project DataWorks deploys data products on the IBM Cloud using machine learning and Apache Spark while ingesting data at rates from 50 Gbps to hundreds of Gbps from a variety of endpoints: enterprise databases, Internet of Things, streaming, weather, and social media.

“It’s a system that will on-board data, tools, users, apps, all in a scalable and governed way,” Rob Thomas, VP of Products, IBM Analytics, told EnterpriseTech. “The purpose is simple: we are preparing all data within a company for use by AI. We’re helping people leap into the future around AI and machine learning.”

Project DataWorks is intended to overcome much of the complexity involved in implementing big data analytics. Most of the work involved in large-scale analytics projects is done by data professionals working in silos with disconnected tools and data services that may be difficult to manage, integrate, and govern. IBM said Project DataWorks helps break down barriers by connecting multi-format data. Data professionals can work together on an integrated, self-service platform, sharing common datasets and models for better governance, while iterating on data projects and products and spending less time finding and preparing data for analysis.

IBM's Rob Thomas

Available on Bluemix, IBM’s Cloud platform, Project DataWorks is “built entirely on Open Source,” Thomas said, “so clients can have access to all the innovation that’s in the Open Source community, but not deal with the headaches of trying to integrate those pieces.” IBM said Project DataWorks leverages an open ecosystem of more than 20 partners and technologies, such as Confluent, Continuum Analytics, Galvanize, Alation, NumFOCUS, RStudio and Skymind.

IBM also announced a list of customers using Project DataWorks, including Dimagi, KollaCode LLC, nViso, Quetzal, RSG Media, Runkeeper, and TabTor Math.

RSG Media, which delivers analytical software and services to media and entertainment companies, uses Project DataWorks to perform analytics across large volumes of first- and third-party data sets. These include monitoring cross-platform content and advertising viewership, and identifying individual viewing behaviors while cross-analyzing demographic, lifestyle and social insights. RSG Media helps clients gain insights on audience preferences and develop programming schedules. According to the company, in one scenario, this resulted in a lift of $50 million to a single network’s bottom line.

“We realized that we needed more than just a cloud infrastructure provider. We needed a partner to help us manage data on an unprecedented scale, and empower our clients to turn that data into insight,” said Mukesh Sehgal, founder and chief executive officer, RSG Media. “IBM is the only cloud vendor who offers an integrated set of capabilities for building advanced analytics applications that would allow us to quickly and cost-effectively bring new offerings to market.”

IBM also announced the DataFirst Method, a methodology that helps organizations assess the skills and roadmap they need to progress in their use of data. It includes practices and methods to help clients transform their processes for data discovery, handling and analytics.
