IBM Watson Targets AI’s Data Snags
IBM continues to ramp up its Watson AI offerings as adoption of automation technology remains limited, in part due to what company executives say is a lack of data integrity, trust and the skills needed to implement AI platforms.
With boosting enterprise adoption in mind, IBM executives pitched a batch of Watson implementation tools built around open-source machine learning frameworks, pre-built applications or the ability to create new ones. Others would allow users to embed AI into third-party platforms. Long term, the company is betting that ease-of-use will accelerate adoption of AI technology, which according to optimistic estimates will add upwards of $16 trillion in global GDP by 2030.
Noting the slow corporate uptake of AI—adoption estimates range from 4 to 10 percent of companies surveyed—IBM officials speaking at a company event in Miami on Tuesday (Oct. 22) said data for model training and analytics remains critical. “Your AI is only as good as your data,” said Rob Thomas, general manager of IBM’s Data and AI unit.
(While adoption rates are being slowed by operational woes, IBM executives also note that current estimates don’t reflect the growing number of data science projects within most enterprises.)
Meanwhile, IBM and others are striving to provide the necessary infrastructure that cleanses, preps and makes data accessible while reducing its movement within hybrid deployments. “There’s no AI without IA,” Thomas added, citing a frequent reference to the need for agile information architectures built on sound data sources.
As requirements for training data continue to grow, IBM’s IA push centers on reducing data movement, as illustrated by its Cloud Pak hybrid data offering. The data analytics platform is billed as a way to link large data stores to AI applications, pulling from different databases and using virtualization.
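The core idea behind data virtualization—querying separate stores through one interface instead of copying records between them—can be sketched in a few lines. This is an illustration only, assuming nothing about Cloud Pak’s actual implementation; SQLite’s ATTACH mechanism stands in here for a federation layer:

```python
import sqlite3

# Two logical stores: the "main" database and an attached "warehouse"
# database stand in for disparate data sources.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH ':memory:' AS warehouse")

conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE warehouse.orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
conn.executemany("INSERT INTO warehouse.orders VALUES (?, ?)",
                 [(1, 250.0), (1, 100.0), (2, 75.0)])

# One query spans both stores; no rows are extracted or duplicated
# into an intermediate copy before analysis.
rows = conn.execute(
    "SELECT c.name, SUM(o.amount) FROM customers c "
    "JOIN warehouse.orders o ON o.customer_id = c.id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()
```

The payoff is the same one IBM pitches: the analytics query goes to where the data lives, rather than the data moving to the query.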
As it completes the integration of its Red Hat acquisition, IBM also announced this week that Cloud Pak runs on OpenShift, Red Hat’s Kubernetes-based container orchestration platform.
Those and other efforts are aimed at making data more accessible to developers struggling to move AI apps to production. IBM executives note that enterprise AI efforts often fail because many data scientists and application developers have it backwards. “Bring AI to your data, not the other way around,” Daniel Hernandez, an IBM vice president for data and AI, said in an interview.
Overall, IBM’s renewed Watson initiative aims to automate the very processes required to develop enterprise automation applications. For example, Cloud Pak includes an AutoAI feature designed to automate tedious but essential machine learning tasks such as data preparation, model selection and optimization.
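Automated model selection of the kind AutoAI offers can be reduced to a simple loop: fit each candidate, score it on held-out data, keep the winner. The toy "models" below are hypothetical stand-ins, not IBM’s API—a real pipeline would also automate data preparation and hyperparameter optimization:

```python
# Minimal sketch of automated model selection. Each candidate is a
# builder function that fits on training targets and returns a
# predictor; the best candidate is the one with the lowest
# mean squared error on held-out data.

def mean_model(train_y):
    avg = sum(train_y) / len(train_y)
    return lambda x: avg  # always predicts the training mean

def last_value_model(train_y):
    last = train_y[-1]
    return lambda x: last  # always predicts the last observed value

def select_best(candidates, train_y, test_x, test_y):
    best_name, best_err = None, float("inf")
    for name, build in candidates.items():
        model = build(train_y)
        err = sum((model(x) - y) ** 2
                  for x, y in zip(test_x, test_y)) / len(test_y)
        if err < best_err:
            best_name, best_err = name, err
    return best_name, best_err

candidates = {"mean": mean_model, "last": last_value_model}
best, err = select_best(candidates, train_y=[1.0, 2.0, 3.0, 4.0],
                        test_x=[0, 0], test_y=[4.0, 4.0])
```

The "tedious but essential" part the article describes is exactly this loop—run at scale, over many model families and preprocessing choices, so a human doesn’t have to.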
AI adoption has lagged in part due to the inability to operationalize models and move them to production. Among the reasons is a lack of clean, labeled or “shaped” data, said Rohan Vaidyanathan, IBM’s program director for its Watson OpenScale initiative. That effort seeks to scale AI adoption through new capabilities such as “drift detection” that alerts developers when models stray from a developer-defined threshold.
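Drift detection as described—alerting when a model strays past a developer-defined threshold—can be sketched with a simple comparison of prediction statistics. This is a simplified stand-in, not Watson OpenScale’s method; a production monitor would use proper statistical tests over feature and prediction distributions:

```python
import statistics

def detect_drift(baseline_scores, live_scores, threshold=0.1):
    """Flag drift when mean prediction confidence in production
    diverges from the training-time baseline by more than the
    developer-defined threshold."""
    baseline_mean = statistics.mean(baseline_scores)
    live_mean = statistics.mean(live_scores)
    drift = abs(live_mean - baseline_mean)
    return drift > threshold, drift

# A model that averaged ~0.9 confidence at training time now
# averages ~0.7 on live traffic, so the alert fires.
drifted, amount = detect_drift(baseline_scores=[0.88, 0.92, 0.90],
                               live_scores=[0.71, 0.69, 0.70])
```

The key design point is the same as in the article: the threshold belongs to the developer, so what counts as "straying" is a deployment decision, not a fixed library constant.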
The goal, Vaidyanathan said, is eliminating model bias while ensuring data integrity and, ultimately, “labeled ground truth.”
Thomas, IBM’s AI general manager, stressed the massive data preparation and discovery steps that stymie many projects. Hence, IBM is embracing a plug-and-play approach to AI development.
“We see greater adoption rates when people can just consume AI, and it doesn’t put the burden on them to figure out the data or to figure out the skills,” Thomas said. “That really helps adoption when you make it that simple for people to use.”