IBM Extends Spark Push With Analytics Tools
IBM is expanding its Bluemix cloud development platform with a heavy dose of data analytics tools designed to integrate data services into a growing number of hybrid cloud deployments.
The new services, which include an open source graph database service, also reflect IBM's investment in other open source analytics tools such as Apache Spark.
Along with new graph database and predictive analytics services, IBM (NYSE: IBM) on Thursday (Feb. 4) also rolled out a platform called Compose Enterprise. The managed platform targets development teams building web-scale distributed applications and allows them to deploy open-source databases on dedicated cloud servers.
Citing the growing role of data and analytics in enterprises, IBM said its cloud data and analytics marketplace is intended as a "one-stop shop" to give data scientists and application developers access to more than 25 cloud data services and 150 public datasets. The dataset catalog is part of a separate "analytics exchange" also released on Bluemix. The company said the datasets can be used separately for analytics or integrated into business enterprise applications.
Meanwhile, the new graph database service is built on Apache TinkerPop, the open source graph-computing framework targeting both graph databases and graph analytics. IBM's twist is providing Bluemix users with a full stack to develop analytics applications like network monitoring, fraud detection, real-time insights and Internet of Things deployments.
The graph service is intended to make it easier to move data from existing databases to graph architectures. IBM is credited with moving the TinkerPop stack into the open source community.
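To illustrate why use cases such as fraud detection map naturally onto a graph model, here is a minimal sketch in plain Python. It does not use TinkerPop, Gremlin or any IBM API; the account, device and card identifiers are hypothetical, and the breadth-first search stands in for the kind of traversal a graph database would run natively.

```python
from collections import defaultdict, deque

# Hypothetical data: each edge links an account to a device or card it has used.
EDGES = [
    ("acct:alice", "device:d1"),
    ("acct:bob", "device:d1"),
    ("acct:bob", "card:c9"),
    ("acct:carol", "card:c9"),
    ("acct:dave", "device:d2"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def linked_accounts(graph, start, max_hops):
    """Return accounts reachable from `start` within `max_hops` edges (BFS)."""
    seen = {start}
    queue = deque([(start, 0)])
    found = set()
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
                if nxt.startswith("acct:"):
                    found.add(nxt)
    return found


GRAPH = build_graph(EDGES)
```

Under these assumptions, `linked_accounts(GRAPH, "acct:alice", 2)` finds the account that shares a device with the starting account, while raising `max_hops` to 4 also surfaces an account connected only through a shared card. In a relational schema the same question typically requires a chain of self-joins.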
The new cloud-based predictive analytics service would enable developers to select from a library of machine-learning models for use in analytics applications. Accelerating the shift toward self-service analytics in the enterprise, the marketplace also seeks to make it easier to leverage predictive analytics for specific use cases without the help of data scientists.
While allowing developers to build and deploy web- and cloud-based applications, IBM stressed the new cloud data services also would allow data scientists to dig deeper to "apply trusted information across businesses…."
The company noted that its new cloud data and analytics offerings build on its investment in Apache Spark. IBM has so far used Spark to redesign more than 15 analytics and e-commerce platforms as it extends its reach into the Apache Spark ecosystem.
For example, IBM's machine learning technology, SystemML, originally developed for its BigInsights data analytics platform, was accepted last November as an Apache Incubator open source project. SystemML is a machine learning algorithm translator designed to help developers build machine-learning models used for predictive analytics across a range of industries.
The open-source version of SystemML is intended to help data scientists move their algorithms into production environments without rewriting the entire code base. That, the company claimed, makes it possible to scale data analysis from a laptop to large clusters.