Advanced Computing in the Age of AI | Thursday, March 28, 2024

MapR Rolls Hadoop Migration ‘Easy Button’ 

(Source: Shutterstock/Ton Snoei)

MapR Technologies Inc. rolled out a migration service for its Hadoop distribution that targets what it says is growing demand for moving big data production installations to its converged data platform.

San Jose-based MapR said this week its quick migration service keeps an existing Hadoop distribution running while users transition to its converged data platform. The migration service also seeks to leverage a new appliance announced by Cisco Systems (NASDAQ: CSCO) for SAP HANA (NYSE: SAP) that integrates the MapR converged platform. Cisco’s UCS Integrated Infrastructure for SAP HANA, which incorporates the MapR platform, is based on the B460 scale-out platform with C240 storage servers.

MapR said the migration service essentially creates an "easy button" for moving to the MapR Hadoop distribution as enterprises shift mission-critical applications to converged platforms. The service is touted as a way for enterprises to move their data with minimal impact on real-time applications and workflows.

The database specialist also pointed to several use cases involving an unnamed financial institution and a government agency in which competing Hadoop distributions were shifted to its converged data platform. The bank is now running multiple real-time applications on fewer nodes while the government agency is using the MapR platform for real-time mirroring across distributed datacenters, the company said.

The data platform cluster is installed and configured based on a user's existing IT infrastructure. Use cases and data are then moved to the MapR cluster while the existing Hadoop distribution continues to operate, preventing downtime or data loss.
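As a rough illustration of moving data while the original system stays in service, the sketch below copies files into a new location, verifies each copy by checksum and never modifies the source. It is not MapR's tooling: local directories stand in for the two clusters, and the names `sha256_of` and `migrate_tree` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_tree(source_root, target_root):
    """Copy every file under source_root into target_root and verify it.

    The source tree is only read, never changed, so whatever depends on
    it can keep running during the copy. Returns the verified relative
    paths in sorted order.
    """
    source_root, target_root = Path(source_root), Path(target_root)
    verified = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = target_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies contents and file metadata
        if sha256_of(src) != sha256_of(dst):
            raise IOError(f"checksum mismatch for {rel.as_posix()}")
        verified.append(rel.as_posix())
    return verified
```

Calling `migrate_tree("old_cluster_export", "new_cluster_import")` with two hypothetical directory names would return the list of files that were copied and verified. A production migration would also have to handle files that change during the copy, which this sketch does not attempt.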

The first step in the transition to the new data platform is data ingestion, which includes identifying multiple data sources and file formats. These data are then loaded into data structures designed to facilitate data analytics. Application development includes building Java MapReduce jobs and implementing jobs in Apache Pig, a platform for creating MapReduce programs used with Hadoop.
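The ingestion step described above, identifying file formats across data sources, can be sketched in the map, shuffle and reduce style that MapReduce jobs follow. This is a plain-Python stand-in for illustration only, not a Java MapReduce job or anything MapR ships. The function names and sample paths are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def map_phase(path):
    """Map: emit a (file_format, 1) pair for one source path."""
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return (suffix or "unknown", 1)


def shuffle(pairs):
    """Shuffle: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one file format."""
    return key, sum(values)


def count_formats(paths):
    """Run map, shuffle and reduce to count source files per format."""
    grouped = shuffle(map_phase(p) for p in paths)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical source paths, used only to exercise the sketch.
    sources = [
        "/data/sales/2016-01.csv",
        "/data/sales/2016-02.CSV",
        "/data/clicks/events.json",
        "/data/raw/README",
    ]
    print(count_formats(sources))
```

On a real cluster the framework distributes the map and reduce calls across nodes and performs the shuffle itself. Here all three phases run in a single process so the data flow is easy to follow.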

Along with building distributed indexes, the migration service also constructs Apache Hive, or HQL, queries and builds new data models.

Once data is migrated to the MapR converged platform, the company said it dispatches data scientists to determine specific use cases, business priorities, existing workflows and the data sources available for big data analytics. After raw data is moved over to the platform and restructured, the service includes creation of a custom data model based on those use cases.

The service also illustrates how big data vendors are striving to make data analytics tools like Hadoop and Apache Spark more widely available to enterprises as in-house analysts become more proficient with them. The MapR initiative is part of a larger effort to move enterprise data closer to compute, storage and other IT resources to facilitate big data analytics on converged platforms.

 

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).

EnterpriseAI