Advanced Computing in the Age of AI | Saturday, April 20, 2024

SAP Fills Out Its Hadoop Dance Card 

Recently, SAP has begun a fresh push to beef up its big data chops, including the addition of new partnerships to distribute two different flavors of Apache Hadoop to their enterprise customers.

In filling out its Hadoop dance card, SAP has decided to attach itself to both Intel, which announced in February of this year that it would offer its own version of Apache Hadoop, and Hortonworks, a spinoff of Yahoo!, where Hadoop was first incubated before being released into the open source wild.

SAP’s choices could be called relatively conservative. While Intel is the newest kid on the Hadoop block, the company itself is no stranger to enterprise customers, most of whom have built their legacy infrastructure on Intel’s shoulders. It’s no big surprise that SAP would trust Intel’s implementation of the rising Hadoop framework given Intel’s resources, legacy, and road map for the framework.

SAP’s decision to move forward with Hortonworks as a second Hadoop distro partner is interesting and can be seen as another conservative choice. While other vendors, such as Cloudera and MapR, might have been worthy candidates for SAP, Hortonworks distributes only the most recent stable version of vanilla Apache Hadoop. Where these other vendors add very nice bells and whistles to their Hadoop canvases, Hortonworks builds its offering around the stuff that anybody can download straight from the Apache website, a move that, right or wrong, eases a lot of minds where vendor lock-in is a concern.

These choices give SAP the option to offer two versions of Hadoop to its HANA customers: the vanilla version, or the one with Intel sprinkles on it. And indeed, Intel has added quite a bit in the short time that it has been in the game.

While every vendor is looking for ways to speed up Hive and MapReduce queries, and the overall responsiveness of Hadoop, one of the interesting things that Intel has done in the last six months is to scuttle HDFS, the file system that traditional Hadoop installations use, and replace it with its own global parallel file system, Lustre. Lustre has roots in high performance computing, where it has already been put to the test in multi-petabyte applications, giving Intel solid footing as it markets its Hadoop distro as “enterprise ready,” a claim, of course, that they all make.

“As SAP tried to look into big data, what we were looking for was a partner that we could go into our customer base with that could help make Hadoop an enterprise capable feature – not just a startup feature,” says Robert Clopp, a Database CTO with SAP, in a recent interview (see embedded video).

Late last month, Oracle also endorsed Intel’s Hadoop distro, announcing that it is in the process of certifying the distro on Oracle’s Big Data Connectors.

EnterpriseAI