Advanced Computing in the Age of AI | Friday, April 19, 2024

HPE ML Ops: Containerized Software for Machine Learning Operationalization 

Gartner’s 2019 CIO Survey found that the number of enterprises implementing AI grew 270 percent in the past four years. That’s impressive – but how many of those projects have evolved past the proof-of-concept (POC) stage, gone into production and been scaled across the enterprise?

Ay, there’s the rub. AI implementations are up, but Gartner also reported last October that by 2021, more than half of machine learning projects will not be fully deployed because of operational problems. That’s the key challenge in enterprise AI today: getting past the project phase, what IBM calls “chapter 2” for AI. Last November, HPE acquired BlueData, maker of container-based software for AI deployment and management. Today, the company announced HPE Machine Learning Ops, a containerized array of software and tools designed to speed AI time-to-value and to bring DevOps agility to the ML model lifecycle.

Running on-premises, in public clouds and in hybrid cloud environments, HPE ML Ops offers standardized ML workflows across the development lifecycle to operationalize ML – what Gartner calls the “last mile” for successful deployment and management of ML models.

HPE said ML Ops is designed to address all aspects of machine learning development, from data preparation and model building to training, deployment, monitoring, and collaboration. A hoped-for benefit of HPE ML Ops is to overcome the “ivory tower,” lab-ensconced experimentation of data scientists, whose work can be somewhat out of touch with the more pedestrian considerations of machine learning implementations.

“Data scientists, they’re not operations people,” Jason Schroedl, VP Marketing, BlueData HPE, told us. “They don’t understand what it takes to deploy models into production in an enterprise environment, to get all the systems working, make sure it meets all the security requirements, that it’s run on the right infrastructure, that it has access to the GPUs it needs for training. Data scientists…write algorithms, they build models, that’s the stuff they’re really good at, and you don’t want them to spend time on operations. It’s a team sport, you need multiple different players, multiple different users involved, from the data engineers to the data scientists and analysts, to the machine learning developers and architects all the way through to the DevOps and operations teams.”

A new model “may work the first time as an artisan, hand-crafted solution,” Schroedl said, “but how do you do this at scale, when you’ve got different data science teams and use cases and projects they’ve got underway, to operationalize these models and make sure they’re in production?”

HPE said ML Ops, which is available now, comprises:

  • Model Build: Pre-packaged, self-service sandbox environments for ML tools and data science notebooks
  • Model Training: Scalable training environments with secure access to data
  • Model Deployment: Deployment with reproducibility
  • Model Monitoring: End-to-end visibility across the ML model lifecycle
  • Collaboration: CI/CD (continuous integration, continuous delivery and deployment) workflows with code, model, and project repositories
  • Security and Control: Secure multi-tenancy with integration to enterprise authentication mechanisms
  • Hybrid Deployment: Support for on-premises, public cloud or hybrid cloud

HPE ML Ops works with open source machine learning and deep learning frameworks, including Keras, MXNet, PyTorch, and TensorFlow, along with commercial machine learning applications from HPE ecosystem software partners, such as Dataiku and H2O.ai, according to the company.

EnterpriseAI