Advanced Computing in the Age of AI | Wednesday, April 24, 2024

AWS Working to Make Machine Learning Accessible to Non-Developers, While Growing ML Worker Ranks 

As a serious shortage of enterprise AI and ML developers continues across a wide range of businesses, Amazon Web Services (AWS) is looking to jumpstart AI and ML education to help fill the immediate gaps and inspire students and others to join the lucrative and in-demand field.

To accomplish these goals, AWS has announced two new members of its SageMaker machine learning training platform family: Amazon SageMaker Canvas, which will allow non-developers to create no-code ML projects, and the all-new Amazon SageMaker Studio Lab, offered as a free public preview, which will allow anyone interested in learning more about ML to experiment with the technology without even needing an AWS account.

Both offerings, which were unveiled this week at the AWS re:Invent 2021 conference in Las Vegas, are available immediately to help enterprises work to solve some of their ML project backlogs, Bratin Saha, the vice president of AI and ML at Amazon, told EnterpriseAI.

Bratin Saha of AWS

AWS already makes it easier for ML professionals to do their work using tools such as SageMaker, so the pivot to using related tools to bring no-code ML to non-developers, or so-called “citizen developers,” is an offshoot of that mission, said Saha. No-code applications and services are already available for use in a wide range of IT niches, but in the world of ML there have been no such offerings until now, he said.

“With our innovations, we think machine learning will follow a similar path, and customers have been telling us that giving them no-code tools would let their analysts and others start using machine learning,” said Saha.

Introducing Amazon SageMaker Canvas

The SageMaker Canvas tools do even more than just let non-developers enter their projects, he said – the application will also perform inferencing for non-pro users by filling in missing values and correcting errors after extrapolating information from the remaining existing data.

“There is a lot more ML intelligence added into these tools at the back end to help the user,” said Saha. “Part of it is there are algorithms that say, here is what the data set looks like and that certain columns or cells are still empty. Based on the remaining entries in the column, it determines what it should look like.”

It is like solving a math problem for X, he explained. “But there is also other error checking, for example if you have a field for a Zip code, you know that if the numbers have eight digits that they are not valid U.S. Zip codes,” he said. “Then they can go in and flag those errors.”
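The two checks Saha describes, inferring an empty cell from the rest of its column and flagging values that cannot be valid Zip codes, can be illustrated with a short Python sketch. This is not how SageMaker Canvas is implemented, which the company did not detail; the function names, the sample data and the choice of a column median as the fill value are all assumptions made for illustration.

```python
import re
from statistics import median

# Assumption: a valid U.S. Zip code is five digits, optionally followed by
# a hyphen and four more digits (ZIP+4).
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def fill_missing(values):
    """Replace None entries with the median of the known entries in the column.

    The median is an illustrative stand-in for whatever model a real tool uses.
    """
    known = [v for v in values if v is not None]
    if not known:
        return list(values)  # nothing to infer from
    fill = median(known)
    return [fill if v is None else v for v in values]


def flag_invalid_zips(zips):
    """Return the row indexes whose value does not look like a U.S. Zip code."""
    return [i for i, z in enumerate(zips) if not ZIP_PATTERN.fullmatch(str(z))]


# Hypothetical sample columns.
ages = [34, None, 41, 29, None]
zips = ["98109", "12345678", "10001-0001", "9810"]

filled_ages = fill_missing(ages)
bad_rows = flag_invalid_zips(zips)

print(filled_ages)  # empty cells now hold the median of 34, 41 and 29
print(bad_rows)     # the eight-digit and four-digit entries are flagged
```

In this sketch the eight-digit entry is flagged rather than corrected, which mirrors Saha's point that users can then go in and review the errors themselves.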

If more work is needed on an ML project created in SageMaker Canvas, developers can view and change the project in SageMaker Studio as required, because Canvas is based on SageMaker Studio, said Saha.

“SageMaker Canvas allows you to export everything that is being done in the low-code way on to SageMaker Studio,” he said. “That code becomes available, as well as all of your data preparation and your model building code.”

Amazon SageMaker Canvas being introduced at re:Invent 2021 by AWS CEO Adam Selipsky.

Customers who have been working with an early version of Canvas have told AWS that this easy data importing feature is a valuable trait, said Saha. “That is one of the novelties we have, where you can seamlessly transition from a no-code environment to a code-first environment and … then get a data scientist involved to make modifications as they see fit.”

SageMaker Canvas allows non-data scientists to create and run their own ML models using data from disparate data sources in the cloud or on-premises, while combining datasets with the click of a button, according to the company. Those employees can use Canvas to train accurate models and then generate new predictions once new data is available using an intuitive interface.

The service is now generally available in AWS regions in the US East (Ohio), US East (N. Virginia), US West (Oregon), Europe (Frankfurt) and Europe (Ireland). It can be used with local datasets, as well as with data already stored on Amazon S3, Amazon Redshift or Snowflake, according to AWS.

Introducing Amazon SageMaker Studio Lab

In his leadership session presentation on AI and ML at re:Invent this week, Saha spoke about how the global demand for machine learning practitioners continues to grow much faster than the available pool of trained and skilled data scientists who can fill those roles. To help reduce that gap, AWS introduced SageMaker Canvas. Going a step further, it is also introducing SageMaker Studio Lab, which aims to bring ML education to many more people so they can gain the specialized skills needed to do this important and lucrative work, he said.

“It is a completely free service for students, experimenters, researchers and others to get started with machine learning, learn about machine learning, do quick experiments with machine learning and more,” said Saha. “And you do not even need an AWS account to get started. You can just use an email address to log in … and it gives you not only free compute, but it also gives you free storage. Studio Lab does all this for you and it comes integrated with GitHub and all the popular software packages.”

Also included are valuable educational materials, including the open source ML guidebook Dive into Deep Learning from D2L, which provides a wide range of in-depth information and lessons on ML, AI, neural networks and much more, he said. About $10 million in AI and ML scholarships provided by AWS, Udacity and Intel Corp. are also being offered to help underrepresented and underprivileged students get an AI/ML education through Studio Lab, said Saha.

“This is a first of its kind project from AWS,” said Saha. “It shows how machine learning has become more important and that our customers keep telling us they want to see how we can help grow the skills of more people. It also shows AWS' commitment to increasing the amount of education resources that we provide.”

Studio Lab’s educational content also includes the AWS Machine Learning University, which provides access to the same ML courses used to train Amazon’s own developers on ML. Other resources include Hugging Face, which is a large open source community and a hub for pre-trained deep learning (DL) models that are aimed at natural language processing.

Rob Enderle, principal analyst with Enderle Group, told EnterpriseAI that AWS’s moves to make ML easier for non-developer users and to potentially bring more workers into the field are important for the future of AWS and other ML vendors.

Rob Enderle, analyst

“With any new technology there is a learning curve and services like AWS require massive numbers of users before the related effort become financially material,” said Enderle. “Initially ML was simply too difficult and relatively restrictive for this to be practical but as this capability moved into broad use understanding increased and the ability to create more easily learned tools became viable.”

Those initial efforts are now becoming a race to grow those numbers and AWS does not want to lose that race, said Enderle. “As a result, much of the early focus is on making the path to learning attractive and easy and that is what AWS is doing here – they are first working on rapidly building a foundation of users before some other tool or service can capture the coming larger wave of users. Microsoft, who at its core is a tools and platform company, likely represents the biggest threat domestically to AWS.”

With those threats in mind, AWS is “taking the extra step to make this training free, showcasing an unusual ability to sacrifice tactical short-term revenue in exchange for far larger long-term strategic advantage and profits,” said Enderle.

James Kobielus, analyst

Another analyst, James Kobielus, the senior research director for data communications and management at TDWI, a data analytics consultancy, agreed that Amazon SageMaker Canvas will be a welcome addition for users. "What I like most about it is how nicely it converges MLOps and DataOps functionality into a very capable tool for the global enterprise market," he said. "The offering helps statistically astute non-data-scientists, such as business analysts, to automate data integration and cleansing, while automating the creation, training and evaluation of hundreds of ML models against that data. For any enterprise that keeps its cloud data in S3, Amazon Redshift, or Snowflake, and that has a short-staffed data science capability – which is most enterprises – this new offering will be highly attractive."

Similarly, Amazon SageMaker Studio Lab is "a great on-ramp for citizen data scientists," said Kobielus. "Recognizing that the ML workbench market can't grow unless a new breed of self-taught data scientists emerges, AWS is making the right move in doing a public preview of the free-of-charge service offering Amazon SageMaker Studio Lab. What sweetens this offering is AWS throwing in all of the following: free access to cloud compute resources, CPU and GPU, free access to open-source deep-learning models, and free access to data science educational content."

In addition, by eliminating the need for users to have AWS accounts, provide credit card information or know how to configure cloud resources, AWS will also expand the user pool for these products, he said.