Advanced Computing in the Age of AI | Thursday, March 28, 2024

Intel to Put Its AI Chips Head-to-Head in Cloud

Intel will provide a one-stop shop for customers to measure the performance of their AI applications on the company's different chips.

The company later this year will arm its Intel Developer Cloud service with the ability to run head-to-head comparisons of machine-learning tasks on GPUs, Xeon CPUs and other accelerators like Gaudi2.

"Customers can find out what the benefits are of their particular model on a given set of hardware," said Ronak Singhal, senior fellow at Intel.

The goal is to give coders the ability to test their codebase on the different instances and select the right chip to meet their performance requirements. Intel will use assets it got from the 2020 acquisition of Israeli company Cnvrg.io to build that platform, Singhal said.

Cnvrg.io offers tools and blueprints that allow customers to conjure up AI applications in a snap. The containerized models can leverage a wide range of existing compute resources, including FPGAs, GPUs, AI chips and CPUs, to run AI applications. In this case, the company is prototyping the AI models on those chips, and the results will then be presented to customers so they can select the best hardware to run their models.

Intel's Developer Cloud is a platform on which customers can prototype and tune AI applications to the company's chips. Companies can then deploy those models to cloud services that offer virtual machine instances based on Xeon chips. All the major cloud providers are expected to offer virtual machine instances running on Sapphire Rapids chips.

Screenshot of Intel Developer Cloud beta portal

The market is now flooded with chips for AI, each of which has its own advantages for particular AI models. For example, computer vision applications run better on GPUs than on AI processors or CPUs.

"Of course, everything will work. But some things will be faster than others on specific hardware. Customers can find out what the benefits are of their particular model on a given set of hardware," Singhal said.
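
As a rough illustration of the kind of head-to-head comparison Singhal describes, the sketch below times several candidates and reports the fastest. It is not Intel's tooling: the `benchmark` and `pick_fastest` helpers and the two stub workloads are hypothetical, and a real comparison would run the same model on each hardware instance rather than on stand-ins.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(workload: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds of `workload` over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def pick_fastest(results: Dict[str, float]) -> str:
    """Return the name of the candidate with the lowest measured time."""
    return min(results, key=results.get)


# Hypothetical stand-ins for "the same model on different hardware".
# In practice each entry would invoke the model on a different instance type.
candidates = {
    "fast_backend_stub": lambda: sum(i * i for i in range(10_000)),
    "slow_backend_stub": lambda: sum(i * i for i in range(1_000_000)),
}

results = {name: benchmark(fn) for name, fn in candidates.items()}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1000:.2f} ms")
print("fastest:", pick_fastest(results))
```

Using the median rather than the mean keeps a single slow run, such as a cold start, from skewing the comparison.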

Google provides free access to hardware resources in its cloud services through Colab, which is a playground for researchers to test out neural network concepts. Developers can write a Python script to compile AI instructions (including the model type) and specify whether they prefer a GPU or a TPU in the cloud. The cloud service automatically selects the available GPUs, which are mostly old GPUs like Nvidia’s Tesla T4, on which it runs the AI processing. The older GPUs slow down the inferencing, but customers can pay a monthly subscription or annual fee for access to the latest and greatest Nvidia GPUs. But there’s no way to run head-to-head comparisons of GPU performance.

The Intel Developer Cloud service, which is still in beta, has sections dedicated to hardware and software. The bare-metal service allows customers to test out the instances on 4th Gen Intel Xeon chips (codenamed Sapphire Rapids), Gaudi2 and Xeon Max chips. Customers can select an instance and are charged by the hour, which mirrors the way AWS, Google Cloud and Microsoft Azure bill customers.

A tiny virtual machine instance with four CPU cores, 10GB of storage and 8GB of RAM on Sapphire Rapids is free. The largest Sapphire Rapids instance with 32 CPU cores, 64GB memory and 64GB of storage costs $4 per hour. A Xeon Max instance with a GPU – the chip codenamed Ponte Vecchio – costs $18 per hour. An instance of the Gaudi2 AI chip costs $39 per hour. All the virtual machines host Ubuntu 22 as the guest OS.
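
To put those hourly prices in perspective, the sketch below works out how much faster a pricier instance must finish a job before it becomes cheaper per job than the largest Sapphire Rapids instance. The prices come from the paragraph above. The instance keys and helper names are made up for illustration, and the calculation assumes cost scales linearly with runtime, ignoring any rounding up to whole billed hours.

```python
# Hourly prices quoted above (US dollars per hour).
HOURLY_RATE = {
    "sapphire_rapids_32core": 4.0,
    "xeon_max_gpu": 18.0,
    "gaudi2": 39.0,
}


def cost_per_job(instance: str, runtime_seconds: float) -> float:
    """Cost of one job, assuming billing is proportional to runtime."""
    return HOURLY_RATE[instance] * runtime_seconds / 3600.0


def break_even_speedup(pricier: str, cheaper: str) -> float:
    """Speedup the pricier instance needs to match the cheaper one's cost per job."""
    return HOURLY_RATE[pricier] / HOURLY_RATE[cheaper]


print(break_even_speedup("gaudi2", "sapphire_rapids_32core"))        # 9.75
print(break_even_speedup("xeon_max_gpu", "sapphire_rapids_32core"))  # 4.5
print(cost_per_job("gaudi2", 600))                                   # 6.5
```

On these numbers, a Gaudi2 instance would need to run a workload about 9.75 times faster than the 32-core Sapphire Rapids instance to cost the same per job, which is the sort of trade-off a side-by-side benchmark is meant to expose.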

The Dev Cloud service also has an online developer environment that hosts the oneAPI tools for AI, high-performance computing, rendering and other applications. The oneAPI AI Analytics Toolkit has separate packages for frameworks that include TensorFlow and PyTorch.

EnterpriseAI