
Google Cloud Goes Global with Nvidia T4 GPUs 

Nvidia’s T4 GPUs, unveiled last fall to accelerate workloads such as AI inference and training, are making their “global” debut as cloud instances on Google Cloud.

Google (NASDAQ: GOOGL) said Monday (April 29) it is the first cloud provider to offer access to Nvidia T4 cloud instances across multiple regions, now available in beta. The company announced T4 availability across eight regions earlier this month, making it the first to offer Nvidia’s Tesla T4 “globally.” (Amazon Web Services announced “G4” cloud instances based on T4 GPUs in March.)

The cloud rivals are offering T4 GPU instances based on Nvidia’s (NASDAQ: NVDA) Turing architecture as the market for datacenter-based machine learning training and inference continues to boom. Nvidia estimates that as much as 90 percent of the cost of machine learning at scale is devoted to AI inference.

Nvidia rolled out the T4 last fall with the goal of accelerating machine learning training and inference in the datacenter at a lower price point. The T4 pairs Tensor Cores, which speed training and inference, with hardware acceleration for ray tracing.

Tensor Core GPUs support so-called “mixed precision” training of machine learning workloads, and cloud vendors are offering T4 instances as an alternative to the higher-end V100 GPU. For training workloads that don’t require the horsepower of the more powerful V100, which also targets traditional HPC workloads, “the T4 offers the acceleration benefits of Turing Tensor Cores, but at a lower price,” Google noted in a blog post announcing its new T4 cloud instances. “This is great for large training workloads, especially as you scale up more resources to train faster, or to train larger models.”
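
To make the mixed-precision idea concrete, the sketch below shows what such a training loop can look like using PyTorch’s automatic mixed precision (torch.cuda.amp). The model, data and hyperparameters are placeholders chosen for illustration; this is not Google’s or Nvidia’s code, just a minimal example of the technique the vendors describe.

```python
# Minimal sketch of mixed-precision training with PyTorch's torch.cuda.amp.
# Model, data, and hyperparameters are illustrative placeholders; on a Tensor
# Core GPU such as the T4, the autocast region runs matmuls in FP16.
import torch
import torch.nn as nn

device = "cuda"  # assumes a CUDA-capable GPU is available
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()  # rescales gradients to avoid FP16 underflow

for step in range(100):
    inputs = torch.randn(64, 512, device=device)           # dummy batch
    targets = torch.randint(0, 10, (64,), device=device)   # dummy labels

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():      # ops run in FP16 or FP32 as appropriate
        loss = loss_fn(model(inputs), targets)
    scaler.scale(loss).backward()        # backward pass on the scaled loss
    scaler.step(optimizer)
    scaler.update()
```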

The partners also pitched T4 GPU cloud instances as a way to reduce latency and boost throughput for inference models, noting that Tensor Cores with mixed precision accelerated inference by a factor of as much as ten on the ResNet-50 image classification neural network.
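
The same idea applies at inference time: casting a network such as ResNet-50 to FP16 lets the Tensor Cores carry the heavy matrix math. Below is a minimal sketch assuming torchvision and a CUDA GPU are available; the tenfold speedup cited above is the vendors’ figure, not something this snippet measures.

```python
# Rough sketch of FP16 inference with a stock torchvision ResNet-50.
# Assumes a CUDA GPU; the speedup quoted in the article is Nvidia's/Google's
# claim, not a result produced by this snippet.
import torch
from torchvision.models import resnet50

model = resnet50().eval().cuda().half()   # cast weights to FP16
images = torch.randn(32, 3, 224, 224, device="cuda", dtype=torch.float16)

with torch.no_grad():
    logits = model(images)            # forward pass uses Tensor Cores for FP16 matmuls
    preds = logits.argmax(dim=1)      # predicted class indices

print(preds.shape)  # torch.Size([32])
```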

Google said T4 instances are priced as low as $0.29 per hour per GPU, with “on-demand” instances starting at $0.95 per hour per GPU. It is also offering “sustained use discounts.”
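
As a back-of-the-envelope illustration of those rates, the arithmetic below uses only the two per-GPU hourly prices quoted above; sustained-use discounts and other charges are not modeled.

```python
# Back-of-the-envelope cost comparison using the per-GPU hourly rates quoted
# in the article. Sustained-use discounts are not modeled; purely illustrative.
LOWEST_RATE = 0.29     # USD per GPU-hour (lowest quoted price)
ON_DEMAND_RATE = 0.95  # USD per GPU-hour (on-demand)

def monthly_cost(rate_per_hour: float, gpus: int = 1, hours: float = 730.0) -> float:
    """Cost of running `gpus` T4s continuously for `hours` (roughly one month)."""
    return rate_per_hour * gpus * hours

print(f"1 GPU, lowest rate: ${monthly_cost(LOWEST_RATE):,.2f}/month")
print(f"1 GPU, on-demand:   ${monthly_cost(ON_DEMAND_RATE):,.2f}/month")
print(f"4 GPUs, on-demand:  ${monthly_cost(ON_DEMAND_RATE, gpus=4):,.2f}/month")
```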

Google Cloud also offers Nvidia V100 instances.

Google Cloud’s T4 GPU availability includes three regions each in the U.S. and Asia and one each in South America and Europe. Those regions are linked by a high-speed network.

Each T4 instance includes 16 GB of onboard GPU memory and supports a range of precisions, or data types, including FP32, FP16, INT8 and INT4.
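
To put those formats in perspective, the snippet below prints the numeric range and precision of FP32, FP16 and INT8 using NumPy; INT4 has no standard NumPy type and is typically exposed through inference toolkits, so it is only noted in a comment.

```python
# Quick look at the numeric formats the T4 accelerates. NumPy has no INT4
# dtype (it is usually exposed through inference toolkits), so only FP32,
# FP16 and INT8 are queried here.
import numpy as np

for name, dtype in [("FP32", np.float32), ("FP16", np.float16)]:
    info = np.finfo(dtype)
    print(f"{name}: {info.bits} bits, max ~ {info.max:.3e}, machine eps ~ {info.eps:.1e}")

i8 = np.iinfo(np.int8)
print(f"INT8: {i8.bits} bits, range {i8.min} to {i8.max}")
# INT4 (not a NumPy dtype): 4 bits, signed range -8 to 7
```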

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).
