Advanced Computing in the Age of AI | Wednesday, June 29, 2022

New Scheduling Tool Designed to Boost Microservices 


Computer scientists at the University of Michigan have come up with a faster way to schedule cloud microservices via a new algorithm running on a custom processor.

The platform, called Q-Zilla, builds on Size-Interval Task Assignment (SITA), a widely used scheduling algorithm. Q-Zilla's variation on SITA classifies tasks by size, scheduling the simplest at the front of a server queue while holding back more compute-intensive jobs. The result, the researchers say, is better performance in prioritizing microservices workloads.
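The core SITA idea of routing tasks to size-segregated queues can be sketched as follows. This is a minimal illustration of the general technique, not Q-Zilla's implementation; the class name, size boundaries, and queue counts are all assumptions chosen for demonstration.

```python
from collections import deque

class SitaScheduler:
    """Route each task to a queue based on its estimated size,
    so short tasks are never stuck behind long ones."""

    def __init__(self, boundaries):
        # boundaries: ascending size cutoffs, e.g. [10, 100] creates
        # three queues: sizes < 10, 10-99, and >= 100.
        self.boundaries = boundaries
        self.queues = [deque() for _ in range(len(boundaries) + 1)]

    def submit(self, task_size, task):
        # Place the task in the first interval whose upper bound
        # exceeds its size; return the queue index chosen.
        for i, bound in enumerate(self.boundaries):
            if task_size < bound:
                self.queues[i].append(task)
                return i
        self.queues[-1].append(task)
        return len(self.boundaries)

sched = SitaScheduler([10, 100])
print(sched.submit(3, "short RPC"))    # -> 0 (smallest-size queue)
print(sched.submit(50, "medium job"))  # -> 1
print(sched.submit(500, "batch job"))  # -> 2
```

Because each queue only ever holds tasks of similar size, a short request never waits behind a long one, which is what keeps the simplest tasks at the front.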

IEEE Spectrum reported that the university researchers are targeting their task manager at emerging cloud microservices entering the enterprise mainstream, as a way to accelerate streaming and other distributed services and applications. The platform is also touted as working on any server, the publication reported.

Running on a proprietary processor design called CoreZilla, the platform outperformed competing approaches in processing microservice workloads. The university researchers will report their findings later this month during the IEEE International Symposium on High-Performance Computer Architecture in San Diego.

An Amazon Web Services (NASDAQ: AMZN) researcher is listed as a co-author of the University of Michigan paper.

Q-Zilla was developed by a team led by Amirhossein Mirhosseini, a doctoral candidate at the University of Michigan. Mirhosseini said the research on boosting server efficiency is backed by Arm Ltd. and the Semiconductor Research Corp. Arm has patented the technology, which seeks to reduce task queuing and other microservices overhead.

Among the challenges addressed by the server efficiency effort are the “killer microseconds” problem and “tail latency.” The former reflects how the sheer speed and agility of microservices have highlighted the need to cut microseconds from the delivery of cloud services like video streaming to millions of simultaneous users.

Mirhosseini refers to the latency problem as “tail at scale,” and the team's algorithm offers a framework for what the researchers call “tail-tolerant computing.”
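A small numerical example shows why the tail, rather than the average, dominates at scale. The latency distribution and fan-out factor below are made-up illustrations, not measurements from the paper.

```python
import random

random.seed(0)
# 10,000 simulated request latencies in microseconds: exponentially
# distributed, so most requests are fast but a few straggle badly.
latencies = sorted(random.expovariate(1 / 100) for _ in range(10_000))

mean = sum(latencies) / len(latencies)
p99 = latencies[int(0.99 * len(latencies))]  # 99th-percentile latency

# If one user request fans out to 100 microservice calls, the chance
# that at least one call lands in the slowest 1% is 1 - 0.99**100,
# roughly 63% -- so nearly every user request experiences tail latency.
fanout_tail_prob = 1 - 0.99 ** 100

print(f"mean: {mean:.0f} us, p99: {p99:.0f} us")
print(f"probability a 100-way fan-out hits the tail: {fanout_tail_prob:.2f}")
```

This is the “tail at scale” effect: as fan-out grows, rare stragglers become the common case, which is why tail-tolerant scheduling matters for microservices.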

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).
