Groq First to Announce Performance Advantage Results with STAC-ML Markets Benchmark 

MOUNTAIN VIEW, Calif., Nov. 1, 2022 -- Today, the Securities Technology Analysis Center (STAC) published audited benchmark results from Groq for the financial industry, showcasing ultra-low latency, especially at small batch sizes such as batch 1. Over the last few years, the financial services industry has been asking vendors to show performance numbers on market-specific workloads. Among the compute incumbents in the space, Groq is the first to announce audited STAC-ML benchmark results, implemented on a GroqNode server that includes eight GroqCard 1 Accelerators and two AMD EPYC 7413 processors.

The results showed extremely low latencies for long short-term memory (LSTM) inference, such as LSTM_A (small model size) with a median latency of 0.054 ms per inference when running one model instance, and GroqNode throughput of 471,585 inferences per second when running eight model instances in parallel. These repeatable results, made possible by Groq's deterministic compute, demonstrate unparalleled performance in both latency and throughput. The impact for the financial services industry is more accurate pricing predictions for real-time trading and risk analysis using machine learning (ML) models.

Amr El-Ashmawi, VP of Vertical Markets at Groq, commented, "There are many factors that can cause equity prices to fluctuate. News feeds bearing good or bad news about the current or future prospects of the market are arguably one of the most influential factors driving daily price fluctuations. Using ML models such as RNNs/LSTMs can help forecast equity pricing, reducing portfolio risk."
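To make the idea in the quote concrete, the following is a minimal, illustrative PyTorch sketch of the kind of LSTM forecaster being described: it maps a short window of per-step market features (for example prices, volumes, and news-sentiment scores) to a next-step price estimate. The model name, feature count, and layer sizes are arbitrary assumptions for illustration and are not the STAC-ML benchmark models.

    import torch
    import torch.nn as nn

    class PriceLSTM(nn.Module):
        """Toy LSTM regressor: a window of per-step market features -> next-step price estimate."""
        def __init__(self, n_features: int = 8, hidden_size: int = 32):
            super().__init__()
            self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, 1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x has shape (batch, time_steps, n_features); the final hidden state summarizes the window
            _, (h_n, _) = self.lstm(x)
            return self.head(h_n[-1])

    # Batch-1 inference, the latency-critical case highlighted in the benchmark results
    model = PriceLSTM().eval()
    window = torch.randn(1, 50, 8)   # one sequence of 50 time steps with 8 features each
    with torch.no_grad():
        next_price = model(window)   # shape (1, 1)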

The results also demonstrate that when it comes to LSTMs, an example of a time-series AI model in which past data is as significant as current data, developers do not need to compromise on model functionality or complexity to get performance. Groq demonstrated the simplicity of porting STAC's models to its development environment using a software-first, generalized compiler approach. With just a single line of code, developers can port numerous existing PyTorch or TensorFlow models, dramatically simplifying and accelerating the ML development process.
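The announcement does not show the porting call itself. Groq's open-source GroqFlow package exposes a groqit() build function that matches the "single line of code" description, so the sketch below assumes that workflow; the exact function name, argument format, and whether this is the path used for the STAC submission are assumptions rather than details from the release.

    import torch
    import torch.nn as nn
    from groqflow import groqit   # assumed entry point of Groq's open-source GroqFlow build tool

    # Any existing PyTorch module can stand in; a small LSTM is used here for illustration
    model = nn.LSTM(input_size=8, hidden_size=32, batch_first=True).eval()
    sample_inputs = {"input": torch.randn(1, 50, 8)}   # batch 1, 50 time steps, 8 features

    # The "single line of code": hand the model and sample inputs to the compiler, which
    # builds a Groq-executable version that keeps the original calling convention
    gmodel = groqit(model, sample_inputs)

    results = gmodel(**sample_inputs)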

Peter Nabicht, President of STAC, commented, "STAC benchmark tests are specified by customers based on their business needs. Financial firms in the STAC Benchmark Council collaborated on this version of STAC-ML because low latency inference is a growing necessity, often in environments with constrained power and space. We thank Groq for their leadership in being the first vendor to provide the industry with needed data on performance, quality, and efficiency."

Come meet with Groq at the STAC Summit in London on November 10 at the Leonardo Royal Hotel London City and at Supercomputing 22 in Dallas from November 13-18.

About Groq

Groq is headquartered in Mountain View, CA, with geo-agnostic teams across the US, Canada, and the UK. Its innovative deterministic single-core Tensor Streaming Processor architecture lays the foundation for its compiler's unique ability to predict the exact performance and compute time of workloads while delivering uncompromised low latency. Groq has raised $367 million, with Series C funding co-led by D1 Capital and Tiger Global Management. For more information, visit www.groq.com.


Source: Groq
