Advanced Computing in the Age of AI | Monday, June 24, 2024

How AI and ML Applications Will Benefit from Vector Processing 

As expected, artificial intelligence (AI) and machine learning (ML) applications are already having an impact on society. Many industries that we tap into daily—such as banking, financial services and insurance (BFSI), and digitized health care—can benefit from AI and ML applications to help them optimize mission-critical operations and execute functions in real time.

The BFSI sector is an early adopter of AI and ML capabilities. Natural language processing (NLP) is being implemented for personally identifiable information (PII) privacy compliance, chatbots and sentiment analysis, for example mining social media data for underwriting, credit scoring and investment research. Predictive analytics assess which assets will yield the highest returns. Other AI and ML applications include digitizing paper documents and searching through massive document databases. Additionally, anomaly detection and prescriptive analytics are becoming critical tools for the cybersecurity sector of BFSI for fraud detection and anti-money laundering (AML). [1]

Scientists searching for solutions to the COVID-19 pandemic rely heavily on data acquisition, processing and management in health care applications. They are turning to AI, ML and NLP to track and contain the coronavirus, as well as to gain a more comprehensive understanding of the disease. Applications for AI and ML include medical research for developing a vaccine, tracking the spread of the disease, evaluating the effects of COVID-19 interventions, and applying NLP to social media to understand the impact on society, among others. [2]

Processing a Data Avalanche

The fuel for BFSI applications like fraud detection, AML and chatbots, or health applications such as tracking the COVID-19 pandemic, is decision support systems (DSSs) containing vast amounts of structured and unstructured data. IDC forecasts that connected IoT devices alone will generate 79.4 ZB of data, roughly 79 trillion GB, in 2025. [3] This avalanche of data makes it difficult for scalar-based high-performance computers to run the data mining (DM) behind a DSS effectively and efficiently. More powerful accelerator cards, such as vector processing engines supported by optimized middleware, are proving able to process enterprise data lakes efficiently and to populate and update data warehouses, from which meaningful insights can be presented to decision makers.

Resurgence of Vector Processors

There is currently a resurgence in vector processing, which, because of its cost, was previously reserved for the most powerful supercomputers in the world. Vector processing architectures are evolving to provide supercomputer performance in a smaller, less expensive form factor using less power, and they are beginning to outpace scalar processing for mainstream AI and ML applications. This is leading to their implementation as the primary compute engine in high-performance computing applications, freeing up scalar processors for other mission-critical processing roles.

Vector processing has unique advantages over scalar processing when operating on certain types of large datasets. In fact, a vector processor can be more than 100 times faster than a scalar processor, especially when operating on the large amounts of statistical data and attribute values typical of ML applications, for example in sparse matrix operations.
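To make the sparse matrix case concrete, here is a minimal sketch of a sparse matrix-vector product in pure Python. It assumes the common compressed sparse row (CSR) layout; the function names `dense_to_csr` and `csr_matvec` are hypothetical and chosen only for this illustration, not taken from any vector engine toolchain. The point to notice is the inner loop: for each row it gathers entries of `x` by column index and accumulates products, a regular multiply-and-add pattern over long runs of data that vector hardware is designed to execute as a few wide operations rather than many scalar ones.

```python
def dense_to_csr(matrix):
    """Convert a dense list-of-lists matrix to CSR arrays.

    Returns (values, col_indices, row_ptr), where row_ptr[i]:row_ptr[i + 1]
    is the slice of values/col_indices that belongs to row i.
    """
    values, col_indices, row_ptr = [], [], [0]
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:            # store only the nonzero entries
                values.append(value)
                col_indices.append(col)
        row_ptr.append(len(values))   # marks where the next row starts
    return values, col_indices, row_ptr


def csr_matvec(values, col_indices, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = []
    for row in range(len(row_ptr) - 1):
        start, end = row_ptr[row], row_ptr[row + 1]
        # Gather x by column index, multiply, and accumulate: written here
        # as a scalar loop, but this is the step a vector unit would batch.
        y.append(sum(values[k] * x[col_indices[k]] for k in range(start, end)))
    return y
```

For example, the 3x4 matrix `[[5, 0, 0, 0], [0, 0, 2, 0], [0, 3, 0, 1]]` stores only four values in CSR form, and multiplying it by `[1, 2, 3, 4]` touches only those four entries.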

While both scalar and vector processors rely on instruction pipelining, a vector processor pipelines not only the instructions but also the data, which reduces the number of “fetch then decode” steps, in turn reducing the number of cycles for decoding. To illustrate this, consider the simple operation shown in Figure 1, in which two groups of 10 numbers are added together. Using a standard programming language, this is performed by writing a loop that sequentially takes each pair of numbers and adds them together (Figure 1a).

Figure 1: Executing the task defined above, the scalar processor (a) must perform more steps than the vector processor (b).

When performed by a vector processor, this task requires only two address translations, and “fetch and decode” is performed only once (Figure 1b), rather than the 10 times required by a scalar processor (Figure 1a). And because the vector processor’s code is smaller, memory is used more efficiently. Modern vector processors also allow different types of operations to be performed simultaneously, further increasing efficiency.
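The difference in decode steps can be sketched with a toy cost model in Python. This is an illustration under stated assumptions, not a description of any real instruction set: it assumes each issued add instruction costs exactly one fetch-and-decode step, and the names `scalar_add`, `vector_add` and the `max_vector_length` default of 256 elements are hypothetical. Both functions also assume the two inputs have the same length.

```python
def scalar_add(a, b):
    """Add two equal-length lists element by element.

    Toy assumption: the add instruction is fetched and decoded
    once per pair of numbers.
    """
    result = []
    decodes = 0
    for x, y in zip(a, b):
        decodes += 1              # one fetch-and-decode per iteration
        result.append(x + y)
    return result, decodes


def vector_add(a, b, max_vector_length=256):
    """Add two equal-length lists in chunks of up to max_vector_length.

    Toy assumption: one vector add instruction is fetched and decoded
    once per chunk, then applied to every element in that chunk.
    """
    result = []
    decodes = 0
    for start in range(0, len(a), max_vector_length):
        decodes += 1              # one fetch-and-decode per whole chunk
        chunk_a = a[start:start + max_vector_length]
        chunk_b = b[start:start + max_vector_length]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
    return result, decodes
```

With two groups of 10 numbers, both functions return the same sums, but under this model `scalar_add` reports 10 decode steps and `vector_add` reports 1, matching the counts described for the scalar and vector cases.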

To bring vector processing capabilities into applications less esoteric than scientific ones, it is possible to combine vector processors with scalar CPUs to produce a “vector parallel” computer. This system comprises a scalar host processor, a vector host running Linux, and one or more vector processor accelerator cards (or vector engines), creating a heterogeneous compute server that is ideal for broad AI and ML workloads and data analytics applications. In this scenario, the primary computational components are the vector engines, rather than the host processor. These vector engines also have self-contained memory subsystems for increased system efficiency, rather than relying on the host processor’s direct memory access (DMA) to route packets of data through the accelerator card’s I/O pins.

Software Matters

Processors perform only as well as the compilers and software instructions that are delivered to them. Ideally, that software should be based on industry-standard programming languages such as C/C++. For AI and ML application development, several frameworks are available, with more emerging. A well-designed vector engine compiler should support both industry-standard programming languages and open-source AI and ML frameworks such as TensorFlow and PyTorch. A similar approach should be taken for database management and data analytics, using proven frameworks such as Apache Spark and scikit-learn. This software strategy allows for seamless migration of legacy code to vector engine accelerator cards. Additionally, when the Message Passing Interface (MPI) is used to implement distributed processing, configuration and initialization become transparent to the user.


AI and ML are driving the future of computing and will continue to permeate more applications and services. Many of these application deployments will be implemented in smaller server clusters, perhaps even a single chassis. Accomplishing such a feat requires revisiting the entire spectrum of AI technologies and heterogeneous computing. The vector processor, with advanced pipelining, is a technology that proved itself long ago. Vector processing paired with middleware optimized for parallel pipelining is lowering the entry barriers for new AI and ML applications, and is set to meet challenges, both today and in the future, that once only the hyperscale cloud providers could take on.


  1. D. Azulay, “Artificial Intelligence in Finance – a Comprehensive Overview,” Emerj, December 24, 2019.
  2. J. Kent, “Understanding the COVID-19 Pandemic as a Big Data Analytics Issue,” Health IT Analytics, April 2, 2020.
  3. “The Growth in Connected IoT Devices Is Expected to Generate 79.4ZB of Data in 2025, According to a New IDC Forecast,” IDC, June 18, 2019.

About the Author 

Robbert Emery is responsible for commercializing NEC Corporation’s advanced technologies in HPC and AI/ML platform solutions. His role includes discovering and lowering the entry point and initial investment for enterprises to realize the benefits of big data analytics in their operations. Robbert has built a career of more than 20 years in the ICT industry’s emerging technologies, including mobile network communications, embedded technologies and high-volume manufacturing. Prior to joining NEC’s technology commercialization accelerator, NEC X Inc., in Palo Alto, California, Robbert led the product and business plan for an embedded solutions company that resulted in a leadership position in terms of both volume and revenue. He has an MBA from SJSU’s Lucas College and Graduate School of Business, as well as a bachelor’s degree in electrical engineering from California Polytechnic State University.