Advanced Computing in the Age of AI | Thursday, April 18, 2024

Heterogeneous Computing Gets a Code Similarity Tool 

A machine programming framework for heterogeneous computing championed by Intel Corp. and university partners is built around an automated engine that analyzes code for similarities. The approach could eventually allow non-programmers to create software based on intent.

Intel (NASDAQ: INTC) along with research partners at the Georgia Institute of Technology and Massachusetts Institute of Technology said their Machine Inferred Code Similarity (MISIM) system represents a step toward intent-based programming. The automated framework is designed to probe code structure to determine its intent. It then analyzes other code for differences in syntax that may yield similar behavior.

The researchers claim their approach delivered a 40-fold performance gain over current code similarity systems, opening the door to new applications ranging from automated testing and debugging to code recommendation.

“When fully realized, machine programming will enable everyone to create software by expressing their intention in whatever fashion that’s best for them, whether that’s code, natural language or something else,” said Justin Gottschlich, Intel Labs’ director of machine programming research.

MISIM also seeks to address the growing complexity of heterogeneous computing platforms that mix and match CPUs, GPUs and FPGAs along with other processor types and devices. That complexity, along with a shortage of programmers, has fueled efforts to automate coding. Code similarity frameworks are among the emerging machine coding approaches for generating code that does what it is intended to do.

MISIM is promoted as accurately determining when two pieces of code perform similar computations, even if they use different algorithms or data structures. In a research paper published in June, the Intel-Georgia Tech-MIT team said MISIM is based on a new “context-aware semantic structure” along with a neural-based code similarity scoring algorithm. Those core components were implemented in neural network architectures to create the automated code analytics engine unveiled this week.
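To make the idea concrete, here is a small Python sketch of the kind of code pair a similarity system would need to recognize as equivalent. It is an illustration only, not taken from the MISIM paper, and the function names are hypothetical. Both functions answer the same question (does a list contain a repeated value?) but one relies on a hash set and the other on sorting.

```python
def has_duplicates_set(items):
    """Detect a repeated value by remembering every item seen so far in a set."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


def has_duplicates_sorted(items):
    """Detect a repeated value by sorting and comparing neighboring items."""
    ordered = sorted(items)
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            return True
    return False


# The two functions share almost no syntax, yet agree on every input tried here.
for sample in ([1, 2, 3, 2], [1, 2, 3], [], [7, 7]):
    assert has_duplicates_set(sample) == has_duplicates_sorted(sample)
```

A tool that compares only the text or syntax of these two functions would see little in common, while a tool that captures what the code does would treat them as close matches.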

MISIM was benchmarked against three existing code similarity systems, including Facebook’s Aroma code recommendation tool aimed at developers. The 40-fold performance jump for MISIM is based on evaluations across more than 45,000 programs, the researchers reported.

The special sauce said to differentiate MISIM from earlier code similarity systems is known as CASS, for context-aware semantic structure. Unlike existing platforms, CASS can be configured to a specific context, the researchers said, allowing it to glean detailed information describing code. The result is an automated system that can determine what code does rather than how it does it.

CASS requires no compiler, meaning it can execute on unfinished code as it is being written. Intel said that new capability promises to advance automated debugging and recommendation systems.

“Once the structure of the code is integrated into CASS, a number of neural network systems give similarity scores to pieces of code based on the jobs they are designed to carry out,” Intel said. “In other words, if two pieces of code look very different in their structure but perform the same function, these neural networks would rate them as largely similar.”
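For contrast, the following Python sketch shows a purely syntactic similarity score of the sort that behavior-oriented systems aim to improve upon. It is not MISIM's method: it substitutes a simple cosine similarity over counts of syntax-tree node types for the CASS representation and neural scoring described above, and every name in it is hypothetical. It also depends on Python's own parser, so unlike CASS as described it would not work on unfinished code.

```python
import ast
import math
from collections import Counter


def node_type_counts(source):
    """Count how often each syntax-tree node type appears in the source."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors held in Counters."""
    dot = sum(a[key] * b[key] for key in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def syntactic_similarity(src_a, src_b):
    """Score two snippets by the shape of their syntax only, not their behavior."""
    return cosine_similarity(node_type_counts(src_a), node_type_counts(src_b))


# Two snippets that compute the same total in structurally different ways.
LOOP_VERSION = """
def total(values):
    result = 0
    for v in values:
        result += v
    return result
"""

BUILTIN_VERSION = """
def total(values):
    return sum(values)
"""

score = syntactic_similarity(LOOP_VERSION, BUILTIN_VERSION)
print(f"syntax-only similarity: {score:.2f}")
```

Because the score reflects only how the code is written, two snippets that do the same job can land below a perfect match. That is the gap Intel says the neural scoring stage is meant to close.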

Intel said MISIM is currently transitioning from the lab to a demonstration phase in which it would serve as the basis for a code recommendation engine used by developers programming across the company’s heterogeneous computing architectures. Eventually, MISIM could be integrated into Intel’s software development operations.

“I imagine most developers would happily let the machine find and fix bugs for them, if it could – I know I would,” said Intel’s Gottschlich.

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).