Advanced Computing in the Age of AI | Saturday, April 27, 2024

Terra Quantum Unveils TQCompressor, Enhancing Large Language Model Efficiency 

ST. GALLEN, Switzerland, March 7, 2024 -- Terra Quantum, a leading quantum technology company, today announced TQCompressor, an algorithm that shrinks large language models (LLMs) while maintaining comparable performance, addressing the growing resource demands of generative AI models.

The novel compression technique significantly reduces the size of the datasets required for pre-training on targeted tasks compared to other widely used compression methods. A new case study demonstrates the compression of GPT-2 small, achieving a 35% reduction in the number of parameters. The compressed model also demonstrated superior text generation capabilities relative to other prevalent compressed variants of this ChatGPT predecessor, despite employing up to 97% less training data.

In the work, “TQCompressor: improving tensor decomposition methods in neural networks via permutations,” researchers compressed the benchmark model GPT-2 small from 117 million parameters to 81 million and evaluated its performance against other compressed models. When provided with various datasets, including a large collection of Wikipedia articles, the Terra Quantum model produced better results in predicting the next word in a sequence as well as generating coherent text based on contextual understanding.

“This compression algorithm can significantly lower the energy and compute costs of LLMs,” said Markus Pflitsch, CEO of Terra Quantum. “The advancement paves the way for neural network architecture optimizations that streamline GenAI to meet sustainability goals without compromising exceptional performance.”

GPT-2 small, the model compressed in the paper, uses the same underlying language architecture as GPT-2 and ChatGPT. The full GPT-2 has 1.5 billion parameters, while the "small" version has 117 million, making it the smallest of the GPT-2 versions released by OpenAI. Reducing the overall size of these LLMs opens up more use cases.

TQCompressor uses a tensor network technique to restructure connections between neurons of LLMs while preserving structural integrity. TQCompressedGPT-2, now publicly available on Hugging Face, is an advanced neural network model for natural language processing (NLP) tasks that achieves an improvement in efficiency and expressivity over GPT-2.
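To give a sense of why factorized weight structures can shrink a model so sharply, the sketch below is a minimal, hypothetical illustration (not Terra Quantum's actual code): it replaces a dense GPT-2-sized weight matrix with a Kronecker product of two small factors, a common tensor-network-style decomposition, and compares parameter counts. The factor shapes are assumed purely for the demo.

```python
import numpy as np

# Hypothetical illustration: a dense weight matrix W with m*n parameters is
# replaced by two small Kronecker factors A and B whose product reconstructs
# a matrix of the same shape, but with far fewer stored parameters.
m1, n1, m2, n2 = 32, 32, 24, 24      # assumed factor shapes for the demo
A = np.random.randn(m1, n1)
B = np.random.randn(m2, n2)
W = np.kron(A, B)                    # dense equivalent: (m1*m2) x (n1*n2) = 768 x 768

dense_params = W.size                # 768 * 768 = 589,824 parameters
factored_params = A.size + B.size    # 32*32 + 24*24 = 1,600 parameters
print(W.shape, dense_params, factored_params)
```

The 768x768 shape matches GPT-2 small's hidden dimension; the point is only that storing the two factors instead of the dense matrix cuts the parameter count by orders of magnitude, which is the kind of trade-off the permutation step in TQCompressor is designed to make less damaging to expressivity.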

“When neural networks are compressed, they often lose expressivity – the ability to capture and represent complex patterns and relationships in data,” said Aleksei Naumov, AI Engineer at Terra Quantum and lead author of the paper. “Our optimization of the neural network enables a more effective compression process that mitigates loss in the model’s expressivity to deploy the AI model efficiently and effectively.”

The model also outperforms other GPT-2 compressions in perplexity scores, which measure how well a language model predicts the next token in a sequence (lower is better), Naumov said. TQCompressedGPT-2 achieved better scores than popular compressed models like DistilGPT-2 across all benchmarking datasets.
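For readers unfamiliar with the metric, perplexity is just the exponential of the average negative log-likelihood a model assigns to the correct next tokens. The toy sketch below (with made-up probabilities, not results from the paper) shows the calculation:

```python
import math

# Toy example of computing perplexity: the probabilities below are invented
# values a model might assign to each true next token in a short sequence.
token_probs = [0.25, 0.10, 0.50, 0.05]
nll = [-math.log(p) for p in token_probs]          # per-token negative log-likelihood
perplexity = math.exp(sum(nll) / len(nll))         # exp of the average NLL
print(round(perplexity, 2))
```

A model that assigned probability 1.0 to every correct token would score a perplexity of 1, the theoretical minimum, which is why lower scores indicate better next-token prediction.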

The time, computation and energy resources required by these large NLP models continue to rise, particularly for training. Some researchers estimate that the carbon emissions from training a popular model exceed those of six passenger jet flights between San Francisco and New York. If every Google search integrated LLM-generated results, the annual electricity required could equal the consumption of Ireland.

Quantum-inspired techniques are a potential solution to this problem. TQCompressor reduced the training requirements of the model to prove the potential of tensor networks to streamline machine learning applications and develop more efficient LLMs, according to the company’s researchers.

Further research could explore the application of Terra Quantum’s compression techniques for larger use cases such as ChatGPT.

Generative AI offers the potential to transform industries across finance, healthcare, education and many more, and its impact can be further amplified with quantum computing. Quantum-inspired tensor network methods like TQCompressor open the door to transforming AI and NLP.

About Terra Quantum

Terra Quantum Group is a leading quantum technology company based in Germany and Switzerland. It provides "Quantum as a Service (QaaS)" in three core areas, the first being "Quantum Algorithms as a Service." Here, customers are provided access to an extensive library of algorithms, such as hybrid quantum optimization and hybrid quantum neural networks, which can be used for solving complex logistics problems or pattern recognition, among other things. Terra Quantum also develops new quantum algorithms for its customers or adapts existing algorithms to their specific needs. Secondly, through "Quantum Computing as a Service," Terra Quantum offers its customers access to its proprietary high-performance simulated quantum processing units (QPUs), as well as to the quantum ecosystem's physical QPUs, while also developing native QPUs. The third division is "Quantum Security as a Service," through which Terra Quantum offers its unique solutions for secure quantum and post-quantum communications worldwide.


Source: Terra Quantum

EnterpriseAI