Predicting the Financial Market with Large Language Models
Large Language Models (LLMs) are artificial neural networks with millions or billions of parameters, trained on massive amounts of data using self-supervised or semi-supervised learning to understand and generate text. The financial industry has started to use these tools for a variety of purposes, including stock market prediction, financial education, economic advisory, trading strategies, sentiment analysis, and risk management. Following the technological advances brought on by ChatGPT, BloombergGPT and FinGPT were developed specifically for the finance sector. All three LLMs have the potential to make an impact on the financial sector.
Two professors from the University of Florida’s Department of Finance argue that using advanced LLMs in the financial industry can yield more accurate stock market predictions and improve trading strategies. In their study, the authors used ChatGPT to “predict stock market returns using sentiment analysis of news headlines.” They found that ChatGPT performed better than models such as BERT, GPT-1, and GPT-2, and concluded that only more advanced models like ChatGPT can analyze large amounts of data well enough to predict the stock market.
ChatGPT is an LLM based on the generative pre-trained transformer (GPT) architecture. It was introduced in November 2022 by OpenAI, an AI research and deployment company. According to the authors, “the GPT architecture uses a multi-layer neural network to model the structure and patterns of natural language. Using unsupervised learning methods, it is pre-trained on a large corpus of text data, such as Wikipedia articles or web pages.” For this study, the authors used a dataset combining daily returns from the Center for Research in Security Prices (CRSP), news headlines, and RavenPack.
The results of the study highlight the potential of ChatGPT as a tool for the financial industry in predicting the stock market based on sentiment analysis, the authors said. They also note that more studies are needed.
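To make the idea of headline-based sentiment signals more concrete, here is a minimal sketch in Python. It is not the authors' method. It assumes each headline has already been labeled positive, negative, or neutral by some model, and it uses a hypothetical scoring scheme, hypothetical tickers, and a made-up trading rule purely for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical label-to-score mapping; an assumption, not taken from the study.
LABEL_SCORES = {"positive": 1.0, "negative": -1.0, "neutral": 0.0}


def daily_sentiment(labeled_headlines):
    """Average the headline scores for each (date, ticker) pair."""
    buckets = defaultdict(list)
    for date, ticker, label in labeled_headlines:
        buckets[(date, ticker)].append(LABEL_SCORES[label])
    return {key: mean(scores) for key, scores in buckets.items()}


def position(score):
    """Illustrative rule: long above zero, short below zero, flat at zero."""
    if score > 0:
        return "long"
    if score < 0:
        return "short"
    return "flat"


# Made-up example data; the tickers and labels are placeholders.
sample = [
    ("2023-01-03", "AAA", "positive"),
    ("2023-01-03", "AAA", "neutral"),
    ("2023-01-03", "BBB", "negative"),
]

signals = daily_sentiment(sample)
for (date, ticker), score in sorted(signals.items()):
    print(date, ticker, score, position(score))
```

In a real evaluation, the resulting signal would be compared against subsequent daily returns, which is the step where a dataset like the one described above would come in.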
In March 2023, Bloomberg released its own LLM dubbed BloombergGPT, a 50-billion parameter LLM specifically developed for the financial industry. The proprietary BloombergGPT was trained on a 363 billion token dataset drawn from Bloomberg’s data sources, along with 345 billion tokens from general-purpose datasets, according to a research paper published by Bloomberg.
Researchers validated BloombergGPT on finance-specific natural language processing (NLP) benchmarks as well as Bloomberg’s own suite of internal benchmarks. They found that BloombergGPT performed the best when compared with LLMs such as GPT-NeoX, OPT66B, BLOOM176B, and GPT-3. Table 1 shows BloombergGPT performance scores across two broad categories of NLP tasks: finance-specific and general-purpose.
“The quality of machine learning and NLP models comes down to the data you put into them,” said Gideon Mann, Head of Bloomberg’s ML Product and Research team. “Thanks to the collection of financial documents Bloomberg has curated over four decades, we were able to carefully create a large and clean, domain-specific dataset to train an LLM that is best suited for financial use cases. We’re excited to use BloombergGPT to improve existing NLP workflows, while also imagining new ways to put this model to work to delight our customers.”
Unlike BloombergGPT, which is built on proprietary data, FinGPT is an open-source LLM that was also developed specifically for the financial industry. FinGPT is described as an AI-powered financial consultant, released in March 2023 by Finblox, a crypto trading app backed by Dragonfly and Sequoia. The group’s goal is to democratize LLMs in the finance sector.
"Our mission is to empower users with the knowledge and tools to take control of their financial future," said Peter Hoang, chief executive officer of Finblox. "We are dedicated to making financial literacy and inclusion accessible to everyone. With its user-friendly interface and personalized recommendations, FinGPT represents a significant step towards creating a more inclusive and engaging financial ecosystem."
A team of researchers from Columbia University and New York University (Shanghai) argues that FinGPT can provide access to the resources that researchers and users need to develop LLMs for the financial industry. FinGPT’s dataset is pulled from financial news, social media, filings, trends, and academic datasets. FinGPT takes a data-centric approach and embraces a full-stack framework. Two associated code repositories are publicly available on GitHub.