Deep Learning Drives Global Financial Institution ‘to Gain Every Little Cent’
It may be true that data scientists occupy “the sexiest job of the century,” but it’s also true they’re under tremendous pressure to deliver on their rarefied skills, knowledge and pay. We recently spoke (on condition of anonymity) with a data scientist at a North American financial institution, a resource-rich company implementing AI at enterprise scale, and his comments show how Wall Street firms view machine learning as a critical strategic weapon to drive profits and efficiencies.
“There’s a massive drive at all financial institutions, especially here, to drive efficiencies, for us to gain every little cent across the board,” he told us. “…It’s part of our internal KPIs (key performance indicators), to find implementable opportunities for efficiency gains in terms of how we perform. This is part of the master goal of the organization.”
Once an “implementable opportunity” has been identified, he and his team of 20 go about gathering data and preparing it for training deep neural networks to perform the task at hand. For that they use hardware from IBM: Power9 AC922 servers, which combine Nvidia Tesla GPUs with IBM CPUs and the high-bandwidth Nvidia NVLink interconnect.
“The machine itself is very powerful,” he said. “It’s faster than an x86 system because of the way the GPUs connect to the rest of the machine. That connection, the transfer of data between the GPUs and the CPU and the RAM memory, it’s much faster with the NVLink interconnect, so that speeds up the whole thing.”
They also use IBM’s Watson Machine Learning Accelerator software, a combination of open source deep learning frameworks and development and management tools.
“It comes with all the nicely certified open source packages we’re used to using in machine learning and deep learning – TensorFlow, PyTorch, all that stuff, but it comes highly optimized,” he said. “And what’s best really is inside, the software layer … the whole framework, they do everything inside (Watson ML Accelerator), so even parallelizing or distributing the computation between different GPUs within that machine or a cluster of those machines, it becomes basically seamless for the developer.”
The integration and optimization, he said, eliminate much of the coding that would otherwise be required, coding that is prone to bugs and means “more time spent on IT tasks rather than data science tasks before you can accomplish what you need to do. So there’s that extra-added help from the Watson Accelerator framework that comes already with the Power systems.”
IBM said Watson ML Accelerator is designed to support models of greater complexity and data sets of greater size. With NVLink, the company said, entire models and datasets of up to almost 1 terabyte can be loaded into system memory and cached down across four GPUs within a single server.
Over the past several months, he and three of his staff have aimed deep learning at classification of operational risk incidents, which arise from routine, day-to-day bank activities and could result in future liabilities. They range from checks incorrectly cleared to a wrong order punched into a trading terminal – any mistake or problem that may lead to later losses from lawsuits or regulatory fines. Each risk incident needs to be logged into a system and then correctly classified by category.
“This is important because the category that they’re classified under, that will affect the amount of capital we have to hold against potential losses in the future,” he said. Called regulatory capital, it’s a cash set-aside, a provision to cover potential losses. If too little cash is set aside, the institution could have difficulty covering higher-than-expected penalties – while too much cash takes money away that could otherwise be used to conduct bank business.
“It’s a classification problem in the end, a typical machine learning problem,” he said. “The data is text, so there’s natural language processing of unstructured data, reading the text and trying, depending on the sentences in each incident description, to find the right category, which also is a typical sort of machine learning type of activity that you see a lot of.”
High-powered GPU compute is required to train such a system based on historical risk incident data accumulated over decades at the institution, resulting in “thousands and thousands of data points,” he said. Using TensorFlow, he and his team built a system whose priority isn’t so much speed as accuracy. After all, risk incidents don’t happen in high volume; it’s getting the classification right that counts most.
Along with AI, the team also trialed less powerful “commonly used ad hoc approaches involving word counts and word combinations,” he said, to test against the results of the deep learning system. But the ad hoc approaches delivered only 60 to 70 percent accuracy, which he dismissed as “closer to random,” while the deep learning functionality delivers 90 percent accuracy. “And that’s only using sample training data, test data, so that’s a massive gain.”
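For readers unfamiliar with this kind of baseline, the following is a minimal sketch of a classifier built only on word counts and word combinations. It is an illustration, not the team's method: the article does not say how their baseline was built, and the category names, sample descriptions and scoring rule (smoothed log frequencies, with no class priors) are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Return single words plus adjacent word pairs from a description."""
    words = text.lower().split()
    return words + [f"{a} {b}" for a, b in zip(words, words[1:])]


def train(examples):
    """Count how often each feature appears under each category."""
    counts = defaultdict(Counter)
    for text, category in examples:
        counts[category].update(features(text))
    return counts


def classify(counts, text):
    """Pick the category whose feature counts best match the text.

    Scores are sums of add-one smoothed log frequencies, so a feature
    never seen in a category lowers its score without ruling it out.
    """
    vocab = {f for feats in counts.values() for f in feats}
    best, best_score = None, -math.inf
    for category, feats in counts.items():
        total = sum(feats.values()) + len(vocab)
        score = sum(math.log((feats[f] + 1) / total) for f in features(text))
        if score > best_score:
            best, best_score = category, score
    return best


# Invented toy data: hypothetical incident descriptions and category labels.
EXAMPLES = [
    ("check cleared for the wrong amount", "payment_error"),
    ("duplicate check cleared on customer account", "payment_error"),
    ("wrong order entered into trading terminal", "trading_error"),
    ("trader entered order with wrong quantity", "trading_error"),
]

model = train(EXAMPLES)
prediction = classify(model, "check cleared twice")
```

A method like this sees only surface vocabulary, which is one plausible reason such approaches struggle when differently worded descriptions belong to the same category.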
The system is coming out of prototype and soon will go through final testing, followed by internal presentations for management approval before going into production.
He said the risk incident classification system shares characteristics with other ML projects the organization is working on, such as rating of bonds not yet assessed by the rating agencies. Essentially loans in the form of securities, bonds are rated from AAA to junk based on default risk. “Some loans aren’t rated, so … we end up being a bit clueless to what level of risk should be assigned, which also impacts regulatory capital, because every time you take a risk we have to assign regulatory cap to cover potential losses.”
As with risk incident assessment, massive amounts of data from multiple sources are aggregated and processed. “So this functionality would be also classifying, or rating, the bonds not yet rated by rating agencies, using the data we have and deep learning architectures – that’s another one in the making.”
The data science group is also working on detection of fraud: “That’s a big data play, like a terabyte of data per quarter that comes out, basically finding financial crimes, money laundering, and so forth, in the billions of transactions from client accounts, that’s another important project as well.”