Advanced Computing in the Age of AI | Tuesday, March 19, 2024

Exponential: Today and Tomorrow’s Expansive AI-Big Data-HPC Landscape 

Jay Boisseau, Dell EMC’s chief HPC technology strategist, delivered an incisive keynote address at Tabor Communications’ Advanced Scale Forum this week. He covered too much ground to summarize in a few words. Let's just say he offered a comprehensive, high-level view of big data, AI and HPC, and of where everything is going. Here are excerpts:

When did we get big data? Scientists will tell you they’ve always been able to compute an incredible amount of output data, simulation data. But we really got big data when we started recording, measuring and collecting lots of input data about the world around us. …We didn’t really start using the phrase big data until the enterprise embraced it, not when science could produce it synthetically, but when the enterprise had access to big data.

So among the new big trends after HPC, big data and cloud, the two most exciting to me are AI, particularly a subfield of AI called deep learning, and IoT. …These leverage and build on HPC and big data in ways I’ll talk about….

The history of supercomputing and high performance computing, most of it is simulation-based science. It’s really been in the last decade that we’ve seen the onset of HPDA and mostly through a community like this, around enterprise…

Now we’re in the era in which almost all HPC is scalable on big clusters. In fact, data centers are starting to look like HPC systems. I need to give Jimmy Pike (SVP and Senior Fellow, Dell EMC) credit for saying this to me a couple of years ago, and I pooh-poohed it, saying “we love HPC but let’s not say everything is HPC.” When Jensen Huang and Nvidia bought Mellanox, he said at GTC that data centers are starting to look like HPC systems and they need an HPC fabric for it. So Jimmy should get credit for that observation, though Jensen made it more famous by saying it at GTC….

On this slide, the image on the right, it’s a really big number and it’s exponential growth. And we take exponential growth for granted a lot. People don’t think in terms of exponentials. We can recognize it in math and identify it in a curve when we see it, but we tend to think tomorrow will be like today, or tomorrow will be incrementally better than today; we don’t really subconsciously think about the power of exponentials, even in computers…. These exponentials are really powerful things and we see this in data growth.

These exponentials are not always forever, right? Smart phone growth has leveled off; it had a big exponential for a while, but some of them have natural ceilings. How many smart phones are there going to be in the world? One per person? That’s a natural ceiling, so you might get exponential growth for a while, then it levels out.
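The difference between growth that runs forever and growth that hits a natural ceiling can be sketched with a logistic curve, which tracks an exponential early on and then flattens. This is an illustration only, not something shown in the keynote; the function names and every number below (starting value, growth rate, ceiling) are assumptions chosen to make the shape visible.

```python
import math

def exponential(t, n0, rate):
    """Unbounded exponential growth: n0 * e^(rate * t)."""
    return n0 * math.exp(rate * t)

def logistic(t, n0, rate, ceiling):
    """Logistic growth: looks exponential at first, then flattens at `ceiling`."""
    return ceiling / (1 + (ceiling / n0 - 1) * math.exp(-rate * t))

# Hypothetical values, chosen only for illustration.
N0 = 1.0          # units at year 0
RATE = 0.5        # growth rate per year
CEILING = 1000.0  # natural ceiling, e.g. one device per person

for year in range(0, 31, 5):
    exp_value = exponential(year, N0, RATE)
    log_value = logistic(year, N0, RATE, CEILING)
    print(f"year {year:2d}: exponential={exp_value:12.1f}  logistic={log_value:7.1f}")
```

In the early years the two columns are nearly identical, which is why a curve with a ceiling can be mistaken for one without.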

We don’t see this happening in data any time soon, possibly ever, because there are so many sources of this data as well as the capability of each of those sources to make more data….

…AI (is) in the realm of computers, the internet, space flight and, more recently, there’s talk about CRISPR, gene editing. It’s in that small class of things that will fundamentally transform society. Our own founder (Michael Dell) really believes the next 30 years will make the past 30 look like child’s play, and that’s the power of exponentials….

(Several) factors all contribute to a steep AI market projection. This is an exponential, too, but is this exponential going to happen forever? Of course not… IT budgets aren’t going to increase exponentially. (AI) is much less than the overall IT spend and that’s why it’s an exponential. At some point it has to level out, assuming CIO budgets aren’t going to increase exponentially forever….

An AI learning system is much harder than a database and it requires having access to more data and breaking down some of the historical silos that were put in place for very good reasons in the era of database reports. But now you want to break down those silos. We have a great example at Dell of trying to predict possible hard drive failures in enclosures and they had to collect data from sales databases and shipping logistics databases as well as maintenance databases and other customer databases – and put it together into a big deep learning model. No one part of Dell had ever had to assemble that into such a model before. But it really gave them a great capability for identifying some possible issue scenarios that they would never have identified – that no human would have ever actually identified – so deep learning often augments our capabilities but sometimes exceeds them.

So I would say what we’re seeing is a confluence of workloads and what we want is a converged system to enable those workloads….

AI is a workload; AI is solving certain kinds of problems. Not all AI workloads require HPC. I have a $30 camera in my condo right now that can do object detection and motion tracking. It’s got an AI-trained model in it that is inferencing. It’s not doing the training part, which requires HPC; the inferencing part does not….

AI is a big field and there are many approaches to AI.… Expert systems were basically rules-based attempts at AI. Machine learning is a data-driven approach to AI where you let the model determine the features and rules from the data that it gets…. Deep learning uses a very sophisticated set of algorithms, these neural network-based algorithms, to get really sophisticated answers….

Machine learning is condensing data into a high dimensional probability model and you might be using it for classifying. Is that a cat in that image? Is that a dog…? You might be doing inferencing or judgement or even prediction because these AI techniques can find patterns in data, and one of the variables in the data might be time dependence.

You could look at a bunch of scans of pre-tumorous images, some of which evolved into tumors, and train an AI method to predict what might evolve into a tumor, and you can actually get that to be better than a human. It’s really a two-phase process – you have training data, which you do a massive amount of number crunching on, and then you get this little tiny model at the end for inferencing….
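The two phases can be sketched in a few lines. The example below swaps in a nearest-centroid classifier, which is far simpler than a deep neural network and is not the method described in the talk; it is used only because it shows the same shape: training reduces all the labeled data to a small model, and inference needs only that model. The function names `train` and `infer`, the labels, and the data points are all made up for illustration.

```python
def train(samples):
    """Training phase: reduce labeled (features, label) pairs to one centroid per label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    # The "model" is just the mean feature vector for each label.
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def infer(model, features):
    """Inference phase: return the label whose centroid is closest to `features`."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))

# Made-up two-feature training data, for illustration only.
training_data = [
    ((1.0, 1.2), "cat"), ((0.8, 1.0), "cat"), ((1.2, 0.9), "cat"),
    ((3.0, 3.1), "dog"), ((3.2, 2.9), "dog"), ((2.8, 3.3), "dog"),
]

model = train(training_data)          # the heavy step, done once
print(infer(model, (2.9, 3.0)))       # the light step, done per new input
```

However large the training set grows, the model here stays at one small vector per label, which is why inference can run on modest hardware.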

Deep neural networks are really powerful for learning about features in data, and this is important for things that the classical machine learning methods are not yet great at. They are very good at so many problems – for finding fine details in data, including ones for which you don’t have rules and you want to determine the rules as well as the features. You use deep neural networks for that…. Deep neural networks allow you to find features in data and even determine the rules about those features. It is why I could teach any of you to find a cat in images even if you've never seen mammals before; I could probably teach you to do it in about 20-some images, the psychologists say. For AI, it may take 10,000 images, but eventually it will have enough rules about those images to determine what it thinks is a cat.

So this is why you need big computing and deep learning. Modeling and simulation will forever be important. The laws of nature are not going anywhere. If you’re doing anything that depends on physics, anything that’s structural or depends on fluids or gases or electromagnetism or any of this, you want to understand the science; you don't want to always make statistical guesses. You’d like to actually know from first principles what is going to happen: how do I design that vehicle or that system? But you also must embrace analytics and consume machine learning and deep learning. There are many things for which we do not have an appropriate first-principles model.

Dell EMC's Jay Boisseau

What color shoes are teens going to buy next quarter? There is no Maxwell equation for that. There are humans that are pretty good at fashion and trends and trying to make predictions, but you can also mine social media data and see what's becoming hot, and look at the duration of trends over history and try to estimate how long fashion trends may last and so on. You can look at marketing data and see how advertising changed in the past and make projections on where to inject those advertising dollars for the maximum chance of things going viral…. So analytics, machine learning and deep learning are all so important….

When you do your shopping for OEM systems in your data center, you compare all the options…. You should be able to do that with clouds. Now we all know that clouds are really good at providing APIs that give stickiness and to some degree lock-in – and at least one of them charges you to get the data out, so they're not really interested in you comparing them against each other; they are interested in competing against on-prem and locking you into that choice. We think it should all be choice and so we really believe hybrid multi-cloud is the goal for HPC as well as enterprise IT.

So we want to make sure you have all options available. …You want to have the data and analytics to know when to buy what and when to use what. This is actually a more complicated thing, and I talk to customers all the time who think that cloud is cheaper or on-prem is cheaper – and they haven't really done the math. If you do the math, on-prem will always be cheaper for something fully utilized. But cloud, public cloud, can be cheaper for something you don’t fully utilize. Public cloud can give you elasticity; on-prem gives you control – they all have advantages…. Too often we see somebody say something like: a $10 million computer is more expensive when I can get this for 10 cents per hour. But they haven’t done the math to even compare the units. We want to make sure CIOs and CFOs have the tools for proper planning….
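Comparing the units means putting both options in dollars per node-hour actually used. The sketch below is a deliberately simplified model, not a Dell tool or method: it ignores data egress fees, discounts, staffing and financing, and every figure in it (purchase price, operating cost, lifetime, node count, cloud rate) is a hypothetical placeholder rather than a number from the talk.

```python
HOURS_PER_YEAR = 24 * 365

def on_prem_cost_per_used_hour(purchase_price, annual_operating_cost,
                               lifetime_years, nodes, utilization):
    """Lifetime cost of an owned system divided by the node-hours actually used."""
    total_cost = purchase_price + annual_operating_cost * lifetime_years
    used_node_hours = nodes * HOURS_PER_YEAR * lifetime_years * utilization
    return total_cost / used_node_hours

def break_even_utilization(purchase_price, annual_operating_cost,
                           lifetime_years, nodes, cloud_price_per_node_hour):
    """Utilization at which on-prem and cloud cost the same per used node-hour."""
    total_cost = purchase_price + annual_operating_cost * lifetime_years
    full_node_hours = nodes * HOURS_PER_YEAR * lifetime_years
    return total_cost / (full_node_hours * cloud_price_per_node_hour)

# Hypothetical inputs, for illustration only.
PURCHASE_PRICE = 10_000_000     # dollars, up front
ANNUAL_OPERATING = 1_000_000    # dollars per year (power, cooling, support)
LIFETIME_YEARS = 5
NODES = 1000
CLOUD_RATE = 0.50               # dollars per node-hour, pay only for hours used

for utilization in (0.25, 0.50, 0.75, 1.00):
    cost = on_prem_cost_per_used_hour(PURCHASE_PRICE, ANNUAL_OPERATING,
                                      LIFETIME_YEARS, NODES, utilization)
    cheaper = "on-prem" if cost < CLOUD_RATE else "cloud"
    print(f"utilization {utilization:4.0%}: on-prem ${cost:.3f}/node-hour "
          f"vs cloud ${CLOUD_RATE:.3f}/node-hour -> {cheaper}")

threshold = break_even_utilization(PURCHASE_PRICE, ANNUAL_OPERATING,
                                   LIFETIME_YEARS, NODES, CLOUD_RATE)
print(f"break-even utilization: {threshold:.1%}")
```

With these placeholder inputs the owned system wins only above the break-even utilization; with different inputs the threshold moves, which is the reason to run the numbers rather than compare a purchase price against an hourly rate.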

IoT is really about outcomes. It’s not about engines, it’s about measuring thrust. It’s not about thermostats, it’s about determining comfort levels. It really does drive results. Domino’s revenue went up 6,000 percent over a decade after they embraced IoT, including that little pizza tracker app. They put their app on smart TVs and stuff; they even made it so you could tweet a pizza order. So they really embraced IoT and increased their market share from 8 percent to 17 percent and became the number one pizza distributor, well ahead of Pizza Hut, in large part by embracing IoT.

What we need to do is invest in industry-level interoperability and you see this logo here for EdgeX Foundry. This is Dell’s approach to it and an increasing number of other partners’ approaches. It is not a Google or Azure or Amazon IoT-to-cloud system. It is a Linux Foundation-supported open source project that any company or nonprofit can join, and we're hoping it becomes the way to prevent lock-in and truly enable an internet of things, not an internet of Amazon things and an internet of Google things and an internet of vendor X’s things….

AI will have no more winters. People talk about an (upcoming) AI winter because it’s come in spikes over the last few decades. The only reason there were spikes is that there were spikes in academic, productive research, but (there was) always a lack of data and computing power to fulfill it until now. There’s no more lack of data and computing power – we see lots of successes. I truly don’t think there will be any more AI winters. That doesn’t mean that we won’t develop even better techniques that may replace even the current AI techniques, but there won’t be a lack of opportunity to use AI.

And finally, this really is the beginning of the opportunity. This meeting has been a prescient meeting…in helping companies accelerate into using those technologies, but I would argue we are still in the early days. Based on the conversations we’ve had here, many people know about one or two of these technologies but not all of them – but it’s really a case where the whole is going to be greater than the sum of its parts.

It’s good to start accumulating this expertise, do pilot projects and understand how each of these technologies can help because we live in an era of rapid technology change. The rate of change is increasing. The number of hours in your day isn’t. That means your competitors can get ahead of you more easily now because things are changing so much faster. So it’s more important than ever to stay on top of this.

EnterpriseAI