Exascale Supercomputers and AI Self-Driving Cars
By Lance Eliot, the AI Trends Insider
Supercomputers are great. If you are really into computing and computers, you've got to admire and be fascinated by supercomputing. It is similar to really being into cars and keeping up with the fastest sports cars that push the limits of technology in one way or another. Just as most of us cannot afford those high-priced souped-up roadsters that cost a big wad of cash, the same can be said about supercomputers. The only players in supercomputers are those that can plunk down tons of dough, at least among those that own supercomputers outright, such as huge companies or national agencies (I'll say more about using rather than buying them later on herein).
One of my favorite supercomputers was the Cray-1. It was brought to the world by computer inventor extraordinaire Seymour Cray in the mid-1970s and ran at an astounding 200 MFLOPS (M is for Mega, FLOPS is for floating point operations per second). This was super-fast at the time. A popular insider joke of the era was that the Cray-1 was so fast it could complete an infinite loop in less than one second.
For those of you who haven't fallen to the floor in laughter, the joke is that an infinite loop presumably would never end, and so the Cray-1 was so tremendously fast that it could even finish an infinite loop. Ha! By the way, you can pretty much use that joke still today and just mention the name of a more contemporary supercomputer (you'll be the life of any party).
The current reigning champ of supercomputers is the Summit supercomputer at Oak Ridge National Laboratory (ORNL). In June of this year (2018), the Summit was crowned the fastest supercomputer and placed at the top of the classic and ever-popular Top500 list (a listing of the top supercomputers ranked by speed; it is fun to keep tabs on who makes the list and what their rank is). Similar to chess masters and their rankings, anyone into supercomputers knows at least who the top 10 are on the Top500 list and likely has familiarity with at least the top 30 or so.
Summit is rated at about 122.3 PFLOPS (P is for Peta, which is 1,000 million million). In theory, if Summit could just go all out and run at a maximum raw speed, it presumably could do about 200 PFLOPS. As they say, we’ve come a long way, baby – meaning that if you compare the Cray-1 at 200 MFLOPS versus today’s Summit at 200 PFLOPS, the speed difference is like night versus day.
It is said that to be able to do as many calculations per second as can Summit, every person on Earth would need to be able to perform around 16 million calculations per second. Why don’t we try that? Let’s get everyone to stop what they are doing right now, and perform 16 million calculations, doing so in one second of time. Might be challenging.
Maybe one way to grasp the vast growth in speed from the Cray-1 days to the Summit is to consider space rather than time. With storage measured in megabytes, you might be able to keep a few written novels in that amount of space; with petabytes, you could keep perhaps all of the data contained in the United States libraries (please note that's a rough approximation and only intended to suggest the magnitude of the difference).
Let’s consider the prefixes used and the amounts involved:
Mega = 10 ^ 6
Giga = 10 ^ 9
Tera = 10 ^ 12
Peta = 10 ^ 15
Exa = 10 ^ 18
I’m using the symbol “^” to mean “to the power of” and for example the Mega is 1 x 10 to the 6th power, while Giga is 1 x 10 to the 9th power, and so on.
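To make those magnitudes concrete, here's a quick back-of-the-envelope check in Python (my own illustration, using the figures cited in this article):

```python
# Powers-of-ten prefixes from the list above.
PREFIXES = {"mega": 10**6, "giga": 10**9, "tera": 10**12,
            "peta": 10**15, "exa": 10**18}

cray_1 = 200 * PREFIXES["mega"]   # Cray-1, ~200 MFLOPS
summit = 200 * PREFIXES["peta"]   # Summit, ~200 PFLOPS theoretical peak

# The speed ratio works out to a billion-fold increase.
print(f"Summit / Cray-1: {summit // cray_1:,}x")   # 1,000,000,000x
```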
Having the fastest supercomputer is considered at times an indicator of who is “winning” the race in terms of advancing computers and computing.
Right now, the United States holds the top slot with Summit, but for the last several years it was China with their Sunway TaihuLight supercomputer. How long will the United States hang onto the Number 1 position in supercomputers? Hard to say. The official Top500 list is updated twice per year. You’ve got the United States, China, Europe, Japan, and other big players all vying to get onto the list and get to the very top of the list. Some predict that Japan’s Post-K might make the top in 2020, and the United States might reclaim the title in 2021, though China might move back into the top slot during those time frames too (it can be hard to predict because the big players are all in the midst of developing newer and faster supercomputers but the end-date of when they will be finished is often hazy or not revealed).
Is it fair to say that whichever country makes or has the fastest supercomputer is leading the race towards advancing computers? Probably not, but it is an easy way to play that game and one that many seem to believe merits attention (note that the Summit was developed at an estimated cost of around $200 million, which per my point earlier emphasizes that you need a big wallet to make one of these supercomputers).
To benchmark the speed, it is customary to have supercomputers run the famous LINPACK benchmark software. LINPACK was originally a set of Fortran routines for performing linear algebra computations and it eventually became associated with being a benchmark for computer speed (today you might use LAPACK in lieu of LINPACK if you are in need of a library of linear algebra routines). The handiness of the LINPACK benchmark is that it involves the computer doing a "pure calculation" kind of problem, namely solving a system of linear equations. In that sense, it is essentially restricted to the use of straight-ahead floating-point operations, akin perhaps to having a horse run flat-out on a track as fast as it can.
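To give a flavor of what an HPL-style benchmark actually measures, here's a toy single-machine sketch (using NumPy as my stand-in; real HPL is a distributed code running across thousands of nodes): it times a dense solve of Ax = b and converts the elapsed time into FLOPS via the standard operation count.

```python
import time
import numpy as np

n = 2000
rng = np.random.default_rng(0)
A = rng.random((n, n))            # dense random system, HPL-style
b = rng.random(n)

start = time.perf_counter()
x = np.linalg.solve(A, b)         # LU factorization plus triangular solves
elapsed = time.perf_counter() - start

# Standard HPL operation count for an n x n solve: (2/3)n^3 + 2n^2.
flops = (2 / 3) * n**3 + 2 * n**2
print(f"~{flops / elapsed / 1e9:.1f} GFLOPS on this machine")
```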
Some criticize the use of such a benchmark as somewhat off-kilter because supercomputers are likely going to be doing more than relatively simplistic mathematical calculations. Such critics say that it distorts the design of the supercomputer by having the supercomputer makers aim to do maximum FLOPS and not necessarily be able to do other kinds of computer-related tasks very well.
Like it or not, the world seems to have agreed to the ongoing use of the LINPACK benchmark, or more formally the HPL (High Performance LINPACK) variant, which is optimized for this kind of benchmarking. It would be hard to get everyone to switch to some other benchmark, and doing so would also make it difficult to make comparisons to previous rankings. This same kind of argument happens in sports, such as proposals to change some of the rules of football or baseball, which can make prior records no longer relevant and readily usable.
Electric Power Measured in FLOPS Per Watt
One concern that some raise is the vast amount of electrical power often consumed by these supercomputers. The amount of electrical power usage is often expressed in FLOPS per watt (the Summit is about 13.889 GFLOPS per watt). Some believe that the proper ranking of supercomputers should combine the raw speed metric with the electrical power consumption metric, which would perhaps force the supercomputer designers to be more prudent about how much electrical power is being used. Instead, there are really two lists: the raw speed list and the electrical power efficiency list. The glory tends to go toward the raw speed list.
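The two metrics are easy to relate: dividing benchmark speed by efficiency gives the implied power draw. A quick back-of-the-envelope sketch using the figures above:

```python
summit_speed = 122.3e15      # FLOPS (the LINPACK rating noted earlier)
efficiency = 13.889e9        # FLOPS per watt

watts = summit_speed / efficiency
print(f"Implied draw: {watts / 1e6:.1f} megawatts")   # roughly 8.8 MW
```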
This indication about electrical power consumption brings up another significant point, namely that the cost to run a supercomputer is about as astronomical as is the outright price of the supercomputer.
Besides the need for lots of electricity, another noteworthy factor in supercomputer design involves the heat that can build up in the supercomputer. With lots of fast processors comes the generation of heat. The closer you try to put these processors to each other, the more heat you have in a tighter area. You want to put the processors as close as possible to each other so as to minimize the delay times of the processors communicating with each other (the more distance between the processors, the longer the latency times).
So, if you are packing together thousands of processors, doing so to add speed and reduce latency, you also tend to get high heat density. Life is always one kind of a trade-off versus another. One of the most popular cooling methods for supercomputers involves using liquid cooling. It might seem odd to consider putting liquid (of any kind) anywhere near the electrically running processors, but you can nonetheless have tubes of liquid to help bring coolness to the processors and aid in dissipating heat from them. Air cooling is also possible.
The Cray-1 was known for its unusual shape, consisting of a main tower that was curved like the letter “C” and had a concentric bench around it. It was described at the time as the world’s most costly loveseat due to the unique physical design. Looking at it from above, you could see that it had the shape of the letter “C” and was said to be designed in that manner to reduce the distance between the processors and aid the use of the Freon cooling system (note that it was also suggested that Cray liked the notion of his supercomputer spelling out the letter of his last name!). If you’d like to see and touch one of the original Cray-1’s you can do so at the Computer History Museum in Mountain View, California.
Here's a question for you. Which would you prefer: have your supercomputer use lots and lots of off-the-shelf processors, or have it contain lots of specialized processors made specifically for the supercomputer?
That's a big design question for any supercomputer. Using processors that already exist is certainly easier because you don't need to design and build new processors. Instead, your focus becomes how to best hook them up with each other. But you are also then stuck with however good (or bad) those processors are in terms of their individual performance. As Seymour Cray remarked back in the days of the early arguments about whether to use off-the-shelf versus specialized processors (he favored specialized processors), if he were trying to plow a field, he'd rather use 2 oxen than 1,024 chickens.
A slight twist on which processors to use has emerged due to the advent of Graphical Processing Units (GPU's). GPU's were originally developed as processors dedicated to tasks involving graphics display and transformations. They kept getting pushed to be faster and faster to keep up with the evolving desire for clean and fully streaming graphics. Eventually, it was realized that you could make a General Purpose GPU (GPGPU) and consider using those non-traditional processors as the basis for your supercomputer.
Some though say that you ought to go with the stripped-down bare bones kind of processors that can be optimized for pure FLOPS kind of speed. Reduced Instruction Set Computing (RISC) processors arose to take us back to a time when the processor wasn’t overly complex and you could optimize it to do some fundamental things like maximize FLOPS. Perhaps one of the most notable such trends was indicated by the Scalable Processor Architecture (SPARC) that was promulgated by the computer vendor Sun.
Often referred to as High Performance Computing (HPC), supercomputers exploit parallelism to gain their superfast speeds. Massively Parallel Processing (MPP) consists of having a massive number of processors that can work in parallel. One of the great challenges of truly leveraging the parallelism involves whether or not whatever you are computing with your MPP can be divided up into pieces to sufficiently make use of the parallel capability.
If I go to the store to go shopping and have a list of items to buy, I can only go so fast throughout the store to do my shopping. I might optimize my path to make sure that I get each item in a sequence that reduces how far I need to walk throughout the store. Nonetheless, I’m only one person, I believe, and thus there’s only so much I can do to speed-up my shopping effort.
On the other hand, if I added an additional person, we potentially could speed-up the shopping. We could possibly shop in parallel. Suppose though that I had only one copy of the shopping list and we both had to walk around the store together while shopping. Probably not much of a speed-up. If I could divide the shopping list into two parts, giving half to the other person and my keeping half, we now might have a good chance of speeding things up.
If I am not thoughtful about how I divide up the list of shopping items, it could be that the speed-up won’t be much. I need to consider a sensible way to leverage the parallelism. Imagine too if I got three more people to help with the shopping. I’d want to find a means to further subdivide the master list in a sensible manner that tries to gain as much speed-up as feasible via the parallelism.
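Here's a minimal sketch of that dividing-up step (my own illustration, not any particular scheduler's algorithm): the master list is carved into roughly equal sublists, one per helper.

```python
def chunk(items, n_helpers):
    """Divide the master list into n_helpers roughly equal sublists."""
    k, m = divmod(len(items), n_helpers)
    return [items[i * k + min(i, m):(i + 1) * k + min(i + 1, m)]
            for i in range(n_helpers)]

master_list = [f"item-{i}" for i in range(50)]
for helper, sublist in enumerate(chunk(master_list, 4)):
    print(f"helper {helper} gets {len(sublist)} items")   # 13, 13, 12, 12
```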
As such, suppose that you’ve got yourself a supercomputer like the Summit, and it contains over 9,000 22-core CPU’s (IBM Power9’s) and another 27,000+ GPU’s (NVIDIA Tesla V100’s). It takes up an area about the size of two tennis courts, and it uses about 4,000 gallons of water per minute to cool it.
You decide to have it play tic-tac-toe with you.
It seems doubtful that you would need the kind of impressive "hunk of iron" that the Summit is in order to play you in such a simple game. How many processors would you need to use for your tic-tac-toe? Let's say you devote a handful to this task, which is more than enough. Meanwhile, the other processors are sitting around with nothing to do. All those unused processors, all that used-up space, all that cost, all that cooling, and most of the supercomputer is just whistling Dixie while you are playing it in tic-tac-toe.
The point being that there’s not much value in having a supercomputer that is superfast due to exploiting parallelism if you are unable to have a problem that can lend itself to utilizing the parallel architecture. You can essentially render a superfast computer into being a do-little supercomputer by trying to mismatch it with something that won’t scale-up and use the parallelism.
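This limit is classically formalized as Amdahl's law (my framing, not a term the Top500 folks insist upon): if only a fraction p of a task can run in parallel, then n processors can speed it up by at most 1 / ((1 - p) + p / n). A quick sketch with made-up fractions:

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Best-case speedup on n processors when fraction p parallelizes."""
    return 1.0 / ((1.0 - p) + p / n)

# A tic-tac-toe-like task with almost nothing to parallelize...
print(f"{amdahl_speedup(0.05, 27_000):.2f}x")   # ~1.05x
# ...versus a climate-model-like task that parallelizes almost entirely.
print(f"{amdahl_speedup(0.999, 27_000):.0f}x")  # ~963x
```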
What to Do With a Supercomputer
What kind of uses can a supercomputer be sensibly put to? The most common uses include doing large-scale climate modeling, weather modeling, oil and gas exploration analysis, genetics analysis, etc. Each of those kinds of problems can be expressed into a mathematical format and can be divided up into parallel efforts.
Sometimes such tasks are considered to be “embarrassingly parallel,” which means that they are ready-made for parallelism and you don’t need to go to a lot of work to figure out how to turn the task into something that uses parallelism. I am not trivializing the effort involved in programming these tasks to use the parallelism and only suggesting that sometimes the task presents itself in a manner that won’t require unimaginable ways of getting to a parallel approach. If you don’t like the use of the word “embarrassingly” then you can substitute it with the word “pleasingly” (as in “pleasingly parallel” meaning the task fits well to being parallelized).
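The classic textbook instance is Monte Carlo estimation of pi: every random sample is fully independent of every other, so the task arrives ready-made for parallel division. A small sketch of mine:

```python
import random
from multiprocessing import Pool

def count_hits(n_samples: int) -> int:
    """Count random points in the unit square that land inside the circle."""
    rng = random.Random()
    return sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
               for _ in range(n_samples))

if __name__ == "__main__":
    chunks = [250_000] * 4                  # four fully independent chunks
    with Pool(processes=4) as pool:         # no coordination needed mid-task
        hits = sum(pool.map(count_hits, chunks))
    print("pi is roughly", 4 * hits / sum(chunks))
```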
Whether you use RISC or GPGPU's or anything conventional as your core processor, there are some critics of this "traditionalist" approach to supercomputers who say we've got to pursue the whole supercomputer topic in an entirely different way. They ask a simple question: do humans think by using FLOPS? Though we don't yet really know how the human brain works, I think it is relatively fair to assert that humans probably do not use FLOPS in their minds.
For those of us in the AI field, we generally tend to believe that aiming at neurons is a better shot at ultimately trying to have a computer that can do what the human brain can do. Sure, you can simulate a neuron with a conventional FLOPS-oriented processor, but do we really believe that simulating a neuron in that manner will get us to the same level as a human brain? Many of the Machine Learning (ML) and Artificial Neural Network (ANN) advocates would say no.
Instead, it is thought that we need to have specialized processors that act more like neurons. Note that they are still not the same as neurons, and you can argue that they are once again just a simulation of a neuron, though the counter-argument is yes that’s true, but they are closer to being like a neuron than conventional processors are. You are welcome to go back-and-forth on that argument for about five minutes, if you wish to do so, and then continue ahead herein.
These neuron inspired supercomputers are typically referred to as neuromorphic supercomputers.
Some exciting news occurred recently when the University of Manchester announced that their neuromorphic supercomputer now has 1 million processors. This system uses the Spiking Neural Network Architecture known as SpiNNaker. They were able to put together a model that contained about 80,000 "neurons" and around 300 million "synapses" (I am putting quotes around the words neuron and synapse because I don't want to conflate the real biological wetware with the much less equivalent computer simulated versions).
It is quite exciting to see these kinds of advances occurring in neuromorphic supercomputers and it bodes well for what might be coming down the pike. The hope is to aim for a model with 1 billion "neurons" in it.
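To give a flavor of the kind of unit such machines simulate in vast numbers, here's a minimal leaky integrate-and-fire "neuron" in Python. This is a generic textbook model of a spiking neuron, not SpiNNaker's actual implementation:

```python
def simulate_lif(inputs, dt=1.0, tau=20.0, v_rest=0.0, v_thresh=1.0):
    """Leaky integrate-and-fire: leak toward rest, integrate input, spike."""
    v, spike_times = v_rest, []
    for t, current in enumerate(inputs):
        v += dt * (-(v - v_rest) / tau + current)   # leak + integration
        if v >= v_thresh:                           # threshold crossed
            spike_times.append(t)
            v = v_rest                              # reset after the spike
    return spike_times

# A steady input current produces periodic spiking.
print(simulate_lif([0.08] * 100))   # spikes roughly every ~20 time steps
```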
Just to let you know, and I am not trying to be a party pooper on this, but the human brain is estimated to have perhaps 100 billion neurons and maybe 1 quadrillion synapses. Even once we can get a 1 billion "neurons" supercomputer going, it will still only represent perhaps 1% of the total number of neurons in a person's head. Some believe that until we are able to reach nearer to the 100 billion mark, we will not be able to do much with the lesser number of simulated neurons. Perhaps you need a certain critical mass of neurons before intelligence can emerge.
Though we might not be able to approach soon the simulations needed for “recreating” human-like minds, we can at least perhaps do some nifty explorations involving other kinds of creatures.
A lobster has about 100,000 neurons, while a honey bee has about 960,000, and a frog around 16,000,000. A mouse has around 71,000,000 neurons and a rat about 148,000,000. A dog has around 2 billion neurons, while a chimpanzee has about 28 billion. Hopefully, we can begin to do some interesting explorations of how the brain works via neuromorphic computing for these creatures. But, be forewarned, using only the count of neurons is a bit misleading and there’s a lot more involved in getting toward “intelligence” that exists in the minds of any such animals.
There's another camp or tribe in the processor design debate that argues we need to completely rethink the topic and pursue quantum computers instead of the other ways of approaching the matter.
If we can really get quantum superposition and entanglement to do our bidding (key structural elements of quantum computers, which are only being achieved in research labs and experimentally right now), it does appear that some incredible speed-ups over "classical" computing can be had. The quantum advocates are aiming to achieve "quantum supremacy" over various aspects of classical computing. For now, it's worthwhile to keep in mind that Albert Einstein called quantum entanglement "spooky," and so the race to create a true quantum computer might bring us closer to understanding mysteries of the universe such as the nature of matter, space, and time, if practical quantum computers can be achieved.
In a future posting, I’ll be covering the topic of quantum computers and AI self-driving cars.
In terms of conventional supercomputers, the race currently is about trying to get beyond petaflops and reach the exalted exaflops.
An exaFLOPS is the equivalent of 1,000 petaFLOPS. I mentioned earlier that Summit can top off at 200 petaFLOPS, but through some clever tricks they were apparently able to achieve 1.88 exaFLOPS of performance for a certain kind of genomics problem and reach 3.3 exaFLOPS for certain kinds of mixed-precision calculations. This is not quite a true, unvarnished onset of exaFLOPS, and so the world is still waiting for a supercomputer that can reach exaFLOPS in a sustained, conventional sense.
I think you ought to get a bumper sticker for your car that says exascale supercomputers are almost here. Maybe by 2020 or 2021 you’ll be able to change the bumper sticker and say that exascale computing has arrived.
Speaking of cars, you might be wondering what does this have to do with AI self-driving cars?
At the Cybernetic AI Self-Driving Car Institute, we are developing AI software for self-driving cars. Supercomputers can be a big help toward the advent of AI self-driving cars.
Allow me to elaborate.
I’d like to first clarify and introduce the notion that there are varying levels of AI self-driving cars. The topmost level is considered Level 5. A Level 5 self-driving car is one that is being driven by the AI and there is no human driver involved. For the design of Level 5 self-driving cars, the auto makers are even removing the gas pedal, brake pedal, and steering wheel, since those are contraptions used by human drivers. The Level 5 self-driving car is not being driven by a human and nor is there an expectation that a human driver will be present in the self-driving car. It’s all on the shoulders of the AI to drive the car.
For self-driving cars less than a Level 5, there must be a human driver present in the car. The human driver is currently considered the responsible party for the acts of the car. The AI and the human driver are co-sharing the driving task. In spite of this co-sharing, the human is supposed to remain fully immersed into the driving task and be ready at all times to perform the driving task. I’ve repeatedly warned about the dangers of this co-sharing arrangement and predicted it will produce many untoward results.
Let's focus herein on the true Level 5 self-driving car. Many of the comments apply to the less than Level 5 self-driving cars too, but the fully autonomous AI self-driving car will receive the most attention in this discussion.
Here are the usual steps involved in the AI driving task (a skeletal sketch in code follows the list):
- Sensor data collection and interpretation
- Sensor fusion
- Virtual world model updating
- AI action planning
- Car controls command issuance
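As promised, here's a skeletal rendition of that cycle. Every function below is a hypothetical placeholder meant only to show the flow of the steps, not any real self-driving stack's API:

```python
def collect_and_interpret():
    # Sensor data collection and interpretation (placeholder values).
    return {"camera": "clear road", "radar": [], "lidar": []}

def sensor_fusion(obs):
    # Sensor fusion: merge the separate sensor streams into one picture.
    return {"obstacles": obs["radar"] + obs["lidar"]}

def update_world_model(world, fused):
    # Virtual world model updating.
    world["obstacles"] = fused["obstacles"]
    return world

def plan_action(world):
    # AI action planning.
    return "brake" if world["obstacles"] else "maintain_speed"

def issue_controls(action):
    # Car controls command issuance.
    print(f"command issued: {action}")

world = {"obstacles": []}
for _ in range(3):   # in reality this cycle repeats many times per second
    world = update_world_model(world, sensor_fusion(collect_and_interpret()))
    issue_controls(plan_action(world))
```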
Another key aspect of AI self-driving cars is that they will be driving on our roadways in the midst of human driven cars too. There are some pundits of AI self-driving cars who continually refer to a Utopian world in which there are only AI self-driving cars on the public roads. Currently there are about 250+ million conventional cars in the United States alone, and those cars are not going to magically disappear or become true Level 5 AI self-driving cars overnight.
Indeed, the use of human driven cars will last for many years, likely many decades, and the advent of AI self-driving cars will occur while there are still human driven cars on the roads. This is a crucial point since this means that the AI of self-driving cars needs to be able to contend with not just other AI self-driving cars, but also contend with human driven cars. It is easy to envision a simplistic and rather unrealistic world in which all AI self-driving cars are politely interacting with each other and being civil about roadway interactions. That’s not what is going to be happening for the foreseeable future. AI self-driving cars and human driven cars will need to be able to cope with each other.
Returning to the topic of supercomputers, let’s consider how today’s supercomputers and tomorrow’s even faster supercomputers can be advantageous to AI self-driving cars.
Suppose you were an auto maker or tech firm that had access to an exascale supercomputer. You have 10 ^ 18 FLOPS (one exaFLOPS) available to do whatever you want with those enormous processing cycles.
Exploring How an Exascale Supercomputer Might Work with an AI Self-Driving Car
First, you can pretty much cross off the list of possibilities the notion that you would put the exascale supercomputer on-board of an AI self-driving car. Unless the self-driving car is the size of about two football fields and has a nuclear power plant included, you are not going to get the exascale supercomputer to fit into the self-driving car. Perhaps some decades from now the continued progress on miniaturization might allow the exascale supercomputer to get small enough to fit into a self-driving car, which I say now so that decades from today no one can look back and quote me as suggesting it could never happen (just like those old predictions that mainframe computers would never be the size of PCs, yet that’s somewhat what we have today).
As an aside, even though the exascale supercomputer won’t fit into a self-driving car, there are a lot of software related techniques that can be gleaned from supercomputing and be used for AI self-driving cars. One big plus about supercomputing is that it tends to push forward on new advances for operating systems (typically a Linux-derivative), and for databases, and for networking, and so on. That’s actually another reason to want to have supercomputers, namely that it usually brings forth other kinds of breakthroughs, either software related ones or hardware related ones.
In any case, throw in the towel about getting a supercomputer to fit into an AI self-driving car. Then what?
Let’s consider how a supercomputer and an AI self-driving car might try to work with each other. Keep in mind that there is only so much computing processing capability that we can pack into an AI self-driving car. The more processors we jam into the self-driving car, the more it uses up space for passengers, and the more it uses up electrical power, and the more heat it generates, and importantly the costlier the AI self-driving car is going to become.
Thus, the aim is the Goldilocks approach, having just the right amount of processing capability loaded into the AI self-driving car. Not too little, and not too much.
It would be handy to have an arrangement whereby if the AI self-driving car needed some more processing capability that it could magically suddenly have it available. Via the OTA (Over-The-Air) capability of an AI self-driving car, you might be able to tap into a supercomputer that’s accessed through the cloud of the auto maker or tech firm that made the AI system.
The OTA is usually intended to allow for an AI self-driving car to upload data, such as the data being collected via its multitude of sensors. The cloud of the auto maker or tech firm can then analyze the data and try to find patterns that might be interesting, new, and useful. The OTA can also be used to download into the AI self-driving car the latest software updates, patches, and other aspects that the auto maker or tech firm wants to be on-board of the self-driving car.
The OTA is generally considered a "batch oriented" kind of activity. A batch of data is stored on-board the self-driving car and when the self-driving car is in a posture to do a heavy-sized upload, it does so (such as when parked in your garage at home, charging up, and having access to your home high-speed WiFi, or maybe at your office at work). Likewise, the downloads to the AI self-driving car tend to take place when the self-driving car is not otherwise active, making things a bit safer, since you would not want an in-motion AI self-driving car on the freeway to suddenly get an updated patch and maybe either be distracted or get confused by the hasty change.
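A minimal sketch of such a batch-oriented policy might look like the following; the state fields and the bandwidth threshold are illustrative assumptions of mine:

```python
def ok_to_sync(car: dict) -> bool:
    """Only do heavy OTA uploads/downloads when the car is safely idle."""
    return (car["parked"]
            and car["charging"]
            and car["wifi_mbps"] >= 100)   # assumed "high-speed" threshold

state = {"parked": True, "charging": True, "wifi_mbps": 250}
if ok_to_sync(state):
    print("upload sensor batch, then apply any pending patches")
else:
    print("defer OTA work until the car is parked and connected")
```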
Of course, a supercomputer could be sitting there in the cloud and be used for aiding the behind-the-scenes aspects of analyzing the data and helping to prepare downloads. In that sense, the AI self-driving car has no real-time “connection” or collaboration with the supercomputer. The supercomputer is beyond the direct reach of the AI self-driving car.
Suppose though that we opted to have the supercomputer act as a kind of reserve of added computational capability for the AI self-driving car. Whenever the AI self-driving car needs to ramp up and process something, it could potentially seek to find out if the exascale supercomputer could help out. If a connection is feasible and the supercomputer is available, the AI self-driving car might provide the supercomputer with processing tasks and then see what the supercomputer has to report back about them.
Imagine that the AI self-driving car has been driving along on a street it has never been on before. Using somewhat rudimentary navigation, it successfully makes its way along the street. Meanwhile, it is collecting lots of data about the street scene, including video and pictures, radar images, LIDAR, and the like. The AI self-driving car pumps this up to the cloud, and asks the supercomputer to rapidly analyze it.
Doing a full analysis by the on-board processors of the self-driving car would take a while, simply because those processors are much slower than the supercomputer. Furthermore, the on-board processors are already doing a lot of work trying to navigate down the street without hitting anything. It would be handy to push the data over to the supercomputer and see what it can find, perhaps more quickly than the AI self-driving car could.
The speed of the supercomputer also allows a deeper analysis than what the on-board processors could likely do in the same amount of time. It is like a chess match. In chess, you try to consider your next move and the move of your opponent in response to your move. If the chess clock allows enough time, you should also be considering how you will move after your opponent has moved, and then your next move, and so on. Each of these look-ahead levels is called a ply, and you want to try to look ahead as many ply as you can. But, given constraints on time, chess players often can only consider a few ply ahead, plus it can be mentally arduous to go much further ahead in contemplating next moves.
The AI on-board the self-driving car might be doing a single ply of analyzing the street scene, meanwhile it provides the street scene data to the supercomputer via the cloud and asks it to simultaneously do an analysis. The supercomputer might then let the AI on-board the self-driving car know that ahead of it is a tree that might be ready to fall onto the road. The AI on-board had only recognized that a tree existed at that location but had not been able to do any kind of analysis further about it. The supercomputer did a deeper analysis and was able to discern that based on the type of tree, the angle of the tree, and other factors, there is a high chance of it falling over. This would be handy for the AI on-board the self-driving car to be aware of and be wary of going near the tree.
Notice that the AI on-board did not necessarily need the use of the supercomputer per se. The AI was able to independently navigate the street. The supercomputer was considered an auxiliary capability. If the supercomputer was available and could be reached, great. But, if the supercomputer was not available or could not be reached, the AI was still sufficient on-board to do whatever the driving task consisted of.
The approach of having the AI on-board the self-driving car make use of a supercomputer in the cloud is not as straightforward as it might seem.
The AI self-driving car has to have an electronic communications connection viable enough to do so. This can be tricky when an AI self-driving car is in-motion, perhaps moving at 80 miles per hour down a freeway. Plus, if the AI self-driving car is in a remote location, such as a highway that cuts across a state, there might not be much Internet access available. It is hoped that the advent of 5G cellular networks will allow for improved electronic communications, including better speed and availability in more places than today's connectivity options.
The electronic connection might be subject to disruption and therefore the AI self-driving car has to be wary of requiring that the supercomputer respond. If the AI is dependent entirely on the supercomputer to make crucial real-time decisions, this would most likely be a recipe for failure (such as crashing into a wall or otherwise making a bad move). That’s why I earlier phrased it as a collaborative kind of relationship between the AI on-board and the supercomputer, including that the supercomputer is considered an auxiliary ally. If the auxiliary ally is not reachable, the AI on-board has to continue along on its own.
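One way to code that auxiliary-ally posture is a hard deadline with an on-board fallback. Here's a sketch; the functions and the half-second deadline are hypothetical stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def ask_cloud(scene):
    import time; time.sleep(5)            # simulate a slow or flaky link
    return {"hazard": "leaning tree ahead"}

def onboard_analysis(scene):
    return {"hazard": None}               # shallower, but always available

def analyze(scene, deadline_s=0.5):
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(ask_cloud, scene)
    try:
        result = future.result(timeout=deadline_s)   # wait, but not forever
    except TimeoutError:
        result = onboard_analysis(scene)             # carry on without help
    pool.shutdown(wait=False)   # never block the driving task on the cloud
    return result

print(analyze({"video": "..."}))   # cloud too slow -> on-board result
```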
Another somewhat similar approach could be the use of edge computing as an auxiliary ally of the AI on-board the self-driving car. Edge computing refers to having computer processing capabilities closer to the "edge" of wherever they are needed. Some have suggested that we ought to place computer processing capabilities at the side of our roads and highways. An AI self-driving car could then tap into that added processing capability. This would be faster and presumably more reliable, since the computing is sitting right there next to the road, versus a supercomputer that's half-way around the world and being accessed via a cloud.
We might then opt to have both edge computing and the exascale supercomputer. The AI on-board the self-driving car might first try to tap into the edge computing nearby. The edge computing would then try to tap into the supercomputer. They all work together in a federated manner. The supercomputer might then do some work for the AI on-board that has been handed to it via the edge computing. The edge computing remains in-touch with the AI on-board the self-driving car as it zooms down the highway. The supercomputer responds to give its results to the edge computing, which in turn hands it over to the AI self-driving car.
Keep Focus on Driving Task When Addressing Edge Issues
This arrangement might also relieve the AI self-driving car from having to deal with any vagaries or issues that arise between the edge computers and the supercomputer in the cloud. It is all hidden away from the AI on-board the self-driving car. This allows the on-board AI to continue focusing on the driving task.
I'm sure you can imagine how convoluted this can potentially become. If the AI on-board has opted to make use of the edge computing and the supercomputer, or even just the supercomputer, how long should it wait before deciding that things aren't going to happen soon enough? While driving down a street and waiting to get back the analysis of the supercomputer, time is ticking, and the AI on-board presumably has to keep the car moving along. It's conceivable that the analysis about the street scene and the potentially falling tree wouldn't be provided by the supercomputer until after the AI has already finished driving down the entire block.
A delayed response doesn't mean, though, that the supercomputer processing was necessarily wasted. It could be that the AI self-driving car is going to drive back along that same street again, maybe on the way back out of town. Knowing about the potentially falling tree is still handy.
This brings us to another whole facet about the supercomputer aspects. So far, I’ve been focusing on a single self-driving car and its AI. The auto maker or tech firm that made the AI self-driving car would consider that they have an entire fleet of self-driving cars. For example, when providing a patch or other update, the auto maker or tech firm would use the cloud to push down via OTA the update to presumably all of the AI self-driving cars that they’ve sold or otherwise have on the roadways (that’s their fleet of cars).
If the supercomputer figured out that the tree might be ready to fall, it could update the entire fleet with that indication, posting something about the tree into the mapping portion of their on-board systems. It would not even have to do this for all self-driving cars in the fleet; it could perhaps choose just those self-driving cars that might be nearby the street that has the potentially falling tree.
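A simple sketch of that selective push might filter the fleet by distance to the hazard; the coordinates, fleet, and 25 km radius below are made-up illustrations:

```python
import math

def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

tree = (34.0522, -118.2437)                       # the leaning tree
fleet = {"car-1": (34.05, -118.25),               # a few blocks away
         "car-2": (37.77, -122.42)}               # hundreds of km away

nearby = [c for c, pos in fleet.items() if distance_km(pos, tree) < 25]
print("push tree warning to:", nearby)            # ['car-1'] only
```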
Overall, the supercomputer could be aiding the ongoing Machine Learning aspects of the AI self-driving cars. Trying to get the processors on-board a self-driving car to do much Machine Learning is likely an exercise in futility because those processors either aren't powerful enough or they are (rightfully) preoccupied with the driving tasks of the self-driving car. Some believe that self-driving cars will be running non-stop around the clock; as such, there might not be much idle or extra time available for the processors on-board to be tackling Machine Learning kinds of enhancements to the AI on-board.
Besides using a supercomputer to aid in the real-time or near real-time aspects of an AI self-driving car, there is also the potential to use the supercomputer for performing simulations that are pertinent to AI self-driving cars.
Before an AI self-driving car gets onto an actual roadway, hopefully the auto maker or tech firm has done a somewhat extensive and exhaustive number of simulations to try and ferret out whether the AI is really ready for driving on our public roads. These simulations can be rather complex if you want them to be as realistic as possible. The amount of processing required to do the simulations can be quite high and using a supercomputer would certainly be handy.
Waymo claims that they've done well over 5 billion miles of simulated roadway testing via computer-based simulations, encompassing 10,000 virtual self-driving cars. The nice thing about a simulation is that you can just keep running it and running it. No need to stop. Now, there is of course a cost involved in using whatever computers or supercomputer is doing the simulation, and so that can be a barrier to be contended with in terms of how much simulation you are able to do. Generally, one might say the more the merrier, in terms of simulations, assuming that you are not just simulating the same thing over and over again.
Another variant on the use of simulations involves situations wherein the AI self-driving car has gotten into an incident and you want to do a post-mortem (or, you should have done what I call a pre-mortem). You can setup the simulator to try and recreate the situation that occurred with the AI self-driving car. It might then reveal aspects about how the accident occurred and ways to possibly prevent such accidents in the future. The results of the simulation could be used to then have the AI developers modify the AI systems and push a patch out to the AI on-board the self-driving cars.
Should an auto maker or tech firm go out and buy an exascale supercomputer? It’s hard to say whether the cost would be worthwhile to them in terms of their AI self-driving car efforts. They might already have a supercomputer that they use for the overall design of their cars, along with simulations associated with their cars (this is done with conventional cars too). Or, they might rent use of a supercomputer.
I had mentioned earlier that I would point out that you don't necessarily need to buy a supercomputer to use one. The research-oriented supercomputers being developed at universities often entertain requests to make use of the supercomputer, if there's a bona fide reason to do so. The University of Manchester's neuromorphic supercomputer can be used for doing research by others beyond just those directly involved in the existing efforts (you just need to file a request and, if approved, you can make use of it). IBM provides the "IBM Q Experience" wherein you can potentially get access to their quantum computer to try out various programs on it.
If someone is really serious about using a supercomputer on a sustaining basis, the costs can begin to mount. You’d need to do an ROI (Return on Investment) analysis to figure out whether it is better to rent time or potentially buy one. As mentioned before, the outright cost of buying a supercomputer is pretty much only within reach for very large companies and governmental agencies. The good news is that with the emergence of the Internet and cloud computing, you can readily make use of tremendous computing power that was once very hard to reach and utilize.
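Here's a toy rent-versus-buy break-even calculation using this article's rough figures; the electricity price, rental rate, amortization period, and usage hours are illustrative assumptions of mine, not quoted prices:

```python
purchase = 200e6                       # dollars, Summit's estimated cost
power_mw = 8.8                         # implied draw computed earlier
price_per_mwh = 70.0                   # assumed electricity price
annual_power = power_mw * 24 * 365 * price_per_mwh

own_per_year = annual_power + purchase / 5       # naive 5-year amortization
rent_per_year = 5_000.0 * 2_000                  # assumed $/hr x hours/yr

print(f"own:  ${own_per_year:,.0f} per year")    # ~$45 million
print(f"rent: ${rent_per_year:,.0f} per year")   # ~$10 million
# At light usage renting wins; heavy around-the-clock usage shifts the math.
```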
For AI self-driving cars, there is value in having supercomputing that can be used for doing behind-the-scenes work such as performing simulations and doing data analyses, along with aiding in doing Machine Learning. A more advanced approach involves having the AI self-driving cars being able to tap into the supercomputing and use it in real-time or near real-time circumstances. This real-time aspect is quite tricky and requires a lot of checks-and-balances, including making sure that doing so does not inadvertently open up a security hole.
The amount of computing power put into an AI self-driving car is almost the same as having your very own “supercomputer” in your self-driving car (that’s if you are willing to consider that you will have nearly as much computer processing as the early days of supercomputers). You aren’t though going to have a modern-day exascale supercomputer inside your AI self-driving car (at least not yet!). As the joke goes, your AI self-driving car is able to compute so quickly that it can do an infinite loop in less than 5 seconds, and with the help of a true exascale supercomputer get it done in less than 1 second. Glad that I was able to resurrect that one.
Copyright 2018 Dr. Lance Eliot
This content is originally posted on AI Trends.