Why Our Predictions Are Failing Us
Here are my notes on the introduction to Nate Silver’s The Signal and the Noise. If you have any interest in becoming a better statistician, data scientist, or predictor of any sort, or if you’re anxious to know why humans are so good at making bad predictions, be sure to read the book. If you’ll never get through the whole thing, maybe you can pull some value from this:
More Information, More Problems
For most of our species’ history, what little knowledge existed was available only to a few. Books were expensive to produce, copied by hand, and readable by only a small literate minority. They were prone to major errors, and to minor errors with major consequences. The printing press changed this. In today’s dollars, the press lowered the cost of copying a book from approximately $20,000 to about $70. From that point on, the errors, misprints, and divergent interpretations of religious and philosophical texts could diffuse easily throughout the Western world. The small errors that crept in across different printings created huge rifts among religious and political groups. Revolutionary ideas that broke with tradition no longer stayed tacked to the church door; they were copied and distributed at best-seller rates. Zealots from different factions now had written proof to back their claims. When presented with so much information we did what we’re doing now: we engaged with it selectively, upholding the works that adhered to our beliefs and dismissing those that contradicted them. This is why, before the printing press helped bring about the Enlightenment, it ignited 300 years of bloody wars and violent inquisitions in Europe.
Amid the chaos and confusion gripping the Western world, there were signs of improvement. Shakespeare was producing his plays, many of which turned on the idea that we had some control over our fate. They introduced omens and the if-only-they’d-heeded-the-signs retrospective. The notion that we were masters of our own fate was gaining favor, and we began looking for ways to predict and avoid the ill fates that befell the characters in Shakespeare’s work. True prediction was impossible, but foresight and preparedness through diligence were a good substitute. More concretely, Galileo and other scientists, mathematicians, and scholars were sharing their ideas with one another and with increasingly large segments of the public. This exchange of new ideas catalyzed the Enlightenment, and the new attitude of self-determination catalyzed the Industrial Revolution. That was the ultimate legacy of the printing press, but we shouldn’t forget the violent age that preceded it, especially as we forge into this new age of mass information abundance.
The Productivity Paradox
Despite the proliferation of computing power and theory, the average research investment required per US patent filing roughly doubled in the 25 years after 1963, from approximately $1.5 million to $3 million. The issue was dubbed the Productivity Paradox, and we now recognize it as a case of applying too much theory to too little data. Today it’s gospel that more and better data trumps cleverer algorithms, but that wasn’t yet true for the first 30 years of the information age. The astounding predictive ability that computers promised us came to naught.
The bright side was that we learned the lessons more quickly than before, though not quickly enough for some companies and government programs to remain in operation. Research dollars were shifted toward more pragmatic goals. Computers improved our daily lives, even if they couldn’t yet deliver the science-fiction future promised in the ’70s and ’80s. This second iteration of the information abundance problem was more peaceful, but still quite costly.
The Promise and Pitfalls of “Big Data”
Now we have Big Data, and it is not a cure-all either. It’s analogous to computers in the 1970s: a tool. The data and the numbers might be there, but they cannot speak for themselves. Some data-driven predictions have succeeded, like those in Silver’s political and sports forecasting, but most fail. For instance, despite plenty of information and access to more, we did not see the 9/11 attacks coming. The improvements in meteorological prediction have been slow and costly, especially to the esteem and reputation of weathermen. Prediction failed again in 2008 and in the years preceding the financial crisis. Political scientists, relying on the best practices and the best data, are wrong a whopping 15% of the time when they predict that an event has essentially no chance of occurring. Japanese engineers built Fukushima to withstand an 8.6 magnitude earthquake, believing that anything stronger was an impossibility; a 9.1 magnitude quake hit Japan in March 2011. There are plenty of other cases in plenty of other fields where our new troves of data are not yielding infallible predictions.
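To make that calibration point concrete, here’s a minimal sketch in Python (with hypothetical numbers, not data from the book) of how you might check a forecaster’s claimed probabilities against the frequencies actually observed:

```python
# Hypothetical illustration of calibration: gather every forecast that claimed
# a given probability and check how often the event actually happened.

def empirical_rate(forecasts, outcomes, claimed_p, tol=0.05):
    """Observed frequency of events among forecasts near a claimed probability.

    forecasts: predicted probabilities in [0, 1]
    outcomes:  1 if the event happened, 0 otherwise
    """
    matched = [y for p, y in zip(forecasts, outcomes) if abs(p - claimed_p) <= tol]
    return sum(matched) / len(matched) if matched else float("nan")

# Hypothetical example: 100 events all forecast at a 0% chance, 15 of which occurred.
forecasts = [0.0] * 100
outcomes = [1] * 15 + [0] * 85
print(empirical_rate(forecasts, outcomes, claimed_p=0.0))  # 0.15 -> badly miscalibrated
```

A forecaster whose “no chance” calls come true 15% of the time is, by this measure, far from calibrated.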
Why the Future Shocks Us
We are wired to find patterns, not to discern truths from insane amounts of data. This stone-age strength is an information-age weakness. The sectarianism that the printing press produced is being replicated by the internet and by partisan media sources. It’s a case of more information but less knowledge. It isn’t difficult to see that the modern information infrastructure lowers the cost of entry for bad ideas; their propagation is almost a sure thing now. And because we cannot hold all the context needed to process it properly, or because we hold the wrong context entirely, we are constantly surprised by the events of the day.
The Prediction Solution
Apart from all the predictions we know to have been wrong, there are many we cannot test soon, or test at all. For instance, it’s difficult to predict the value or harm of long-term social programs, or of their absence. Given our record, though, we have little reason to be optimistic about many of these already-made decisions.
Nevertheless, prediction is useful and necessary, so we have little choice but to get better at it. That will require a shift away from craving discrete outcomes and toward thinking in terms of probabilities and uncertainty.
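For a sense of what thinking in probabilities buys you, here’s a small sketch in Python (my own example, not one from the book) scoring a hedged probabilistic forecast against an all-or-nothing call using the Brier score:

```python
# Illustration of why probabilistic forecasts beat all-or-nothing calls:
# the Brier score rewards honest uncertainty and punishes overconfidence.

def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities and outcomes (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)

# Ten events, seven of which occurred.
outcomes = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

confident_calls = [1.0] * 10   # "it will definitely happen", every time
hedged_forecast = [0.7] * 10   # "about a 70% chance", every time

print(brier_score(confident_calls, outcomes))  # 0.30
print(brier_score(hedged_forecast, outcomes))  # 0.21 -> the hedged forecast scores better
```

The forecaster who reports honest uncertainty comes out ahead, even though the overconfident one was “right” seven times out of ten.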