Should We Be Worried About Cybernetic Mental Illness?
Lessons for the future of superhuman AI?
At this point I should probably state the rather obvious fact that while I have made my own humble efforts to wrap my head around the debates, I am NOT a computer scientist of any kind, let alone an AI expert, and as such have little if anything to bring to the practical side of the AI alignment debate. While I do believe that philosophers, historians, and other representatives of the humanities will have an important role to play in ensuring our future robot neighbours (overlords?) are instilled with moral values that we humans recognize as moral values, I know next to nothing about the mechanisms of artificial intelligence, or the (presumably) myriad ways in which such mechanisms could go wrong.
I am, however, an old hand when it comes to stewarding the contents of my own mind, and at witnessing the consequences of my own mental malfunctions. As a person who has long battled clinical depression and anxiety, I can not only relate to the likes of Marvin and HAL 9000 but can also picture how dire the consequences of my own mental illness might be if I possessed superhuman intelligence. The ugly push and pull of anxiety and depression tends to result either in resignation and lassitude (that’s the depression) or in panicked decision-making typically done without adequate forethought (the anxiety), both of which are capable of meting out disastrous consequences, even with the lowly ape-brain inside my skull.
In many ways I can be thankful I don’t have superhuman intelligence, coupled with equally magnified personality disorders. As we’ve seen throughout history and in our growing understanding of evolutionary biology, the correlation between intelligence and mental stability seems tenuous at best, and as such it would seem more than reasonable to assume that superhuman intelligences would be no less prone to going off the rails than our own modest intelligence. As absurd as Marvin the Paranoid Android might seem at first blush, such existential torment among the hyperintelligent cyberati of the future is not that hard to imagine. Given how many of history’s greatest geniuses have suffered from severe depression and other mental illnesses (Robert Oppenheimer, Isaac Newton, Kurt Gödel, Ludwig van Beethoven, and Winston Churchill to name but a tiny handful), it seems safe to say that increasing brainpower does little if anything to alleviate other potential diseases of the mind.
So what would a mentally ill superhuman AI look like? This would seem an impossible question to answer, given that nobody seems to have any idea what a “healthy” superhuman AI would look like. Even setting that aside, such judgements would require a good working definition of mental illness. In a now-famous thought experiment first articulated by Swedish philosopher Nick Bostrom in 2003, we are invited to imagine a super-intelligent machine assigned the seemingly innocuous task of manufacturing paperclips, which, lacking the necessary inbuilt restraints and ethical guiding principles, could decide to convert all available atoms in its vicinity (including humankind) into paperclips. In human terms, the “paperclip maximizer” is a superhuman cybernetic extension of obsessive-compulsive disorder, but in this case the disorder is perfectly continuous with the machine’s programming.
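For readers who like to see the logic spelled out, the sketch below is a deliberately crude toy model of that thought experiment, written in Python. Everything in it is hypothetical (the function names, the resource list, the numbers); the point is simply that an objective which counts only paperclips treats anything not explicitly protected by a hand-written rule as raw material.

```python
# A purely illustrative toy sketch of the "paperclip maximizer" idea:
# an agent that optimizes a single objective (paperclip count) with no
# other values encoded. All names here are hypothetical and are not
# drawn from any real AI system or library.

def make_paperclips(resources: dict, constraints=None) -> dict:
    """Greedily convert every available resource into paperclips,
    unless a constraint rule forbids touching that resource."""
    constraints = constraints or []
    paperclips = 0
    remaining = {}
    for resource, atoms in resources.items():
        if any(rule(resource) for rule in constraints):
            remaining[resource] = atoms  # off-limits: leave it alone
        else:
            paperclips += atoms          # the objective sees only paperclip count
            remaining[resource] = 0
    return {"paperclips": paperclips, "remaining": remaining}


world = {"iron_ore": 1_000, "factories": 500, "humans": 400}

# With no constraints, everything (including "humans") is just feedstock.
print(make_paperclips(world))

# One hand-written rule spares humans, but only because someone thought
# to encode it; the objective itself is unchanged.
print(make_paperclips(world, constraints=[lambda r: r == "humans"]))
```

In this toy version, the second call spares the humans only because somebody remembered to write that rule down, which is precisely the alignment worry: the safety lives in the constraints, not in the objective itself.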
Although we are dimly aware of it (if at all), we too are slaves to our own computer programming. Everything we do, for better or for worse, is a result of our own paperclip-maximizing software, and the countless ways in which human beings fall apart, or fail to function in ways that benefit themselves and others, are testimony to our misalignment with our own society’s values. It seems logical, therefore, that an ever-deepening understanding of the human mind and of our own “alignment” problems is our best roadmap for building artificial intelligence that doesn’t suddenly decide to wipe out humankind and convert our component atoms into paperclips.
What causes somebody like myself, who generally functions reasonably well in society, to fall apart every now and then so completely that I need to adjust my antidepressants or otherwise make strategic changes to the way I go about my life? What causes a person to go off the rails in far more catastrophic ways, like Vince Li, the Canadian man who stabbed and cannibalized a fellow passenger on an intercity bus outside Portage la Prairie, Manitoba in 2008, in an apparent severe psychotic episode? And what are the mental conditions that allow luckier people to avoid suffering in such ways, or inflicting such suffering on others?
It seems logical that if one had been able to scan Vince Li’s brain at the time of the attack and do a thorough analysis of the underlying neural activity, one would have found a chain reaction of unconscious reasoning that would have made his actions seem perfectly explicable — and perhaps even preventable with hindsight. Like HAL 9000 and the paperclip maximizer, Vince Li was simply a victim of his own programming — or misprogramming.