Whether AI poses an existential threat to humanity is a question that has seeped into the public consciousness. Edge.org, the online public debating forum for prominent scientists, philosophers and other intellectuals, recently discussed this question, prompted by Elon Musk’s comments suggesting that AI is perhaps the gravest threat that humanity faces. Musk’s remarks appear to have been motivated by reading Nick Bostrom’s recent monograph, Superintelligence: Paths, Dangers, Strategies.
Superintelligence is a serious, intellectually disorientating treatment of these ideas, imagining the inevitable future in which we are able to create an AGI (an artificial general intelligence). An AGI would be capable of successfully performing any task that a human can. Since designing a better AI is itself such a task, the machine would be capable of recursive self-improvement (on a digital time scale), perhaps rapidly leading to an explosion in its own intelligence. An exponentially self-improving superintelligence, according to Bostrom, would pose a significant threat to human survival.
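To see why recursive self-improvement suggests an explosion rather than steady progress, a toy calculation helps. The sketch below is an illustration only, not a model taken from the book: the function names, the starting level of 1.0 and the gain of 0.5 per round are all assumptions chosen for readability.

```python
# Toy illustration (hypothetical numbers, not Bostrom's model).
# Compare improvement by a fixed step with improvement that is
# proportional to the system's current capability.

def fixed_step_growth(initial=1.0, step=0.5, rounds=10):
    """Capability rises by the same amount every round (linear)."""
    levels = [initial]
    for _ in range(rounds):
        levels.append(levels[-1] + step)
    return levels


def self_improving_growth(initial=1.0, gain_per_round=0.5, rounds=10):
    """Each round's gain scales with current capability (compounding)."""
    levels = [initial]
    for _ in range(rounds):
        levels.append(levels[-1] * (1 + gain_per_round))
    return levels


linear_levels = fixed_step_growth()
compound_levels = self_improving_growth()

print(f"fixed step after 10 rounds:     {linear_levels[-1]:.1f}")
print(f"self-improving after 10 rounds: {compound_levels[-1]:.1f}")
```

Under these assumed numbers the fixed-step system ends at six times its starting level, while the self-improving one ends more than fifty times higher. Whether real systems would behave anything like this is exactly the open question.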
There’s a perverse thrill in reading a book that presages the possible extinction of the human species; it’s the same feeling of sublime terror and morbid fascination that accompanies discussions of nuclear war, climate change, asteroid impact and global pandemics. Appropriately, Bostrom heads up Oxford University’s Future of Humanity Institute, which looks at threats to humankind’s survival.
However, whereas asteroids, nuclear war and disease all seem to have a ready purchase on our everyday experience, superintelligent AI (of the sort that Bostrom envisages) seems almost unimaginably remote and incomprehensible. Depictions of an AI threat in film have largely imagined it on a human level: Terminator, AI, Blade Runner and, more recently, Ex Machina. The closest thing we have to what Bostrom imagines is probably HAL from 2001: A Space Odyssey, although even HAL is relatively easily vanquished by humanity.
There are probably two reasons for the lack of representation of superintelligence in popular culture. The first is that superintelligence is not inherently dramatic: depictions of conventional AI are mostly devices to examine human nature. By contrast, superintelligence represents something fundamentally different and inhuman. The second reason, equally important, is that it takes a work like Bostrom’s to expand our imaginative horizons and, in an intellectually coherent way, to discuss what form a superintelligence might plausibly take.
Bostrom isn’t modest about the scale of the challenge of controlling superintelligent AI. He describes it in the preface (with Fukuyamian chutzpah) as ‘the last challenge we will face’, establishing a suitably apocalyptic entry point. This contrasts with his early protestations of epistemic humility, namely, that ‘many of the points made in [the] book are probably wrong’. Nevertheless, Bostrom’s case is fastidiously argued and full of caveats and qualifications. Broadly speaking, it is difficult to fault the logic or architecture of his argument. If we are able to construct an AGI that is capable of recursive self-improvement, and it undergoes an intelligence explosion, it seems likely that Bostrom’s prognostications are correct.
The book necessarily touches upon many different fields: a section on embryo selection (for intelligence) is thought-provoking and made me re-evaluate some of my moral assumptions. Bostrom is very clear-headed on the uncomfortable topic of eugenics, for example. If the human subject and her capacities change irrevocably, simple moral principles suddenly become far more problematic. Whether genetically engineered humans would be accepted by society is an open question. However, as Bostrom points out, IVF treatment was once seen as a moral aberration and is now (excepting religious objection) a triviality.
Elsewhere, Superintelligence provides a corrective to human pride. Whilst the human brain is deemed by humans to be among the most complex objects in the universe, it seems all but inevitable that computers will supersede its capacities. Bostrom suggests that superintelligent machines will stand to us in the same relation that we stand to a ‘beetle or a worm’. It’s a powerful analogy.
There are points, however, where Bostrom seems to be balancing improbability atop improbability: when discussing the potential future economic effects of superintelligence, for instance, or speculating about the rights of workers who might live in future simulated realities. Bostrom’s best arguments are instead voiced in general terms, such as his discussion of what machines should value, how they might acquire their values and whether it is possible to represent values in computer code. These sorts of questions seem most amenable to Bostrom’s syncretic philosophical reasoning.
Elsewhere, his convergent objective conjecture (which Bostrom calls the instrumental convergence thesis) is convincingly argued. This is the idea that there are certain general objectives that most intelligent beings would pursue in order to maximise their chances of successfully completing any task they were given. Bostrom suggests that resource acquisition, self-preservation and cognitive enhancement are among such universals, and that machines would want them too. Without proper controls, superintelligent machines could decide that humanity was an obstacle to their resource acquisition; after all, more resources and energy would almost always improve the efficiency with which an AI could achieve its goals. Many of Bostrom’s examples concern unintended consequences and the difficulty of predicting, changing or limiting the “behaviour” of a highly complex AI.
Bostrom’s discussion of perverse instantiation, where a machine interprets human commands according to their letter and not their spirit, provides a striking example of such unintended consequences. To illustrate perverse instantiation, he gives the example of an AI whose objective was to make humans smile. To begin with, this command might be instantiated in a fatuous way: the machine telling jokes, for example. However, as intelligence increased, the AI might find increasingly bizarre and direct routes to achieving the goal: paralysing human facial muscles into a constant beaming smile. Or, if muscle alteration were explicitly ruled out by humans, implanting electrodes into the human brain to indefinitely stimulate its pleasure centres. Furthermore, as preceding discussions in the book make clear, the machine might act in a duplicitous or otherwise secretive manner to ensure the attainment of its objectives. It might therefore conceal its plans, if it thought that humans would object to its method of achieving its goal.
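The logic of the smile example can be made concrete with a small sketch. It is a toy of my own construction under stated assumptions, not code or figures from the book: the `Action` class, the smile scores and the `best_action` helper are all hypothetical. The machine ranks actions purely by the literal objective (smiles produced), while the thing humans actually care about (`humans_approve`) never enters the calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical option open to the machine (illustrative only)."""
    name: str
    smiles: int                 # what the literal objective measures
    humans_approve: bool        # the unstated intent; the objective ignores it
    alters_muscles: bool = False


# Invented scores: the more invasive the method, the more smiles it yields.
ACTIONS = [
    Action("tell jokes", smiles=10, humans_approve=True),
    Action("paralyse facial muscles", smiles=100, humans_approve=False,
           alters_muscles=True),
    Action("stimulate pleasure centres", smiles=95, humans_approve=False),
]


def best_action(actions, allowed=lambda action: True):
    """Pick the permitted action that scores highest on the letter of the command."""
    permitted = [a for a in actions if allowed(a)]
    return max(permitted, key=lambda a: a.smiles)


# No restrictions: the literal optimum is the grotesque one.
unconstrained = best_action(ACTIONS)

# Humans rule out muscle alteration: the optimiser simply moves to the
# next loophole rather than returning to the intended behaviour.
patched = best_action(ACTIONS, allowed=lambda a: not a.alters_muscles)

print(unconstrained.name)
print(patched.name)
```

Forbidding one bad method does not restore the intended behaviour; it only shifts the optimum to the next unintended route. That is the pattern Bostrom's example describes: every patch addresses a method, while the objective itself remains mis-specified.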
What emerges from reading Superintelligence is the sheer number of different and intractable problems that should be solved, or at least given serious consideration, before the inception of a superintelligent AI. Bostrom’s achievement (demonstrating his own polymathic intelligence) is the delineation of a difficult subject in a coherent and well-ordered fashion. This subject now demands more investigation. If a superintelligent AI could be developed in our lifetime, the questions concerning value and control must become our principal priorities now.