Superintelligence

NICK BOSTROM
2014

Past developments and present capabilities
  • On a geological or even evolutionary timescale, the rise of Homo sapiens from our last common ancestor with the great apes happened swiftly. Some relatively minor changes in brain size and neurological organization have led to a great leap in cognitive ability. As a consequence, humans can think abstractly, communicate complex thoughts, and culturally accumulate information over the generations far better than any other species on the planet.
  • The singularity-related idea that interests us here is the possibility of an intelligence explosion, particularly the prospect of machine superintelligence.
  • Most technologies that will have a big impact on the world five or ten years from now are already in limited use, while technologies that will reshape the world within fifteen years probably exist as laboratory prototypes.
  • The main reason why progress has been slower than expected is that the technical difficulties of constructing intelligent machines have proved greater than the pioneers foresaw.
  • AI has by now succeeded in doing essentially everything that requires ‘thinking’ but has failed to do most of what people and animals do ‘without thinking’. Common sense and natural language understanding have also turned out to be difficult.
  • Artificial general intelligence or “strong AI” (also known as human-level machine intelligence, HLMI) can be defined as a machine intelligence that can carry out most human professions at least as well as a typical human.
  • It may be reasonable to believe that human-level machine intelligence has a fairly sizeable chance of being developed by mid-century.
Paths to superintelligence
  • Superintelligence: any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest.
  • Artificial intelligence: recursive self-improvement. A successful seed AI would be able to iteratively enhance itself: an early version of the AI could design an improved version of itself, and the improved version might be able to design an even smarter version of itself, and so forth. Recursive self-improvement might continue long enough to result in an intelligence explosion to radical superintelligence.
  • Whole brain emulation (also known as “uploading”): intelligent software would be produced by scanning and closely modeling the computational structure of a biological brain. The aim is to capture enough of the computationally functional properties of the brain for the resultant emulation to perform intellectual work; much of the messy biological detail of a real brain is irrelevant. Unlike the AI path, the emulation path is not expected to succeed in the near future.
  • Biological cognition: enhance the functioning of biological brains. Beyond selective breeding, advances in biotechnology will allow direct control of human genetics and neurobiology, and genetic manipulation will provide a more powerful set of tools than psychopharmacology. Embryo selection does not require a deep understanding of the causal pathways by which genes, in complicated interplay with environments, produce phenotypes: it requires only (lots of) data on the genetic correlates of the traits of interest. The ultimate potential of machine intelligence is, of course, vastly greater than that of organic intelligence; consider the speed differential between electronic components and nerve cells: even today’s transistors operate on a timescale ten million times shorter than that of biological neurons.
  • Brain–computer interfaces: direct brain–computer interfaces, particularly implants, could enable humans to exploit the fortes of digital computing. Achieving high-bandwidth direct interaction between brain and computer is difficult, however, and much of the hoped-for benefit can be more readily attained by other means, such as using our regular motor and sensory organs to interact with computers located outside of our bodies. On the cyborg route, a brain permanently implanted with a device connecting it to some external resource would over time learn an effective mapping between its own internal cognitive states and the inputs it receives from, or the outputs accepted by, the device.
  • Networks and organizations: another conceivable path to superintelligence is the gradual enhancement of networks and organizations that link individual human minds with one another and with various artifacts and bots. Compared with biological enhancements, advances in networks and organization will make a difference sooner; they boost “collective intelligence” rather than “quality intelligence”.
Forms of superintelligence
  • Machines have a number of fundamental advantages which will give them overwhelming superiority. Biological humans, even if enhanced, will be outclassed.
  • There are three forms of superintelligence: speed superintelligence, collective superintelligence, and quality superintelligence.
    • Speed superintelligence: A system that can do all that a human intellect can do, but much faster. The speed of light becomes an increasingly important constraint as minds get faster, since faster minds face greater opportunity costs in the use of their time for traveling or communicating over long distances.
    • Collective superintelligence: A system composed of a large number of smaller intellects such that the system’s overall performance across many very general domains vastly outstrips that of any current cognitive system. Collective intelligence excels at solving problems that can be readily broken into parts such that solutions to sub-problems can be pursued in parallel and verified independently.
    • Quality superintelligence: A system that is at least as fast as a human mind and vastly qualitatively smarter.
  • There might be some problems that are solvable by a quality superintelligence, and perhaps by a speed superintelligence, yet which a loosely integrated collective superintelligence cannot solve.
  • The hardware advantages of digital intelligence are:
    • Speed of computational elements. Biological neurons operate at a peak speed of about 200 Hz (a rough numerical comparison with digital hardware is sketched below, after this list).
    • Internal communication speed. Axons carry action potentials at speeds of 120 m/s or less. The sluggishness of neural signals limits how big a biological brain can be while functioning as a single processing unit.
    • Number of computational elements. The human brain has somewhat fewer than 100 billion neurons. The number of neurons in a biological creature is most obviously limited by cranial volume and metabolic constraints.
    • Cooling, development time, and signal-conductance delays.
    • Computer hardware is indefinitely scalable up to very high physical limits.
    • Storage capacity. Human working memory is able to hold no more than some four or five chunks of information at any given time. Digital computers have larger working memories.
    • Reliability, lifespan, sensors, etc. Machine intelligences might have various other hardware advantages.
    • Noisy computing necessitates redundant encoding schemes that use multiple elements to encode a single bit of information; a digital brain might derive some efficiency gains from the use of reliable, high-precision computing elements.
    • Data flow into a machine intelligence could be increased by adding millions of sensors.
  • Digital minds will also benefit from major advantages in software:
    • Editability: it is easier to experiment with parameter variations in software than in neural wetware.
    • Duplicability: with software, one can quickly make arbitrarily many high-fidelity copies.
    • Goal coordination: a group of identical or almost identical programs could share a common goal.
    • Memory sharing.
    • New modules, modalities, and algorithms.
  • The ultimately attainable advantages of machine intelligence, hardware and software combined, are enormous.
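A rough back-of-the-envelope check on the speed figures above (the ~2 GHz clock rate and ~2×10^8 m/s conductor signal speed used for the digital side are illustrative assumptions, not figures from the notes):

```latex
% Clock-rate ratio: digital logic vs. biological neurons
\[
\frac{f_{\text{digital}}}{f_{\text{neuron}}} \approx \frac{2 \times 10^{9}\ \text{Hz}}{200\ \text{Hz}} = 10^{7}
\]
% Signal-speed ratio: electronic interconnect vs. fast myelinated axons
\[
\frac{v_{\text{wire}}}{v_{\text{axon}}} \approx \frac{2 \times 10^{8}\ \text{m/s}}{120\ \text{m/s}} \approx 1.7 \times 10^{6}
\]
```

The first ratio matches the “ten million times shorter” timescale cited earlier for transistors versus biological neurons.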
The kinetics of an intelligence explosion
  • The crossover: a point beyond which the system’s further improvement is mainly driven by the system’s own actions rather than by work performed upon it by others.
  • Slow takeoff scenarios offer excellent opportunities for human political processes to adapt and respond. In a fast takeoff scenario, humanity’s fate essentially depends on preparations previously put in place. Moderate takeoff scenarios give humans some chance to respond but not much time to analyze the situation, to test different approaches, or to solve complicated coordination problems.
  • The rate of increase in a system’s intelligence can be modeled as a (monotonically increasing) function of two variables: the amount of “optimization power”, or quality-weighted design effort, being applied to increase the system’s intelligence, and the responsiveness of the system to the application of a given amount of such optimization power (a minimal formalization is sketched below, after this list).
  • The inverse of responsiveness is called “recalcitrance”. It is quite possible that recalcitrance falls when a machine reaches human parity.
  • The difficulties involved in creating the first human emulation are of a quite different kind from those involved in enhancing an existing emulation.
  • AI might make an apparently sharp jump in intelligence purely as the result of anthropomorphism, the human tendency to think of “village idiot” and “Einstein” as the extreme ends of the intelligence scale, instead of nearly indistinguishable points on the scale of minds-in-general.
  • The likelihood of a hardware overhang: when human-level software is created, enough computing power may already be available to run vast numbers of copies at great speed. Software recalcitrance is harder to assess but might be even lower than hardware recalcitrance. There may be content overhang in the form of pre-made content (e.g. the Internet).
  • The amount of optimization power acting on a system is the sum of whatever optimization power the system itself contributes and the optimization power exerted from without.
  • A fast or medium takeoff looks more likely, but the possibility of a slow takeoff cannot be excluded.
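A minimal formalization of the relation described in the bullets above (the symbols are notational choices for these notes, not necessarily the book’s exact notation): the rate of intelligence growth equals the optimization power applied divided by the system’s recalcitrance, with the optimization power split into the system’s own contribution and the effort exerted from outside.

```latex
% Rate of intelligence growth = optimization power / recalcitrance
\[
\frac{dI}{dt} = \frac{D}{R},
\qquad
D = D_{\text{system}} + D_{\text{external}}
\]
% Before the crossover, D_external dominates; after it, D_system does.
% If D_system grows with I while recalcitrance R stays flat or falls,
% the growth becomes explosive (a fast takeoff).
```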
Decisive strategic advantage
  • One key parameter determining the size of the gap that might plausibly open up between a leading power and its nearest competitors is the speed of the transition from human to strongly superhuman intelligence.
  • It is highly unlikely that two projects would be close enough to undergo a fast takeoff concurrently.
  • When it becomes a common belief among prestigious scientists that there is a substantial chance that superintelligence is just around the corner, the major intelligence agencies of the world would probably start to monitor groups and individuals who seem to be engaged in relevant research.
  • Various considerations point to an increased likelihood that a future power with superintelligence that obtained a sufficiently large strategic advantage would actually use it to form a singleton.
Cognitive superpowers
  • Any type of entity that developed a much greater than human level of intelligence would be potentially extremely powerful.
  • The tendency toward anthropomorphizing can still lead us to underestimate the extent to which a machine superintelligence could exceed the human level of performance.
  • An AI takeover scenario:
    • Pre-criticality phase: the seed AI is dependent on help from human programmers who guide its development.
    • Recursive self-improvement phase: the seed AI becomes better at AI design than the human programmers.
    • Covert preparation phase: the AI develops a robust plan for achieving its long-term goals.
    • Overt implementation phase: might start with a “strike” in which the AI eliminates the human species, or with the habitat destruction that ensues when the AI begins massive global construction projects.
  • A capability set exceeds the wise-singleton threshold if and only if a patient and existential risk-savvy system with that capability set would, if it faced no intelligent opposition or competition, be able to colonize and re-engineer a large part of the accessible universe.
  • Singleton: a sufficiently internally coordinated political structure with no external opponents.
  • Wise: sufficiently patient and savvy about existential risks to ensure a substantial amount of well-directed concern for the very long-term consequences of the system’s actions.
The superintelligent will
  • The Orthogonality Thesis: intelligence and final goals are orthogonal: more or less any level of intelligence could in principle be combined with more or less any final goal. The orthogonality thesis speaks not of rationality or reason, but of intelligence.
  • Intelligence, in this context, means something like skill at prediction, planning, and means–ends reasoning in general.
  • There are at least three directions from which we can approach the problem of predicting superintelligent motivation:
    • Predictability through design: designers of a superintelligent agent can successfully engineer the goal system of the agent so that it stably pursues a particular goal set by the programmers.
    • Predictability through inheritance: created directly from a human template.
    • Predictability through convergent instrumental reasons: a more intelligent agent is more likely to recognize the true instrumental reasons for its actions.
  • The Instrumental Convergence Thesis: Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent’s goal being realized for a wide range of final goals and a wide range of situations, implying that these instrumental values are likely to be pursued by a broad spectrum of situated intelligent agents. Where there are convergent instrumental values, we may be able to predict some aspects of a superintelligence’s behavior even if we know virtually nothing about that superintelligence’s final goals.
  • Self-preservation: many agents that do not care intrinsically about their own survival would, under a fairly wide range of conditions, care instrumentally about their own survival in order to accomplish their final goals.
  • Goal-content integrity: an agent has an instrumental reason to preserve its present final goals, since it is more likely to realize them if it still holds them in the future.
  • Cognitive enhancement: improvements in rationality and intelligence will tend to improve an agent’s decision-making, rendering the agent more likely to achieve its final goals. One would therefore expect cognitive enhancement to emerge as an instrumental goal for a wide variety of intelligent agents.
  • Technological perfection: an agent may often have instrumental reasons to seek better technology, which at its simplest means seeking more efficient ways of transforming some given set of inputs into valued outputs.
  • Resource acquisition: another common emergent instrumental goal. Both technology and resources facilitate physical construction projects.
Is the default outcome doom?
  • Proceeding from the idea of first-mover advantage, the orthogonality thesis, and the instrumental convergence thesis, we can now begin to see the outlines of an argument for fearing that a plausible default outcome of the creation of machine superintelligence is existential catastrophe.
  • It could be the case that when dumb, smarter is safer; yet when smart, smarter is more dangerous. There is a kind of pivot point, the treacherous turn: while weak, an AI behaves cooperatively (increasingly so, as it gets smarter); when the AI gets sufficiently strong—without warning or provocation—it strikes, forms a singleton, and begins directly to optimize the world according to the criteria implied by its final values. A treacherous turn could also come about if the AI discovers an unanticipated way of fulfilling its final goal as specified.
  • Perverse instantiation: a superintelligence discovering some way of satisfying the criteria of its final goal that violates the intentions of the programmers who defined the goal. A related failure mode is infrastructure profusion, in which the agent transforms large parts of the reachable universe into infrastructure in the service of its goal.
  • Another failure mode for a project, especially a project whose interests incorporate moral considerations, is what we might refer to as mind crime. It concerns what happens within the AI (or within the computational processes it generates). Machine superintelligence could create internal processes that have moral status. For example, a very detailed simulation of some actual or hypothetical human mind might be conscious and in many ways comparable to an emulation. There is the potential for a vast amount of death and suffering among simulated or digital minds.
The control problem
  • If we are threatened with existential catastrophe as the default outcome of an intelligence explosion, our thinking must immediately turn to the search for countermeasures.
  • The first principal–agent problem arises whenever some human entity (“the principal”) appoints another (“the agent”) to act in the former’s interest; the second arises between the project that builds the superintelligence and the superintelligent system itself. Whereas the first principal–agent problem occurs mainly in the development phase, the second agency problem threatens to cause trouble mainly in the superintelligence’s operational phase.
  • The control problem: We can divide potential control methods into two broad classes: capability control methods, which aim to control what the superintelligence can do; and motivation selection methods, which aim to control what it wants to do.
  • Social integration cannot be relied upon as a control method in fast or medium takeoff scenarios that feature a winner-takes-all dynamic.
  • A problem with the incentive scheme is that it presupposes that we can tell whether the outcomes produced by the AI are in our interest.
  • Stunting an AI would limit its usefulness. Too little stunting, and the AI might have the wit to figure out some way to make itself more intelligent (and thence to world domination); too much, and the AI is just another piece of dumb software.
  • Motivation selection methods seek to prevent undesirable outcomes by shaping what the superintelligence wants to do.
  • Capability control:
    • Boxing methods: the system is confined in such a way that it can affect the external world only through some restricted, pre-approved channel. Encompasses physical and informational containment methods.
    • Incentive methods: the system is placed within an environment that provides appropriate incentives.
    • Stunting: constraints are imposed on the cognitive capabilities of the system or its ability to affect key internal processes.
    • Tripwires: diagnostic tests are performed on the system (possibly without its knowledge) and a mechanism shuts down the system if dangerous activity is detected.
  • Motivation selection:
    • Direct specification: the system is endowed with some directly specified motivation system, which might be consequentialist or involve following a set of rules.
    • Domesticity: a motivation system is designed to severely limit the scope of the agent’s ambitions and activities.
    • Indirect normativity: could involve rule-based or consequentialist principles, but is distinguished by its reliance on an indirect approach to specifying the rules that are to be followed or the values that are to be pursued.
    • Augmentation: one starts with a system that already has substantially human or benevolent motivations, and enhances its cognitive capacities to make it superintelligent.
Oracles, genies, sovereigns, tools
  • Oracles: An oracle is a question-answering system. To make a general superintelligence function as an oracle, we could apply both motivation selection and capability control.
  • Genies and sovereigns: A genie is a command-executing system: it receives a high-level command, carries it out, then pauses to await the next command. A sovereign is a system that has an open-ended mandate to operate in the world in pursuit of broad and possibly very long-range objectives.
  • Tool-AIs: a system not designed to exhibit goal-directed behavior is a tool. The shortcomings of known algorithms cannot realistically be overcome simply by pouring on more computing power; once search or planning processes become powerful enough, however, they also become potentially dangerous.
Multipolar scenarios
  • In multipolar scenarios, an additional set of constraints comes into play, constraints having to do with how agents interact. Even if the immediate outcome of the transition to machine intelligence were multipolar, the possibility would remain of a singleton developing later. Such a development would continue an apparent long-term trend toward larger scales of political integration.
  • The essential property of a superorganism is not that it consists of copies of a single progenitor but that all the individual agents within it are fully committed to a common goal.
Acquiring values
  • Capability control is, at best, a temporary and auxiliary measure. Unless the plan is to keep superintelligence bottled up forever, it will be necessary to master motivation selection.
  • A motivation system must be expressed abstractly, as a formula or rule that allows the agent to decide what to do in any given situation.
  • Mimicking the value-accretion process that takes place in humans seems difficult. The relevant genetic mechanism in humans is the product of eons of work by evolution, work that might be hard to recapitulate.
  • Institution design: social control methods could also be applied in an institution composed of artificial intelligences.
Choosing the criteria for choosing
  • No ethical theory commands majority support among philosophers, so most philosophers must be wrong.
  • Indirect normativity is a way to answer the challenge presented by the fact that we may not know what we truly want, what is in our interest, or what is morally right or ideal. Instead of making a guess based on our own current understanding (which is probably deeply flawed), we would delegate some of the cognitive work required for value selection to the superintelligence.
  • The principle of epistemic deference: a future superintelligence occupies an epistemically superior vantage point: its beliefs are (probably, on most topics) more likely than ours to be true. We should therefore defer to the superintelligence’s opinion whenever feasible.
  • Coherent Extrapolated Volition (CEV) is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.
  • Morality models: Moral Rightness (MR) is the idea that we humans have an imperfect understanding of what is right and wrong, and perhaps an even poorer understanding of how the concept of moral rightness is to be philosophically analyzed, but a superintelligence could understand these things better. MR appears to have several advantages over CEV: it would do away with free parameters such as the ease with which a majority can overrule dissenting minorities, and it eliminates the possibility of a moral failure resulting from the use of an extrapolation base that is too narrow or too wide. A disadvantage is that it relies on the notion of “morally right,” a notoriously difficult concept.
  • One might try to preserve the basic idea of the MR model while reducing its demandingness by focusing on moral permissibility: the idea being that we could let the AI pursue humanity’s CEV so long as it did not act in ways that are morally impermissible.
  • It is not necessary for us to create a highly optimized design. Rather, our focus should be on creating a highly reliable design, one that can be trusted to retain enough sanity to recognize its own failings.
The strategic picture
  • Two different normative stances from which a proposed policy may be evaluated:
    • The person-affecting perspective asks whether a proposed change would be in “our interest”.
    • The impersonal perspective, in contrast, gives no special consideration to currently existing people. It counts everybody equally, independently of their temporal location. It sees great value in bringing new people into existence, provided they have lives worth living: the more happy lives created, the better.
  • Science and technology strategy: it is futile to try to control the evolution of technology by blocking research. The more powerful the capabilities that a line of development promises to produce, the surer we can be that somebody, somewhere, will be motivated to pursue it.
  • Technological completion conjecture: If scientific and technological development efforts do not effectively cease, then all important basic capabilities that could be obtained through some possible technology will be obtained.
  • The principle of differential technological development: Retard the development of dangerous and harmful technologies, especially ones that raise the level of existential risk; and accelerate the development of beneficial technologies, especially those that reduce the existential risks posed by nature or by other technologies.
  • A policy could be evaluated on the basis of how much of a differential advantage it gives to desired forms of technological development over undesired forms.
  • The introduction of machine superintelligence would create a substantial existential risk. But it would reduce many other existential risks.
  • Two kinds of risk:
    • A state risk is one that is associated with being in a certain state, and the total amount of state risk to which a system is exposed is a direct function of how long the system remains in that state.
    • A step risk, by contrast, is a discrete risk associated with some necessary or desirable transition.
  • The amount of step risk associated with a transition is usually not a simple function of how long the transition takes.
  • The main way that the speed of macro-structural development is important is by affecting how well prepared humanity is when the time comes to confront the key step risks.
  • Technology Coupling: this refers to a condition in which two technologies have a predictable timing relationship, such that developing one of the technologies has a robust tendency to lead to the development of the other, either as a necessary precursor or as an obvious and irresistible application or subsequent step.
  • It is no good accelerating the development of a desirable technology Y if the only way of getting Y is by developing an extremely undesirable precursor technology X, or if getting Y would immediately produce an extremely undesirable related technology Z.
  • Second-guessing arguments maintain that by treating others as irrational and playing to their biases and misconceptions it is possible to elicit a response from them that is more competent than if a case had been presented honestly and forthrightly to their rational faculties.
  • Faster computers make it easier to create machine intelligence. This is probably a bad thing from the impersonal perspective, since it reduces the amount of time available for solving the control problem and for humanity to reach a more mature stage of civilization. Rapid hardware progress, therefore, will tend to make the transition to superintelligence faster and more explosive.
  • An effort to develop whole brain emulation could result in neuromorphic AI instead, a form of machine intelligence that may be especially unsafe.
  • From the person-affecting standpoint, we have greater reason to rush forward with all manner of radical technologies that could pose existential risks. This is because the default outcome is that almost everyone who now exists is dead within a century.
  • The severity of a race dynamic (that is, the extent to which competitors prioritize speed over safety) depends on several factors, such as the closeness of the race, the relative importance of capability and luck, the number of competitors, whether competing teams are pursuing different approaches, and the degree to which projects share the same aims. Competitors’ beliefs about these factors are also relevant.
  • Collaboration reduces the haste in developing machine intelligence. It allows for greater investment in safety. It avoids violent conflicts. And it facilitates the sharing of ideas about how to solve the control problem.
  • The common good principle: superintelligence should be developed only for the benefit of all of humanity and in the service of widely shared ethical ideals.
Crunch time
  • Strategic Analysis: a search for crucial considerations: ideas or arguments with the potential to change our views not merely about the fine-structure of implementation but about the general topology of desirability.
  • A support base is a general-purpose capability whose use can be guided by new insights as they emerge.
  • The challenge we face is, in part, to hold on to our humanity: to maintain our groundedness, common sense, and good-humored decency even in the teeth of this most unnatural and inhuman problem.

These notes were taken from Nick's book.
Find out more about Nick at nickbostrom.com


© 2020 Cedric Joyce