This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the fifth section in the reading guide: Forms of superintelligence. This corresponds to Chapter 3, on different ways in which an intelligence can be super.
This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable (and where I remember), page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: Chapter 3 (p52-61)
Summary
- A speed superintelligence could do what a human does, but faster. This would make the outside world seem very slow to it. It might cope with this partially by being very tiny, or virtual. (p53)
- A collective superintelligence is composed of smaller intellects, interacting in some way. It is especially good at tasks that can be broken into parts and completed in parallel. It can be improved by adding more smaller intellects, or by organizing them better. (p54)
- A quality superintelligence can carry out intellectual tasks that humans just can't in practice, without necessarily being better or faster at the things humans can do. This can be understood by analogy with the difference between other animals and humans, or the difference between humans with and without certain cognitive capabilities. (p56-7)
- These different kinds of superintelligence are especially good at different kinds of tasks. We might say they have different 'direct reach'. Ultimately they could all lead to one another, so can indirectly carry out the same tasks. We might say their 'indirect reach' is the same. (p58-9)
- We don't know how smart it is possible for a biological or a synthetic intelligence to be. Nonetheless we can be confident that synthetic entities can be much more intelligent than biological entities.
- Digital intelligences would have better hardware: they would be made of components ten million times faster than neurons; the components could communicate about two million times faster than neurons can; they could use many more components while our brains are constrained to our skulls; it looks like better memory should be feasible; and they could be built to be more reliable, long-lasting, flexible, and well suited to their environment.
- Digital intelligences would have better software: they could be cheaply and non-destructively 'edited'; they could be duplicated arbitrarily; they could have well aligned goals as a result of this duplication; they could share memories (at least for some forms of AI); and they could have powerful dedicated software (like our vision system) for domains where we have to rely on slow general reasoning.
Notes
- This chapter is about different kinds of superintelligent entities that could exist. I like to think about the closely related question, 'what kinds of better can intelligence be?' You can be a better baker if you can bake a cake faster, or bake more cakes, or bake better cakes. Similarly, a system can become more intelligent if it can do the same intelligent things faster, or if it does things that are qualitatively more intelligent. (Collective intelligence seems somewhat different, in that it appears to be a means to be faster or able to do better things, though it may have benefits in dimensions I'm not thinking of.) I think the chapter is getting at different ways intelligence can be better rather than 'forms' in general, which might vary on many other dimensions (e.g. emulation vs AI, goal directed vs. reflexive, nice vs. nasty).
- Some of the hardware and software advantages mentioned would be pretty transformative on their own. If you haven't before, consider taking a moment to think about what the world would be like if people could be cheaply and perfectly replicated, with their skills intact. Or if people could live arbitrarily long by replacing worn components.
- The main differences between increasing intelligence of a system via speed and via collectiveness seem to be: (1) the 'collective' route requires that you can break up the task into parallelizable subtasks, (2) it generally has larger costs from communication between those subparts, and (3) it can't produce a single unit as fast as a comparable 'speed-based' system. This suggests that anything a collective intelligence can do, a comparable speed intelligence can do at least as well. One counterexample to this I can think of is that often groups include people with a diversity of knowledge and approaches, and so the group can do a lot more productive thinking than a single person could. It seems wrong to count this as a virtue of collective intelligence in general however, since you could also have a single fast system with varied approaches at different times.
- For each task, we can think of curves for how performance increases as we increase intelligence in these different ways. For instance, take the task of finding a fact on the internet quickly. It seems to me that a person who ran at 10x speed would find the fact 10x faster. Ten times as many people working in parallel would do it only a bit faster than one, depending on the variance of their individual performance, and whether they found some clever way to complement each other. It's not obvious how to multiply qualitative intelligence by a particular factor, especially as there are different ways to improve the quality of a system. It also seems non-obvious to me how search speed would scale with a particular measure such as IQ.
- How much more intelligent do human systems get as we add more humans? I can't find much of an answer, but people have investigated the effect of things like team size, city size, and scientific collaboration on various measures of productivity.
- The things we might think of as collective intelligences - e.g. companies, governments, academic fields - seem notable to me for being slow-moving, relative to their components. If someone were to steal some chewing gum from Target, Target can respond in the sense that an employee can try to stop them. And this is no slower than an individual human acting to stop their chewing gum from being taken. However it also doesn't involve any extra problem-solving from the organization - to the extent that the organization's intelligence goes into the issue, it has to have already done the thinking ahead of time. Target was probably much smarter than an individual human about setting up the procedures and the incentives to have a person there ready to respond quickly and effectively, but that might have happened over months or years.
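To make the fact-finding comparison in the notes above a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration rather than something from the chapter: the time one person needs is drawn from a lognormal distribution (the `median` and `sigma` values are made up), a 'speed' system simply divides that time by its speedup, and a 'collective' of independent searchers finishes when its quickest member does, with no coordination costs or clever division of labor.

```python
import math
import random
import statistics


def search_time(rng, median=60.0, sigma=0.5):
    """Seconds one person needs to find the fact.

    Hypothetical model: lognormal with the given median; sigma controls
    how much individual performance varies from attempt to attempt.
    """
    return rng.lognormvariate(math.log(median), sigma)


def mean_time_speed(rng, speedup, trials):
    """Average time for a single searcher running `speedup` times faster."""
    return statistics.fmean(search_time(rng) / speedup for _ in range(trials))


def mean_time_collective(rng, n_people, trials):
    """Average time for n independent searchers; the first to finish wins."""
    return statistics.fmean(
        min(search_time(rng) for _ in range(n_people)) for _ in range(trials)
    )


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
TRIALS = 20_000

baseline = mean_time_speed(rng, 1, TRIALS)    # one ordinary person
fast = mean_time_speed(rng, 10, TRIALS)       # one person at 10x speed
group = mean_time_collective(rng, 10, TRIALS)  # ten ordinary people in parallel

print(f"one person:       {baseline:.1f} s")
print(f"10x speed:        {fast:.1f} s")
print(f"10 people (best): {group:.1f} s")
```

In this toy model the group's advantage comes entirely from variance: as `sigma` approaches zero, ten identical searchers finish no sooner than one, while the 10x speed system still finishes in a tenth of the time. A model in which the group splits up the search space or pays communication costs would give a different curve.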
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.
- Produce improved measures of (substrate-independent) general intelligence. Build on the ideas of Legg, Yudkowsky, Goertzel, Hernandez-Orallo & Dowe, etc. Differentiate intelligence quality from speed.
- List some feasible but non-realized cognitive talents for humans, and explore what could be achieved if they were given to some humans.
- List and examine some types of problems better solved by a speed superintelligence than by a collective superintelligence, and vice versa. Also, what are the returns on “more brains applied to the problem” (collective intelligence) for various problems? If there were merely a huge number of human-level agents added to the economy, how much would it speed up economic growth, technological progress, or other relevant metrics? If there were a large number of researchers added to the field of AI, how would it change progress?
- How does intelligence quality improve performance on economically relevant tasks?
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about 'intelligence explosion kinetics', a topic at the center of much contemporary debate over the arrival of machine intelligence. To prepare, read Chapter 4, The kinetics of an intelligence explosion (p62-77). The discussion will go live at 6pm Pacific time next Monday 20 October. Sign up to be notified here.
Three types of information in the brain (and perhaps other platforms), and (coming soon) why we should care
Before I make some remarks, I would recommend Leonard Susskind’s very accessible 55 min YouTube presentation called “The World as Hologram.” (For those who don’t know him already, though most folks in here probably do, he is a physicist at the Stanford Institute for Theoretical Physics.) It is not as corny as it might sound: it is a lecture on the indestructibility of information, black holes (which are a convenient lodestone for him to discuss the physics of information and his debate with Hawking), types of information, and so on. He makes the point that, “…when one rules out the impossible, then what is left, however improbable, is the best candidate for truth.”
One interesting side point that comes out is his take on why computers that are more powerful have to shed more “heat”. Here is the talk: http://youtu.be/2DIl3Hfh9tY
Okay, my own remarks. One of my two or three favorite ways to “bring people in” to the mind-body problem is with some of the ideas I am now presenting. This will be in skeleton form tonight and I will come back and flesh it out more in coming days. (I promised last night to get something up tonight on this topic, and in case anyone cares and came back, I didn’t want to have nothing. I actually have a large piece of theory I am building around some of this, but for now, just the three kinds of information, in abbreviated form.)
Type One information is the sort treated in thermodynamics and entropy discussions. It is dealt with analytically in the Second Law of Thermodynamics. Here is one small start, but most will know it: en.wikipedia.org/wiki/Second_law_of_thermodynamics
Heat, energy, information, the changing logical positions within state spaces of entities or systems of entities, all belong to what I am calling category one information in the brain. We can also call this “physical” information. The brain is pumped -- not closed -- with physical information, and emits physical information as well.
Note that there is no semantic, referential, externally cashed-out content defined for physical, thermodynamic information, qua physical information. It is, though possibly thermodynamically open, an otherwise closed universe of discourse, needing nothing logically or ontologically external to analytically characterize it.
Type Two information in the brain (please assign no significance to my ordering, just yet) is functional. It is a carrier, or mediator, of causal properties, in functionally larger physical ensembles, like canonical brain processes. The “information” I direct attention to here must be consistent with (i.e. not violate principles of) Category One informational flow, phase space transitions, etc., in the context of the system, but we cannot derive Category Two information content (causal loop xyz doing pqr) from dynamical Category One data descriptions themselves.
In particular, imagine that we deny the previous proposition. We would need either an isomorphism from Cat One to Cat Two, or at least an “onto” function from Cat One to Cat Two (hope I wrote that right, it’s late). Clearly, the relation between Cat One configurations and Cat Two configurations is many-many, neither isomorphic nor many to one. (And one to many transformations from Cat One sets to Cat Two sets would be intuitively unsatisfactory if we were trying to build an “identity” or transform to derive C2 specifics from C1 specifics.)
It would resemble replacing type-type identity with token-token identity, jettisoning both sides of the Leibniz Law bi-conditional (“Identity of indiscernibles” and “Indiscernibility of Identicals” --- applied with suitable limits so as not to sneak anything in by misusing sortal ranges of predicates or making category errors in the predications.)
Well, this is a stub, and because of my sketchy presentation, this might be getting opaque, so let me move on to the next information type, just to get all three out.
Type Three information is semantic, or intentional content, information. If I am visualizing very vibrantly a theta symbol, the intentional content of my mental state is the theta symbol on whatever background I visualize it against. A physical state of, canonically, Type Two information is also occurring. That state is a candidate, in a particular case, to be the substrate-instantiation or substrate-realization of this bundle of Type Three information (probably at least three areas of my brain, frequency coupled and phase offset locked, until a break in my concentration occurs).
A liberal and loose way of describing Type Three info (that will raise some eyebrows because it has baggage, so I use it only under duress: temporary poverty of time and the late hour, to help make the notion easy to spot) is that a Type Three information instance is a “representation” of some element, concept, or sensible experience of the “perceived” ontology (of necessity, a virtual, constructed ontology, in fact, but for this sentence, I take no position about the status of this “perceived”, ostensible virtual object or state of affairs.)
The key idea I would like to encourage people to think about is whether the three categories of information are (a) legitimate categories, and mainly (b) whether they are collapsible, inter-translatable, or are just convenient shorthand level-of-description changes. I hope the reader will see, on the contrary, that one or more of them are NOT reducible to a lower one, and that this has lessons about mind-substrate relationships that point out necessary conceptual revisions—and also opportunities for theoretical progress.
It seems to me that reducing Cat Two to Cat One is problematic, and reducing Cat 3 to Cat 2 is problematic, given the usual standards of “identity” used in logic (e.g. i. Leibniz Law; ii. modal logic’s notions of identity across possible worlds, and so on.)
Okay, I need to clean this up. It is just a stub. Those interested should come back and see it better written, and expanded to include replies to what I know are expected objections, questions, etc. C2 and C3 probably sound like the "same old thing": the m-b problem about experience vs neural correlate. Not quite. I am trying to get at something additional here. Hard without diagrams.
Also, I have to present much of this without any context… like presenting a randomly selected lecture from some course, without building up the foundational layers. (That is why I am putting together a YouTube channel of my own, to go from scratch to something like this after about 6 hours of presentation… then on to a theory of which this is one puzzle piece.)
Of course, we are here to discuss Bostrom’s ideas, but this “three information type” idea, less clumsily expressed, does tie straightforwardly to the question of indirect reach, and “kinds of better” that different superintelligences can embrace.
Unfortunately I will have to establish that conceptual link when I come back and clean this up, since it is getting so late. Thanks to those who read this far...