Overall, very sensible. I'll ignore minor quibbles (a 'strong AI' and a 'thinking machine' seem significantly different to me, since the former implies recursion but the latter doesn't) and focus on the main points of disagreement.
The related question I care more about, though, is: In practice, which goals are likely to be allied with which kinds and levels of intelligence, in reality? What goals will very, very smart minds, existing in the actual universe rather than the domains of abstract mathematics and philosophy, be most likely to aim for?
Goertzel goes on to question how likely Omohundro's basic AI drives are to be instantiated. Might an AI that doesn't care for value-preservation outcompete an AI that does?
Overall this seems very worth thinking about, but I think Goertzel draws the wrong conclusions. If we have a 'race-to-the-bottom' of competition between AGIs, that suggests evolutionary pressures to me, and evolutionary pressures seem to be the motivation for expecting the AI drives in the first place. Yes, an AGI that doesn't have any sort of continuity impulses might be able to create a more powerful successor than an AGI that does have continuity impulses. But that's the start of the race, not the end of the race--any AGI that doesn't value continuity will edit itself out of existence pretty quickly, whereas those that do won't.
The nightmare scenario, of course, is an AGI that improves rapidly in the fastest direction possible, and then gets stuck somewhere unpleasant for humans.
And since I used the phrase "nightmare scenario," a major disagreement between Goertzel and Bostrom is over the role of uncertainty when it comes to danger. Much later, Goertzel brings up the proactionary principle and precautionary principle.
Bostrom's emotional argument, matching the precautionary approach, seems to be: "things might go well or they might go poorly; because there's the possibility that they could go poorly, we must worry until we find a way to shut off that possibility."
Goertzel's emotional argument, matching the proactionary approach, seems to be: "things might go well or they might go poorly, but why conclude that they will go poorly? We don't know enough." See, as an example, this quote:
Maybe AGIs that are sufficiently more advanced than humans will find some alternative playground that we humans can’t detect, and go there and leave us alone. We just can’t know, any more than ants can predict the odds that a human civilization, when moving onto a new continent, will destroy the ant colonies present there.
Earlier, Goertzel correctly observes that we're not going to make a random mind, we're going to make a mind in a specific way. But the Bostromian counterargument is that because we don't know where that specific way leads us, we don't have a guarantee that it's different from making a random mind! It would be nice if we knew where safe destinations were, and how to create pathways to funnel intelligences towards those destinations.
Which also seems relevant here:
Many of Bostrom’s hints are not especially subtle; e.g. the title of Chapter 8 is “Is the default outcome doom?” The answer given in the chapter is basically “maybe – we can’t rule it out; and here are some various ways doom might happen.” But the chapter isn’t titled “Is doom a plausible outcome?”, even though this is basically what the chapter argues.
I view the Bostromian approach as saying "safety comes from principles; if we don't follow those principles, disaster will result. We don't know what principles will actually lead to safety." Goertzel seems to respond with "yes, not following proper principles could lead to disaster, but we might end up accidentally following them as easily as we might end up accidentally violating them." That rests on as solid a logical foundation as Bostrom's position that things like the orthogonality thesis are true "in principle," and which of the two seems more plausible or attractive is almost more a question of personal psychology or reasoning style than one of evidence or argumentation.
There are massive unknowns here, but it doesn’t seem sensible to simply assume that, for all these non-superintelligence threats, defenses will outpace offenses. It feels to me like Bostrom – in his choices of what to pay attention to, and his various phrasings throughout the book – downplays the risks of other advanced technologies and over-emphasizes the potential risks of AGI. Actually there are massive unknowns all around, and the hypothesis that advanced AGI may save humanity from risks posed by bad people making dangerous use of other technologies is much more plausible than Bostrom makes it seem.
This is, I think, a fairly common position--a decision on whether to risk the world on AGI should be made knowing that there are other background risks that the AGI might materially diminish. (Supposing one estimates that a particular AGI project carries a 3-in-a-thousand chance of existential collapse, one still has work to do in determining whether that's a lower or higher risk than not doing that particular AGI project.)
I don't see any reason yet to think Bostrom's ability to estimate probabilities in this area is any better than Goertzel's, or vice versa; I think that the more AI safety research we do, the easier it is to pull the trigger on an AGI project, and the sooner we can do so. I agree with Goertzel that it's not obvious that an AI research slowdown is desirable, let alone possible, but it is obvious to me that an AI safety research speedup is desirable.
I think Goertzel overstates the benefit of open AI development, but agree with him that Bostrom and Yudkowsky overstate the benefit of closed AI development.
I haven't read about open-ended intelligence yet. My suspicion, from Goertzel's description of it, is that I'll find it less satisfying than the reward-based view. My personal model of intelligence is much more inspired by control theory. The following statement, for example, strikes me as somewhat bizarre:
But I differ from them in suspecting that these advances will also bring us beyond the whole paradigm of optimization.
I don't see how you get rid of optimization without also getting rid of preferences, or choosing a very narrow definition of 'optimization.'
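To make that concrete, here is a toy sketch (all names and numbers are my own invention, not anything from Goertzel or Bostrom): any agent that can rank outcomes and acts on that ranking is, in effect, selecting an argmax--which is optimization under any but the narrowest definition. The set-point example gives it the control-theory flavor I mentioned.

```python
# Toy illustration: an agent with a preference ordering over outcomes.
# Acting on preferences at all means selecting a most-preferred option,
# which is exactly optimization (argmax under the preference function).

def choose(options, utility):
    """Pick the most-preferred option -- i.e., optimize `utility`."""
    return max(options, key=utility)

# A hypothetical control-theory-style agent: it prefers outcomes
# closer to a set-point (smaller error is better).
SET_POINT = 70.0

def preference(outcome):
    return -abs(outcome - SET_POINT)  # higher is better

options = [50.0, 68.0, 75.0, 90.0]
print(choose(options, preference))  # -> 68.0, the outcome nearest the set-point
```

The point of the sketch is just that "has preferences and acts on them" and "optimizes something" collapse into the same behavior; to deny the second, you have to give up the first, or define 'optimization' narrowly enough to exclude this.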
I think that there's something of a communication barrier between the Goertzelian approach of "development" and the Yudkowskian approach of "value preservation." On the surface, the two appear to contradict each other--a child who preserves their values will never become an adult--but I think the synthesis of the two is the correct approach: value preservation is what it looks like when a child matures into an adult, rather than into a tumor. If value is fragile, most processes of change are not the sort of maturation that we want, but are instead the sort of degeneration that we don't want; it's important to learn the difference between them and make sure that we can engineer that difference.
Biology has already (mostly) done that work for us, and so makes it look easy--which the Bostromian camp thinks is a dangerous illusion.
Thank you for taking the time to write that up. I strongly disagree, as you probably know, but it provided a valuable perspective into understanding the difference in viewpoint.
No two rationalists can agree to disagree... but pragmatists sometimes must.