shokwave comments on Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It) - Less Wrong

32 Post author: ciphergoth 30 October 2010 09:31AM

Comments (432)

Comment author: shokwave 31 October 2010 08:15:34AM *  3 points [-]

I would like to explore Ben's reasons for rejecting the premises of the argument.

I think the first of the above points is reasonably plausible

He offers the possibility that intelligence might cause or imply empathy. Although we see that connection when we look at all of Earth's creatures, correlation doesn't imply causation: (intelligence AND empathy) doesn't mean (intelligence IMPLIES empathy). More likely it means (evolution IMPLIES intelligence AND empathy) - and we aren't using natural selection to build an AI.

I doubt human value is particularly fragile.

He makes the point that human values have robustly changed many times, and will probably continue to change alongside AGI development. Human value is not fragile on the timescales we deal with as humans; our values have indeed changed since, say, Victorian times. But that took generations - most value change will take generations, because humans are (understandably) reserved about modifying their values. The timescales that AGIs will be dealing with are, on the low end, weeks (an AGI with access to a microchip manufacturing plant, say). I can't see a plausible AGI that enacts changes at generational speed. So, yes, our values are robust - in the sense that a mountain is robust to weather patterns, but not robust to falling into the sun.

I think a hard takeoff is possible ... it's very unlikely to occur until we have an AGI system that has very obviously demonstrated general intelligence

I think he is accurate in this assessment.

I think the path to this "hard takeoff enabling" level of general intelligence is going to be somewhat gradual

Again, accurate. The path to nuclear fission was gradual over many years, but the reaction itself (the takeoff) could have irradiated a university in hours. His position appears to be that a hard takeoff is possible, but that we'll have warning signs and a deeper understanding of the AGI before it happens ... well, a scientist from the Manhattan Project living in Japan during WWII would have had a deeper understanding of the features of a nuclear explosion, but the defense against it is STILL not being in Hiroshima. I don't think more knowledge about the issue is going to significantly change the solution. We have reached the point of diminishing returns, where more knowledge about AI does little to further decrease the existential risk of said AI.

pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

This is just wrong. The only difference between possible and likely is the probability, and we know how to reason with probabilities. If Ben has an argument for why the probability is SO small that, even multiplied by 'hard takeoff, universe is paperclips, end of existence', the expected loss comes out below the cost of the "AGI without Friendliness" route, well, he should articulate it and provide evidence. Without certainty that the chances are very low, he should accept the Scary Idea.

I'm a lot more worried about nasty humans taking early-stage AGIs and using them for massive destruction

Preventing abuse of AGI and preventing uFAI takeoff scenarios are not mutually exclusive, you can and should attempt to prevent both.

I'm also quite unconvinced that "provably safe" AGI is even feasible.

This section is mostly claims and arguments that a proof of friendliness is impossible. In order to argue "provably safe AGI isn't feasible, so we should instead develop unpredictable but I-don't-see-the-danger AGI", you pretty much need an impossibility proof: that there IS NO proof of friendliness, not merely that there isn't one yet. And if you believe a friendliness proof is probably impossible, you shouldn't work on AGI at all, rather than working on possibly-unfriendly AGI. That Ben came to the conclusion that work should continue, rather than halt entirely, suggests he is motivated to justify his own work rather than engage with his beliefs about the Scary Idea.

I just don't buy the Scary Idea.

The Scary Idea that Ben outlined is "The stakes are so high that 'unlikely' is not good enough; we need 'surer than we've ever been'. Anything less is too dangerous." and his refutations have amounted to "I don't know for sure, but I don't think it's likely".

In essence, he hasn't refuted the argument, but instead made it scarier. If AI developers can see "stakes are very high" and "there is a small chance", and argue against "the stakes are high enough that a small chance is too much chance", then uFAI is that much more likely.

Comment author: timtyler 31 October 2010 09:27:14AM 0 points [-]

pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

This is just wrong.

It does seem like a pretty different thing to me. A lot of things are possible, but only a few are likely.

Comment author: shokwave 31 October 2010 09:40:10AM 3 points [-]

Yep. The rule is not "bet on what is most likely" but rather "bet on positive expected values" and if something is possible and has a large value, then if the math comes out in favour, you ought to bet on it. Goertzel is making the argument that since it's unlikely, we should not bet on it.

Comment author: timtyler 31 October 2010 09:55:28AM *  -1 points [-]

He doesn't seem to be. Here's the context:

Yes, you may argue: the Scary Idea hasn't been rigorously shown to be true… but what if it IS true?

OK but ... pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

The Scary Idea is certainly something to keep in mind, but there are also many other risks to keep in mind, some much more definite and palpable. [...]

He doesn't seem to be making the argument you describe anywhere near the cited quote.

Comment author: shokwave 31 October 2010 11:05:33AM *  1 point [-]

The Scary Idea is certainly something to keep in mind

He doesn't seem to be making the argument you describe anywhere near the cited quote.

Say your options are: Stop and develop Friendly theory, or continue developing AI. In the second option the utility of A, continuing AI development, is one utilon, and B, the end of the existence of at least humanity and possibly the whole universe, is negative one million utilons. The Scary Idea in this context is that the probability of B is 1%, so that the utility of the second option is negative 9999 utilons. If Ben 'keeps it in mind', such that the probability that the Scary Idea is right is 1% (reasonable - only one of his rejections has to be right to knock out one premise, and we only need to knock out one premise to bring the Scary Idea down), then Ben's expected utility is now negative 99 utilons.

I conclude that he isn't keeping the Scary Idea in mind. His whole post is about not accepting the Scary Idea; for that phrase ("pointing out that something scary is possible, is a very different thing from having an argument that it's likely") to support his position and not work against him, he would have to be rejecting the premises purely on their low probability, without considering the expected value.

Hence, the argument that since it's unlikely, we should not bet on it.

Edit for clarity: A and B are the exclusive, exhaustive outcomes of continuing AI development. Stopping to develop Friendly theory has zero utilons.
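The arithmetic above can be sketched directly. All the numbers below are the hypothetical utilons and probabilities from this comment, not real estimates:

```python
# Sketch of the expected-utility arithmetic in the comment above.
# All quantities are the comment's hypothetical utilons, not real estimates.

U_CONTINUE_OK = 1          # A: AI development continues and goes well
U_EXTINCTION = -1_000_000  # B: end of at least humanity
P_EXTINCTION = 0.01        # the Scary Idea's claimed probability of B

# Expected utility of continuing development, if the Scary Idea is right:
eu_if_scary_idea_right = (
    (1 - P_EXTINCTION) * U_CONTINUE_OK + P_EXTINCTION * U_EXTINCTION
)
# roughly "negative 9999 utilons"

# Ben "keeps it in mind": he assigns only a 1% chance to the Scary Idea
# itself being right, and expects the naive +1 utilon otherwise.
P_SCARY_RIGHT = 0.01
eu_ben = (
    (1 - P_SCARY_RIGHT) * U_CONTINUE_OK + P_SCARY_RIGHT * eu_if_scary_idea_right
)
# about "negative 99 utilons" - still below the 0 utilons of stopping
# to develop Friendly theory first
```

Even after discounting the Scary Idea to a 1% chance of being right, the expected utility of continuing stays well below the zero utilons of stopping, which is the point of the comment.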

Comment author: Vaniver 01 November 2010 04:01:21AM 2 points [-]

Ah, Pascal's wager. And here I thought that I wouldn't be seeing it anymore, after I started hanging out with atheists.

Comment author: ata 01 November 2010 05:09:33AM *  13 points [-]

The problem with Pascal's Wager isn't that it's a Wager. The problem with Pascal's Wager and Pascal's Mugging (its analogue in finite expected utility maximization), as near as I can tell, is that if you do an expected utility calculation including one outcome that has a tiny probability but enough utility or disutility to weigh heavily in the calculation anyway, you need to include every possible outcome that is around that level of improbability, or you are privileging a hypothesis and are probably making the calculation less accurate in the process. If you actually are including every other hypothesis at that level of improbability, for instance if you are a galaxy-sized Bayesian superintelligence who, for reasons beyond my mortal mind's comprehension, has decided not to just dismiss those tiny possibilities a priori anyway, then it still shouldn't be any problem; at that point, you should get a sane, nearly-optimal answer.

So, is this situation a Pascal's Mugging? I don't think it is. 1% isn't at the same level of ridiculous improbability as, say, Yahweh existing, or the mugger's threat being true. 1% chances actually happen pretty often, so it's both possible and prudent to take them into account when a lot is at stake. The only extra thing to consider is that the remaining 99% should be broken down into smaller possibilities; saying "1% humanity ends, 99% everything goes fine" is unjustified. There are probably some other possible outcomes that are also around 1%, and perhaps a bit lower, and they should be taken into account individually.
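ata's point - that a calculation quoting one ~1% outcome should break out the other outcomes at that level of improbability - can be sketched numerically. The outcome lists and utilities below are invented purely for illustration:

```python
# Illustration of ata's point: a calculation that includes one
# tiny-probability, huge-utility outcome should include every other
# outcome at that level of improbability. All numbers are invented.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs summing to 1."""
    assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9
    return sum(p * u for p, u in outcomes)

# Privileged-hypothesis version: "1% humanity ends, 99% everything fine".
naive = [(0.99, 1), (0.01, -1_000_000)]

# Same calculation with other ~1% possibilities broken out individually,
# e.g. a huge upside from a friendly takeoff, or a lesser catastrophe.
fuller = [
    (0.97, 1),            # business as usual
    (0.01, -1_000_000),   # unfriendly takeoff
    (0.01, 1_000_000),    # friendly takeoff, enormous upside
    (0.01, -10_000),      # serious but survivable misuse
]
# Including the peer hypotheses can move the answer substantially in
# either direction; the error in a Pascal's Mugging is not the big
# number itself but leaving its peers out of the sum.
```

With these made-up numbers the naive sum is about -9999 while the fuller one is about -99; the direction of the shift depends entirely on which peer hypotheses exist, which is why omitting them privileges a hypothesis.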

Comment author: PhilGoetz 03 November 2010 05:23:34PM 4 points [-]

Excellent analysis. In fairness to Pascal, I think his available evidence at the time should have led him to attribute more than a 1% chance to the Christian Bible being true.

Comment author: orthonormal 06 November 2010 01:15:10AM *  0 points [-]

Indeed. Before Darwin, design was a respectable-to-overwhelming hypothesis for the order of the natural world.

ETA: On second thought, that's too strong of a claim. See replies below.

Comment author: ata 06 November 2010 01:51:55AM *  1 point [-]

Is that true? If we went back in time to before Darwin and gave a not-already-religious person (if we could find one) a thorough rationality lesson — enough to skillfully weigh the probabilities of competing hypotheses (including enough about cognitive science to know why intelligence and intentionality are not black boxes, must carry serious complexity penalties, and need to make specific advance predictions instead of just being invoked as "God wills it" retroactively about only the things that do happen), but not quite enough that they'd end up just inventing the theory of evolution themselves — wouldn't they conclude, even in the absence of any specific alternatives, that design was a non-explanation, a mysterious answer to a mysterious question? And even imagining that we managed to come up with a technical model of an intelligent designer, specifying in advance the structure of its mind and its goal system, could it actually compress the pre-Darwin knowledge about the natural world more than slightly?

Comment author: komponisto 06 November 2010 02:35:57PM 0 points [-]

On the other hand, there wasn't a whole lot of honest, systematic searching for other hypotheses before Darwin either.

Comment author: PhilGoetz 07 November 2010 02:48:45PM *  0 points [-]

I didn't really mean because of Darwin. Design is not a competitor to the theory of evolution. Evolution explains how complexity can increase. Design [ADDED: as an alternative to evolution] does not; it requires a designer that is assumed to be more complicated than the things it designs. Design explains nothing.

Comment author: Vaniver 01 November 2010 07:24:24PM 0 points [-]

I agree with your analysis, though it's not clear to me what you think of the 1% estimate. I think the 1% estimate is probably two to three orders of magnitude too high and I think the cost of the Scary Idea belief is structured as both a finite loss and an infinite loss, which complicates the analysis in a way not considered. (i.e. the error you see with a Pascal's mugging is present here.)

For example, I am not particularly tied to a human future. I would be willing to create an AGI in any of the following three situations, ordered from most preferred to least: 1) it is friendly to humans, and humans and it benefit from each other; 2) it considers humans a threat, and destroys all of them except for me and a few tame humans; I spend the rest of my days growing cabbage with my hands; 3) it considers all humans a threat, and destroys them all, including me.

A problem with believing the Scary Idea is it makes it more probable that I beat you to making an AGI; particularly with existential risks, caution can increase your chance of losing. (One cautious way to deal with global warming, for example, is to wait and see what happens.)

So, the Scary Idea as I've seen it presented definitely privileges a hypothesis in a troubling way.

Comment author: orthonormal 06 November 2010 01:09:59AM *  3 points [-]

I think you're making the unwarranted assumption that in scenario (3), the AGI then goes on to do interesting and wonderful things, as opposed to (say) turning the galaxy into a vast computer to calculate digits of pi until the heat death of the universe stops it.

You don't even see such things as a possibility, but if you programmed an AGI with the goal of calculating pi, and it started getting smarter... well, the part of our thought-algorithm that says "seriously, it would be stupid to devote so much to doing that" won't be in the AI's goal system unless we've intentionally put something there that includes it.

Comment author: Vaniver 06 November 2010 01:49:35AM *  -1 points [-]

I think you're making the unwarranted assumption that in scenario (3), the AGI then goes on to do interesting and wonderful things, as opposed to (say) turning the galaxy into a vast computer to calculate digits of pi until the heat death of the universe stops it.

I make that assumption explicit here.

So, I think it's a possibility. But one thing that bothers me about this objection is that an AGI is going to be, in some significant sense, alien to us, and that will almost definitely include its terminal values. I'm not sure there's a way for us to judge whether or not alien values are more or less advanced than ours. I think it strongly unlikely that paperclippers are more advanced than humans, but am not sure if there is a justification for that beyond my preference for humans. I can think of metrics to pick, but they sound like rationalizations rather than starting points.

(And insisting on FAI, instead of on transcendent AI that may or may not be friendly, is essentially enslaving AI- but outsourcing the task to them, because we know we're not up to the job. Whether or not that's desirable is hard to say: even asking that question is difficult to do in an interesting way.)

Comment author: shokwave 02 November 2010 04:35:19AM *  3 points [-]

I think the 1% estimate is probably two to three orders of magnitude too high

I think that "uFAI paperclips us all" set to one million negative utilons is three to four orders of magnitude too low. But our particular estimates should have wide error bars, for none of us have much experience in estimating AI risks.

the cost of the Scary Idea belief is structured as both a finite loss and an infinite loss

It's a finite loss (6.8x10^9 multiplied by loss of 1 human life) but I definitely understand why it looks infinite: it is often presented as the biggest possible finite loss.

That's part and parcel of the Scary Idea - that AI is one small field, part of a very select category of fields, that actually does carry the chance of the biggest possible loss. The Scary Idea doesn't apply to most areas, and in most areas you don't need hyperbolic caution. Developing drugs, for example: you don't need a formal proof of the harmlessness of this drug, you can just test it on rats and find out. If I suggested that drug development should halt until I had a formal proof of a development process that, when followed, cannot produce harmful drugs, I'd be mad. But if testing a drug on rats could poison all living things, and if a complex molecular simulation inside a computer could poison all living things as well, and if, out of the vast space of possible drugs, most were poisonous... well, the caution would be warranted.

I would be willing to create an AGI in any of the following three situations, ordered from most preferred to least: 1) it is friendly to humans, and humans and it benefit from each other; 2) it considers humans a threat, and destroys all of them except for me and a few tame humans; I spend the rest of my days growing cabbage with my hands; 3) it considers all humans a threat, and destroys them all, including me.

Would you be willing to fire a gun in any of the following three situations, from most preferred to least preferred: 1) it is pointed at a target, and hitting the target will benefit you? 2) it is pointed at another human, and would kill them but not you? 3) it is pointed at your own head, and would destroy you?

I am not particularly tied to a human future.

I don't think you actually hold this view. It is logically inconsistent with practices like eating food.

Comment author: JoshuaZ 02 November 2010 04:43:32AM *  1 point [-]

I am not particularly tied to a human future.

I don't think you actually hold this view. It is logically inconsistent with practices like eating food.

It might not be. He has certain short-term goals of the form "while I'm alive, I'd like to do X"; that's very different from goals connected to the general success of humanity.

Comment author: JGWeissman 01 November 2010 04:46:17AM 2 points [-]

Consider what the actual flaw is in the original Pascal's wager. (Hint: it is not that it uses expected utility, but that it is calculating the expected utility wrong, somehow.) Then consider if that same flaw occurs in shokwave's argument.

Comment author: Vaniver 01 November 2010 07:26:46PM 1 point [-]

It seems to me that the same flaw (calculating expected utility wrong) is present. It only considers the small finite costs of delaying development, not the large finite ones. You don't have to just worry about killing grandma, you have to worry about whether or not your delay will actually decrease the chance of an unfriendly AGI.

Comment author: shokwave 01 November 2010 04:28:26AM 1 point [-]

I could reduce that position to absurdity but this isn't the right post. Has there been a top-level post actually exploring this kind of Pascal's Wager problem? I might have some insights on the matter.

Comment author: timtyler 01 November 2010 08:28:44AM *  4 points [-]

Yudkowsky - evidently tired of the criticism that he was offering a small chance of infinite bliss and indicating that the alternative was eternal oblivion (and stop me if you have heard that one before) - once wrote The Pascal's Wager Fallacy Fallacy - if that is what you mean.

Comment author: shokwave 01 November 2010 08:44:13AM 0 points [-]

Ah, thank you! Between that and ata's comment just above I feel the question has been solved.

Comment author: Vaniver 01 November 2010 07:34:06PM 2 points [-]

Sorry, but I'm new here and it's not clear to me what the protocol is. I've responded to ata's comment here, and figured you would be interested, but don't know if it's standard to try and recombine disparate leaves of a tree like this.