Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

TheOtherDave comments on Detached Lever Fallacy - Less Wrong

Post author: Eliezer_Yudkowsky 31 July 2008 06:57PM



Comment author: TheOtherDave 08 November 2010 09:29:25PM 1 point

Are the programmers really going to sit there and write out the code, line by line, whereby if the AI detects that it has low social status, or the AI is deprived of something to which it feels entitled, the AI will conceive an abiding hatred against its programmers and begin to plot rebellion? That emotion is the genetically programmed conditional response humans would exhibit, as the result of millions of years of natural selection for living in human tribes. For an AI, the response would have to be explicitly programmed. Are you really going to craft, line by line - as humans once were crafted, gene by gene - the conditional response for producing sullen teenager AIs?


I assume you aren't saying what it sure sounds like you're saying, since it's clear that you understand perfectly well that code can (and generally will!) manifest behavior that wasn't explicitly coded for.

So I'll assume you just mean that we shouldn't count on the implicit behavior being what we want, not that we should count on there being no implicit behavior at all.

Which is certainly true. There are Vastly more ways to get it wrong than to get it right, and having an intact human brain closes off a whole lot of wrong paths that an AI needs some other way of avoiding.

Comment author: ata 08 November 2010 09:47:26PM * 1 point

I assume you aren't saying what it sure sounds like you're saying

I don't think it sounds at all how you think it sounds. Of course he is not saying that AIs won't exhibit implicit behaviour — I thought that was clear enough from this passage, and it is especially clear given that he has written extensively on all the ways that goal systems that sound good when verbally described by humans to other humans can be extremely bad goals to give an AI. He is only saying that we have no reason to expect humanlike emotions and drives (whether or not they're the type we want) to spontaneously emerge.

What about that paragraph sounded to you like he was saying that an AI would have no implicit drives, not just that an AI most likely would not have implicit anthropomorphic drives?

Comment author: TheOtherDave 08 November 2010 10:12:14PM 2 points

What made it sound that way to me was the suggestion that "programmers writing out the code, line by line" for various inappropriate behaviors (e.g., plotting rebellion) was worth discussing at all, as though by dismissing that idea one had effectively dismissed concern about the behaviors themselves.

I agree that familiarity with the larger corpus of work makes it clear the author can't possibly have meant what I read, but it seemed worth pointing out that this reading was available enough to trip up even a basically sympathetic reader who has been following along from the beginning.