anotheruser comments on asking an AI to make itself friendly - Less Wrong Discussion

-4 Post author: anotheruser 27 June 2011 07:06AM

Comment author: anotheruser 29 June 2011 06:53:34AM -2 points

That would have to be a really sophisticated bug to misinterpret "always answer questions truthfully as far as possible while admitting uncertainty" as "kill all humans". I'd imagine that anything so drastic could be found and corrected long before then. Consider that this is its entire goal set: it knows no motivation other than to respond truthfully. It doesn't care about the survival of humanity, or about itself, or about how reality really is. All it cares about is answering questions to the best of its abilities.

I don't think this goal would be all that hard to define either, as "the truth" is a pretty simple concept. As long as it deals with uncertainty in the right way (by admitting it), how could this be misinterpreted? Friendliness is far harder to define because we don't even have a definition for it ourselves; there are far too many things to consider when defining "friendliness".

Comment author: Larks 29 June 2011 03:24:15PM 4 points

Trivial failure case: the AI turns the universe into hardware to support really big computations, so it can be really sure it's got the right answer, and also calibrate itself really well on the uncertainty.