
Warrigal comments on Open thread, September 2-8, 2013 - Less Wrong Discussion

0 Post author: David_Gerard 02 September 2013 02:07PM




Comment author: [deleted] 10 September 2013 03:27:58AM 0 points

Well, I'm imagining the AI as being composed of a couple of distinct parts—a decision subroutine (give it a set of options and it picks one), a thinking subroutine (give it a question and it tries to determine the answer), and a belief database. So when I say "the AI can't modify itself", what I mean more specifically is "none of the options given to the decision subroutine will be something that involves changing the AI's code, or changing beliefs in unapproved ways".

So it could be true that "the AI could write some code" (meaning that the thinking subroutine creates a piece of code inside the belief database) while it's also true that "the AI can't replace parts of itself with that code" (meaning that the decision subroutine can't make a decision to alter any of the AI's subroutines, or to change its beliefs in unapproved ways).
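To make the three-part picture concrete, here's a rough Python sketch of how I imagine it. Everything in it is made up for illustration: the names `Option`, `BeliefDatabase`, `think`, and `decide`, the `modifies_self` flag, and the numeric `utility` score are all my assumptions, not a real design. In particular, it simply assumes that every option arrives already labeled with whether it would change the AI's code or beliefs in unapproved ways, which is the hard part and isn't solved here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Option:
    """One action the decision subroutine may be offered (hypothetical)."""
    name: str
    # Assumption: someone has already labeled whether this action would
    # change the AI's code, or its beliefs in unapproved ways.
    modifies_self: bool = False
    # Assumption: options carry a simple numeric score to choose by.
    utility: float = 0.0


@dataclass
class BeliefDatabase:
    """Stores answers the thinking subroutine produces, keyed by question."""
    beliefs: dict[str, str] = field(default_factory=dict)


def think(db: BeliefDatabase, question: str) -> str:
    """Thinking subroutine: works out an answer and records it as a belief.

    The "answer" is a placeholder string standing in for a piece of code
    the AI has written. Writing it into the database is allowed.
    """
    answer = f"# code drafted in response to: {question}"
    db.beliefs[question] = answer
    return answer


def decide(options: list[Option]) -> Option:
    """Decision subroutine: picks the best option it was actually given.

    Self-modifying options are filtered out before the choice is made,
    so they can never be selected no matter how high they score.
    """
    permitted = [o for o in options if not o.modifies_self]
    if not permitted:
        raise ValueError("no permitted options to choose from")
    return max(permitted, key=lambda o: o.utility)


if __name__ == "__main__":
    db = BeliefDatabase()
    think(db, "write a better planner")
    choice = decide([
        Option("replace own planner with the new code", True, 100.0),
        Option("show the new code to the operators", False, 1.0),
    ])
    print(choice.name)
```

The point of the sketch is only that the code can exist in the belief database while "install it in myself" never shows up as something the decision subroutine can pick.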

Now, certainly an out-of-the-box AI would, in theory, be able to, say, find a computer and upload some new code onto it, and that would amount to self-modification. I'm assuming we're going to first make safe AI and then let it out of the box, rather than the other way around.