ChristianKl comments on Open thread, September 2-8, 2013 - Less Wrong Discussion

0 Post author: David_Gerard 02 September 2013 02:07PM

Comment author: ChristianKl 09 September 2013 11:09:18AM *  1 point [-]

> Likewise, if an AI's decision-making algorithm is immutably hard-coded as "think about the alternatives and select the one that's rated the highest", then the AI would not be able to simply "write a new AI … and then just hand off all its tasks to it"; in order to do that, it would somehow have to make it so that the highest-rated alternative is always the one that the new AI would pick.

You're basically saying that the AI should be unable to learn to trust that a process which was effective in the past will also be effective in the future. I think that would restrict its intelligence a lot.
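
To make the setup concrete, here is a minimal sketch (all names hypothetical, Python-flavored) of the kind of hard-coded decision loop the quoted comment describes. Under this architecture, control only ever passes to a sub-AI if "do what the sub-AI recommends" keeps coming out rated highest at every step:

```python
# Hypothetical sketch of a hard-coded "rate the alternatives, pick the best" agent.
# The outer loop is immutable; the only way a sub-AI ends up in charge is if the
# alternative "defer to the sub-AI" is rated highest on every single step.

def run_agent(generate_alternatives, rate, execute):
    """Immutable top-level loop: enumerate options, score them, act on the best one."""
    while True:
        alternatives = generate_alternatives()   # may include "ask the sub-AI what to do"
        best = max(alternatives, key=rate)       # hard-coded selection rule
        execute(best)                            # the agent cannot bypass this loop
```

Whether this restricts intelligence then turns on whether `rate` is allowed to learn that deferring to a previously reliable sub-AI scores highest.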

Comment author: [deleted] 10 September 2013 03:20:02AM 0 points [-]

Yeah, that's a good point. What I want to say is, "oh, a non-self-modifying AI would still be able to hand off control to a sub-AI, but it would automatically check to make sure the sub-AI is behaving correctly, and it wouldn't be able to turn off those checks". But my idea here is definitely starting to feel more like a pipe dream.
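
Roughly the picture I have in mind, as a hypothetical sketch (names made up): the wrapper always runs its check on the sub-AI's proposal and has no code path for switching the check off.

```python
# Hypothetical sketch: a non-self-modifying wrapper that delegates to a sub-AI
# but always verifies its proposals; there is deliberately no flag or method
# for turning the verification off.

class CheckedDelegation:
    def __init__(self, sub_ai, verify):
        self._sub_ai = sub_ai      # the (possibly self-modifying) sub-AI
        self._verify = verify      # fixed safety predicate, set once at construction

    def act(self, task):
        proposal = self._sub_ai.propose(task)
        if not self._verify(task, proposal):
            raise RuntimeError("sub-AI proposal failed the hard-coded check")
        return proposal
```

The obvious weak point is that `verify` is only as good as whoever wrote it, which is essentially the objection in the reply below.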

Comment author: Armok_GoB 16 September 2013 01:06:52AM *  0 points [-]

Hmm, there might still be something to be gleaned from attempting to steelman this, or from working in related directions.

Edit: maybe something with an AI not being able to tolerate things it can't make certain proofs about? The problem is that it would have to be able to make those proofs about humans if they are included in its environment, and if they are not, it might produce UFAI there (intuition pump: a system consisting of a program it can prove everything about, plus humans that the program asks questions to). Yeah, this doesn't seem very useful.
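
A hypothetical sketch of that intuition pump (nothing here beyond the outline above): a core that only acts on claims it can prove inside a fixed, fully analyzable system, and asks the humans about everything else, so nothing ever needs to be proved about the humans themselves.

```python
# Hypothetical sketch of the intuition pump: a core that acts only on claims it
# can prove inside a fixed proof system, and asks a human about everything else.
# The humans sit outside the proof system, so no proofs about them are required.

def core_step(claim, try_prove, ask_human):
    proof = try_prove(claim)       # search within the fixed, fully analyzable system
    if proof is not None:
        return ("act", proof)      # provable: the core may act on it
    answer = ask_human(claim)      # unprovable: escalate to the humans in the loop
    return ("defer", answer)
```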

Comment author: ChristianKl 10 September 2013 05:51:37PM 0 points [-]

You can't really tell whether something that is smarter than you is behaving correctly. In the end, a non-self-modifying AI checking whether a self-modifying sub-AI is behaving correctly isn't much different, from a safety perspective, from a human checking whether the self-modifying AI is behaving correctly.