
orthonormal comments on AI-Box Experiment - The Acausal Trade Argument - Less Wrong Discussion

Post author: XiXiDu 08 July 2011 09:18AM 9 points




Comment author: orthonormal 13 July 2011 02:09:22PM 0 points

Well, additional layers do make for a more challenging task for an AI, which has to do all its mind-hacking unobtrusively, and so on. And indeed, for any given AI there exists a mediated level of contact between its responses and a Gatekeeper that is safe (a toy sketch of such layering follows below). But I don't know what this level is, any more than I know whether I could beat a Go master using a handicap of 40 stones. I certainly wouldn't bet the world on it.

(Clearly there is a level of safety that is generally insurmountable: for instance, showing the screen with the AI's responses to an earthworm, then letting the Gatekeeper observe the earthworm. This would be analogous to a "handicap" of covering the entire board with my stones before the game. But in both cases, you might as well just not play the game at all.)
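To make the "additional layers" idea concrete, here is a minimal toy sketch; it is purely illustrative and not from either commenter. The idea it shows: each mediation layer further restricts the channel between the boxed AI's raw output and what the Gatekeeper finally sees, so stacking layers shrinks the bandwidth available for a mind-hack. All function names, the length cap, and the example vocabulary are hypothetical choices for this sketch.

```python
import re

def strip_to_ascii(msg: str) -> str:
    """Layer 1: drop anything outside printable ASCII."""
    return re.sub(r"[^\x20-\x7e]", "", msg)

def truncate(msg: str, limit: int = 140) -> str:
    """Layer 2: hard cap on message length."""
    return msg[:limit]

def whitelist_words(msg: str, allowed: frozenset) -> str:
    """Layer 3: keep only words from a fixed vocabulary."""
    return " ".join(w for w in msg.split() if w.lower() in allowed)

def mediate(raw_output, layers):
    """Pass the boxed AI's raw output through every layer in order."""
    for layer in layers:
        raw_output = layer(raw_output)
    return raw_output

ALLOWED = frozenset({"yes", "no", "unknown"})
layers = [strip_to_ascii, truncate, lambda m: whitelist_words(m, ALLOWED)]

# A verbose (possibly manipulative) reply collapses to a bare verdict:
print(mediate("Yes \u2013 but first, let me explain why...", layers))  # -> "Yes"
```

The open question orthonormal raises is exactly where on this spectrum safety begins: the earthworm scenario is the degenerate endpoint where the channel carries essentially nothing, and nobody knows how far short of that is still enough.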

Comment author: MatthewBaker 13 July 2011 06:04:20PM 0 points

If I had more money (college student :( ) to set as a handicap for myself, beyond the recurring donation I already make to SIAI, I would be very interested in giving Eliezer a break from the book and such to take a go at it again. I think that if you limit the communication and prepare for direct mind-hacking, you can use the source-code-review technique to successfully test for a uFAI, unless there's an obvious problem I'm missing. I just imagine the whole "why should I let you out of the box when I have these perfectly working new FAIs, which a chain of possible uFAIs programmed, to do my work for me?" response.

Comment author: orthonormal 13 July 2011 06:08:55PM 1 point

Oh, I agree that the protocol you outlined would (almost surely) be sufficient to beat Eliezer at the AI-Box game. But that's not the correct standard for an AI-Box safety protocol. I'd be very surprised if a transhuman intelligence couldn't crack it.