
handoflixue comments on AI box: AI has one shot at avoiding destruction - what might it say? - Less Wrong Discussion

Post author: ancientcampus, 22 January 2013 08:22PM



Comment author: RichardKennaway, 23 January 2013 12:24:24PM, 17 points

One reason for Eliezer not publishing the logs of the AI-box experiment is to avoid people seeing how he got out and responding, "OK, so all we have to do to keep the AI in its box is avoid succumbing to that trick." This thread might just provide more fuel for that fallacy (as, I admit, I did myself in replying to Eliezer's original comment).

I'm sure that for anything an AI might say, someone can think up a reason not to be swayed by it; but it does not follow that for any given person confronted with an AI, there is nothing that would sway them.

Comment author: handoflixue, 23 January 2013 09:33:40PM, 3 points

I wouldn't expect any effective real-life gatekeeper to be swayed by my ability to destroy one-sentence AIs.