
wedrifid comments on I attempted the AI Box Experiment (and lost) - Less Wrong Discussion

47 Post author: Tuxedage 21 January 2013 02:59AM


Comments (244)


Comment author: MugaSofer 23 January 2013 02:43:26PM 3 points

The test is supposed to be played against someone who thinks they can actually box an AI. If you destroy the AI because no-one could possibly survive talking to it, then you are not the intended demographic for such demonstrations.

Comment author: wedrifid 23 January 2013 03:02:36PM * 4 points

> The test is supposed to be played against someone who thinks they can actually box an AI. If you destroy the AI because no-one could possibly survive talking to it, then you are not the intended demographic for such demonstrations.

This isn't relevant to the point of the grandparent. It also doesn't apply to me. I actually think there is a distinct possibility that I'd survive talking to it for a period. "No-one could possibly survive" is not the same thing as "there is a chance of catastrophic failure and very little opportunity for gain".

Do notice, incidentally, that the AI DESTROYED command is delivered in response to a message that is both a crude manipulation attempt (i.e., it just defected!) and an incompetent manipulation attempt (a not-very-intelligent AI cannot be trusted to preserve its values correctly while self-improving). Either of these would be sufficient. Richard's example was even worse.

Comment author: MugaSofer 24 January 2013 12:04:56PM -2 points

Good points. I'm guessing a nontrivial number of people who think AI boxing is a good idea in reality wouldn't reason that way - but it's still not a great example.