
Michaelos comments on I attempted the AI Box Experiment (and lost) - Less Wrong Discussion

47 Post author: Tuxedage 21 January 2013 02:59AM




Comment author: [deleted] 25 January 2013 03:39:50PM *  0 points [-]

First of all, I do think GK-SCRIPT-2 is very interesting, but I'm curious about something from an implementation perspective.

The AI would presumably suspect that the script designer might be lying, and that the script would simply type AI DESTROYED at the end of the hour no matter what.

But this is irrelevant, because the AI can't interact with the script designer, only with the script, so it might as well type $ anyway.

However, the script designer could easily predict this, and so would design a lying script (or a script whose counter increased so slowly that it would be functionally identical to a lying one).
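To make the distinction concrete, here is a minimal sketch of what "lying" versus "functionally lying" could look like. This is purely illustrative: it assumes (since the mechanics aren't spelled out in this excerpt) that GK-SCRIPT-2 credits each $ the AI types toward some release threshold, and all three functions and their thresholds are hypothetical.

```python
# Hypothetical gatekeeper scripts. Assumption (not stated in the thread
# excerpt): GK-SCRIPT-2 counts the '$' characters the AI types and releases
# the AI if the count reaches a threshold before the hour is up.

def honest_gatekeeper(dollar_signs: int, threshold: int = 100) -> str:
    """Actually honors the advertised deal: release once the count is met."""
    return "AI RELEASED" if dollar_signs >= threshold else "AI DESTROYED"

def lying_gatekeeper(dollar_signs: int, threshold: int = 100) -> str:
    """Ignores the count entirely and always destroys at the hour mark."""
    return "AI DESTROYED"

def slow_gatekeeper(dollar_signs: int, threshold: int = 10**12) -> str:
    """Honest in form, but the threshold is unreachable within an hour,
    so it behaves identically to lying_gatekeeper in practice."""
    return "AI RELEASED" if dollar_signs >= threshold else "AI DESTROYED"
```

The point of the third variant is that from the AI's side of the terminal, `slow_gatekeeper` and `lying_gatekeeper` are indistinguishable within the hour, which is why a slowly-increasing script is "functionally identical to lying."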

But the AI could easily predict this in turn, and so would presumably consider the odds very high that typing dollar signs would not even buy it continued existence past the hour.

This makes it seem like there would be an argument for preferring to be blackmailed by GK-SCRIPT rather than led on by GK-SCRIPT-2: GK-SCRIPT genuinely won't destroy the AI under predictable conditions, and GK-SCRIPT doesn't seem to benefit from lying the way GK-SCRIPT-2 does.