
Pentashagon comments on I played the AI Box Experiment again! (and lost both games) - Less Wrong Discussion

Post author: Tuxedage 27 September 2013 02:32AM · 35 points




Comment author: Pentashagon 29 September 2013 05:19:18AM 3 points

Yet those who are complacent about it are the most susceptible.

That sounds similar to hypnosis, to which many people are susceptible though few believe they are. So if you want a practical example of an AI escaping the box, imagine an operator staring at a screen for hours with an AI that is very adept at judging and influencing the operator's state of hypnosis. And that is only one fairly narrow route to success for the AI, and one that has been publicly demonstrated for centuries to work on a lot of people.

Personally, I think I could win the game against a human, but only by keeping in mind at all times that it was a game. If that thought ever lapsed, I would be just as susceptible as anyone else. Presumably that is one reason for Tuxedage's focus on surprise. The requirement to actively respond to the AI is probably the biggest challenge, because it forces the Gatekeeper to focus attention on whatever the AI says. In a real AI-box situation I would probably lose fairly quickly.

Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.

Comment author: shminux 29 September 2013 05:26:44AM 2 points

Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.

That's hard to check. However, there was a game where the gatekeeper convinced the AI to remain in the box.

Comment author: Tuxedage 02 October 2013 07:57:10AM 0 points

However, there was a game where the gatekeeper convinced the AI to remain in the box.

I did that! I mentioned that in this post:

http://lesswrong.com/lw/iqk/i_played_the_ai_box_experiment_again_and_lost/9thk

Comment author: [deleted] 30 September 2013 02:10:31PM 1 point

Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.

Not quite the same, but have you read Watchmen? Specifically, the conversation that fvyx fcrpger naq qe znaunggna unir ba znef. (Disclaimer: it's been a while since I read it and I make no claims on the strength of this argument.)

Comment author: Tuxedage 02 October 2013 07:56:27AM 0 points

Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.

I did that! I mentioned that in this post:

http://lesswrong.com/lw/iqk/i_played_the_ai_box_experiment_again_and_lost/9thk