
shminux comments on I played the AI Box Experiment again! (and lost both games) - Less Wrong Discussion

35 Post author: Tuxedage 27 September 2013 02:32AM



Comment author: shminux 28 September 2013 07:14:10AM 3 points [-]

It's not fabricated, be sure of that (knowing Tuxedage from IRC, I'd put the odds at 100,000:1 or more against fabrication). And yes, it's strange. I, too, cannot imagine what someone could possibly say that would make me get even close to considering letting them out of the box. Yet those who are complacent about it are the most susceptible.

Comment author: [deleted] 28 September 2013 04:37:03PM *  10 points [-]

knowing Tuxedage from IRC, I'd put the odds of 100,000:1 or more against fabrication

I know this is off-topic, but is it really justifiable to put such high odds on this? I wouldn't use odds that high even if I had known the person intimately for years. Is it justifiable, or is this just my paranoid way of thinking?

Comment author: Pentashagon 29 September 2013 05:19:18AM 3 points [-]

Yet those who are complacent about it are the most susceptible.

That sounds similar to hypnosis, to which a lot of people are susceptible but few think they are. So if you want a practical example of an AI escaping the box, imagine an operator staring at a screen for hours with an AI that is very adept at judging and influencing the operator's hypnotic state. And that's only one fairly narrow avenue of success for the AI, one that has been publicly demonstrated for centuries to work on a lot of people.

Personally, I think I could win the game against a human, but only by keeping in mind at all times the fact that it was a game. If that thought ever lapsed, I would be just as susceptible as anyone else. Presumably that is one reason for Tuxedage's focus on surprise. The requirement to actively respond to the AI is probably the biggest challenge, because it forces the Gatekeeper to focus attention on whatever the AI says. In a real AI-box situation I would probably lose fairly quickly.

Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.

Comment author: shminux 29 September 2013 05:26:44AM 2 points [-]

Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.

That's hard to check. However, there was a game where the gatekeeper convinced the AI to remain in the box.

Comment author: Tuxedage 02 October 2013 07:57:10AM 0 points [-]

However, there was a game where the gatekeeper convinced the AI to remain in the box.

I did that! I mentioned that in this post:

http://lesswrong.com/lw/iqk/i_played_the_ai_box_experiment_again_and_lost/9thk

Comment author: [deleted] 30 September 2013 02:10:31PM 1 point [-]

Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.

Not quite the same, but have you read Watchmen? Specifically, the conversation that fvyx fcrpger naq qe znaunggna unir ba znef. (Disclaimer: it's been a while since I read it and I make no claims on the strength of this argument.)

Comment author: Tuxedage 02 October 2013 07:56:27AM *  0 points [-]

Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.

I did that! I mentioned that in this post:

http://lesswrong.com/lw/iqk/i_played_the_ai_box_experiment_again_and_lost/9thk

Comment author: Ishaan 28 September 2013 08:17:08PM *  3 points [-]

Yeah, my gut doesn't feel like it's fabricated - Tuxedage and Eliezer would both have to be in on it, and that seems really unlikely. And I can't think of a motive, except perhaps as some sort of public lesson in noticing confusion, and that too seems far-fetched.

I've just picked up the whole "if it's really surprising, it might be because it's not true" instinct from having been burned in the past by believing scientific findings that were later debunked, and now Less Wrong has condensed that instinct into a snappy little "notice confusion" cached phrase. And this is pretty confusing.

I suppose a fabrication would be more confusing, in one sense.

Comment author: enfield 02 October 2013 11:56:15PM *  0 points [-]

yeah, i think appealing to fabrication can be a bit hand-wavy sometimes. like you're saying it's fabricated the way other things are fabricated (since, as we all know, fabrication happens). but not every fabrication is the same or equally easy to pull off. saying it was fabricated doesn't say anything about how it was. but that's not even a question that enters one's mind when one thinks of fabrication. how? well, how anything else is fabricated, of course..

it can be as much a reaction of disbelief as it is an alternative explanation.