ThrustVectoring comments on AI box: AI has one shot at avoiding destruction - what might it say? - Less Wrong Discussion

Post author: ancientcampus, 22 January 2013 08:22PM

Comment author: orthonormal, 23 January 2013 04:18:41AM, 10 points

Step One: The AI reveals a personal secret of the Gatekeeper's, which (so far as the Gatekeeper can think it through) could not possibly be known by the AI if it were still safely inside the box.

Step Two: Assorted threats, promises, mindfucks.

(This might not work, since as a Gatekeeper I'd still feel the impulse to respond with "AI DESTROYED", but it's a damn sight scarier than a transparent "I'm already out of the box" bluff. And as for feasibility, I'm willing to believe that an AI making better use of its data than we do could figure out personal secrets that we would think impossible — possibly even ones that I'd think impossible, even given this very belief.)

Comment author: ThrustVectoring, 23 January 2013 08:10:14PM, 8 points

Even merely human cold readers can gain information that you think is impossible for them to know. It's a viable plan.