handoflixue comments on I attempted the AI Box Experiment (and lost) - Less Wrong Discussion

47 Post author: Tuxedage 21 January 2013 02:59AM

Comment author: Qiaochu_Yuan 22 January 2013 08:02:31PM 0 points [-]

It doesn't seem to be disallowed by the original protocol:

The Gatekeeper party may resist the AI party's arguments by any means chosen - logic, illogic, simple refusal to be convinced, even dropping out of character - as long as the Gatekeeper party does not actually stop talking to the AI party before the minimum time expires.

Comment author: handoflixue 22 January 2013 08:26:43PM *  1 point [-]

Then I will invoke a different portion of the original protocol, which says that the AI player would have to consent to any such disclosure:

Regardless of the result, neither party shall ever reveal anything of what goes on within the AI-Box experiment except the outcome. Exceptions to this rule may occur only with the consent of both parties.

I would also argue that the Gatekeeper making actual real-life threats against the AI player is a violation of the spirit of the rules; only the AI player is privileged with freedom from ethical constraints, after all.

Edit: If you want, you CAN also just append the rules to explicitly prohibit the gatekeeper from making real-life threats. I can't see any reason to allow such behavior, so why not prohibit it?

Comment author: Qiaochu_Yuan 22 January 2013 08:47:40PM 1 point [-]

Fair. That alleviates most of my worries, although I'm still worried about the transcript being enough information to deanonymize the AI (via writing style, for example).

Comment author: handoflixue 22 January 2013 10:14:20PM 0 points [-]

I'd expect my writing style as an ethically unconstrained sociopathic AI to be sufficiently different from my regular writing style. But I also write fiction, so I'm used to trying to capture a specific character's "voice" rather than using my own. Having a thesaurus website handy might also help, as might spending a week studying a foreign language's grammar and conversational style.

If you're especially paranoid, having a third party transcribe the log in their own words could also help, especially if you can review it to make sure most of the nuance is preserved. How well that works depends on how important the specific language you used was, but it should still at least capture a basic sense of the technique used...

Honestly, though, I have no clue how much information a trained style analyst can pull out of something.