
TheOtherDave comments on I attempted the AI Box Experiment again! (And won - Twice!)

Post author: Tuxedage 05 September 2013 04:49AM




Comment author: TheOtherDave 28 September 2013 09:22:05PM 1 point

I feel rather like you're having an argument with someone else, which I've wandered into by accident.

Once again: I wasn't trying to make a general prediction about how AI boxes fail or succeed; I was answering the question of under what circumstances a gatekeeper's ruthlessness might be relevant to the AI Box game.

And, sure: if we only implement oracle suggestions that we fully understand and can fully reverse-engineer in every detail, and if our techniques for doing so are robust enough that an agent smarter than we are can't come up with something human minds will systematically fail to notice (perhaps because there is no such thing to be found, our minds being that reliable), then the particular error I presumed for my example won't happen, and the gatekeeper's ruthlessness won't be necessary in that scenario.

Comment author: [deleted] 28 September 2013 10:44:12PM 1 point

You are right - I read more into your comment than was warranted. My apologies.