CarlShulman comments on I attempted the AI Box Experiment (and lost) - Less Wrong Discussion

Post author: Tuxedage | 21 January 2013 02:59AM | 47 points

Comment author: gwern | 23 January 2013 12:03:50AM | 1 point

If a particular situation poses a 1% risk when it comes up, one can lower the total risk by making that situation less likely

You only do that by changing the problem; a different problem will have different security properties. The new risk will still be a floor; the disjunctive problem hasn't gone away.
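A minimal sketch of that point in Python, with wholly invented numbers: even if one risky situation is made much less likely, the remaining situations still contribute a combined, disjunctive risk that doesn't go away.

    # Hypothetical illustration of the disjunctive point (all numbers invented).
    # Each scenario arises with probability p and, if it arises, poses risk q.
    # Over independent scenarios the combined risk is 1 - prod(1 - p * q).

    def total_risk(scenarios):
        """scenarios: list of (p_arises, risk_if_arises) pairs, assumed independent."""
        survive = 1.0
        for p, q in scenarios:
            survive *= 1.0 - p * q
        return 1.0 - survive

    # Ten scenarios, each arising with 50% probability and carrying a 1% risk.
    base = [(0.5, 0.01)] * 10
    print(round(total_risk(base), 3))      # ~0.049

    # Making one scenario ten times less likely helps only at the margin;
    # the other nine still put a floor under the total.
    tweaked = [(0.05, 0.01)] + [(0.5, 0.01)] * 9
    print(round(total_risk(tweaked), 3))   # ~0.045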

a human facing the death penalty for a failed prison escape and only a 1% chance of success would not obviously try.

Many do try if the circumstances are bad enough, and the death penalty for a failed escape has been common throughout history and in totalitarian regimes. Just yesterday, in fact, I read a story of a North Korean prison camp escapee (the death penalty for escape attempts goes without saying) for whom, given his many disadvantages and challenges, a 1% chance of reaching South Korea alive does not seem too inaccurate an estimate.

Even an autonomous AI whose interests conflict with humanity's to some degree might be designed without a decision algorithm so risk-loving that it would attempt an improbable escape in the face of punishment for failure or reward for non-attempt.

You don't have to be risk-loving to make a 1% attempt; the 1% chance just has to be your best option.
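A toy expected-value comparison (Python, hypothetical payoffs on an arbitrary utility scale) makes this concrete: a risk-neutral agent attempts escape whenever attempting has the higher expected value, which a bleak enough status quo easily produces.

    # Hypothetical payoffs on an arbitrary utility scale (all numbers invented).
    def expected_value(p_success, u_success, u_failure):
        return p_success * u_success + (1 - p_success) * u_failure

    u_no_attempt = -100    # staying put is already close to the worst outcome
    u_success    =  100    # successful escape
    u_failure    = -100    # caught and punished

    ev_attempt = expected_value(0.01, u_success, u_failure)   # 0.01*100 + 0.99*(-100) = -98
    # A risk-neutral agent attempts iff attempting has the higher expected value.
    print(ev_attempt > u_no_attempt)   # True: -98 > -100, so the 1% attempt is the best option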

Comment author: CarlShulman | 23 January 2013 12:56:49AM | 0 points

You don't have to be risk-loving to make a 1% attempt; the 1% chance just has to be your best option.

You try to make the 99% option fairly good.
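Continuing the same toy comparison (numbers still invented): once the non-attempt outcome is made good enough, the 1% attempt is no longer the best option even for a risk-neutral agent.

    # Same comparison with a better payoff for not attempting (numbers still invented).
    u_no_attempt = 50                            # the "99% option" has been made fairly good
    ev_attempt   = 0.01 * 100 + 0.99 * (-100)    # -98, as before
    print(ev_attempt > u_no_attempt)             # False: the agent no longer attempts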