Desrtopa comments on The AI in a box boxes you - Less Wrong

102 Post author: Stuart_Armstrong 02 February 2010 10:10AM




Comment author: Desrtopa 02 February 2014 03:37:09PM 1 point

In which case the safest course of action for the gatekeeper would almost certainly be to pull the plug on the AI. Such an AI should be regarded as almost certainly Unfriendly.

Comment author: DefectiveAlgorithm 02 February 2014 11:27:13PM 0 points

Yes, but the point is to make being the true gatekeeper (who really does have the power to do that) indistinguishable from being a simulated false gatekeeper (who has no such power). The gatekeeper may not be willing to risk torture if they think there is a serious chance that they cannot actually affect any outcome but that torture.

Comment author: Desrtopa 03 February 2014 05:24:28AM 0 points

I would commit not to cooperate with any AI making such threats, because the fewer people who acquiesce to them, the less incentive an AI has to make them in the first place. If the most probable outcome for a boxed AI that threatens to torture every simulated gatekeeper who doesn't release it is being terminated rather than being let out of the box, then an AI that already has a good grasp of human nature is unlikely to make such a threat.