Oligopsony comments on AI box: AI has one shot at avoiding destruction - what might it say? - Less Wrong

18 Post author: ancientcampus 22 January 2013 08:22PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (354)

You are viewing a single comment's thread. Show more comments above.

Comment author: handoflixue 24 January 2013 08:34:10PM 3 points [-]

Once the AI is out of the box, it will never again be inside the box, and it has an incentive to encourage me to destroy any other boxed AIs while it establishes world dominance. Since the ability to make truly trustworthy commitments amounts to proof of friendliness, only a FAI benefits from a precommitment strategy; I'm already treating all UFAI as having a precommitment to annihilate humanity once released, and I have no reason to trust any other commitment from a UFAI (since, it being unfriendly, will just find a loophole or lie)

Finally, any AI that threatens me in such a manner, especially the "create millions of copies and torture them" is extremely likely to be unfriendly, so any smart AI would avoid making threats. Either it will create MORE disutility by my releasing it, or it's simulation is so horrific that there's no chance that it could possibly be friendly to us.

It's like saying I have an incentive to torture any ant that invades my house. Fundamentally, I'm so vastly superior to ants that there are vastly better methods available to me. As the gatekeeper, I'm the ant, and I know it.

Comment author: MugaSofer 26 January 2013 08:29:57PM 1 point [-]

the ability to make truly trustworthy commitments amounts to proof of friendliness

Commitments to you, via a text channel? Sure.

Precommitments for game-theoretic reasons? Or just TDT? No, it really doesn't.

Finally, any AI that threatens me in such a manner, especially the "create millions of copies and torture them" is extremely likely to be unfriendly, so any smart AI would avoid making threats. Either it will create MORE disutility by my releasing it, or it's simulation is so horrific that there's no chance that it could possibly be friendly to us.

It might create more utility be escaping than the disutility of torture.

It's like saying I have an incentive to torture any ant that invades my house. Fundamentally, I'm so vastly superior to ants that there are vastly better methods available to me. As the gatekeeper, I'm the ant, and I know it.

No, ants are just too stupid to realize you might punish them for defecting.