wedrifid comments on Cryptographic Boxes for Unfriendly AI - Less Wrong

24 Post author: paulfchristiano 18 December 2010 08:28AM




Comment author: Eliezer_Yudkowsky 20 December 2010 06:17:52AM 3 points [-]

B is false.

Does this still hold if you remove the word "hostile", i.e., if the "friendliness" of the superintelligence you construct first is simply not known?

Heh. I'm afraid AIs of "unknown" motivations are known to be hostile from a human perspective. See Omohundro on the Basic AI Drives, and the Fragility of Value supersequence on LW.

Comment author: wedrifid 20 December 2010 06:42:46AM *  0 points [-]

Heh. I'm afraid AIs of "unknown" motivations are known to be hostile from a human perspective.

I'm not sure this is what D_Alex meant; however, a generous interpretation of 'unknown friendliness' could be that confidence in the friendliness of the AI is lower than considered necessary. For example, suppose there is an 80% chance that the unknown AI is friendly, and b) and c) are both counterfactually assumed to be true...

(Obviously the necessary '80%' could differ depending on how much of the uncertainty is due to pessimism regarding the possible limits of your own judgement, and also on your level of desperation at the time...)