cousin_it comments on What can you do with an Unfriendly AI? - Less Wrong

16 Post author: paulfchristiano 20 December 2010 08:28PM


Comments (127)


Comment author: Eliezer_Yudkowsky 21 December 2010 12:18:58AM 5 points

Each genie you produce will try its best to find a solution to the problem you set it, that is, they will respond honestly. By hypothesis, the genie isn't willing to sacrifice itself just to destroy human society.

Thanks to a little thing called timeless decision theory, it's entirely possible that all the genies in all the bottles can cooperate with each other by correctly predicting that they are all in similar situations, finding Schelling points, and coordinating around them by predicting each other based on priors, without causal contact. This does not require that genies have similar goals, only that they can do better by coordinating than by not coordinating.
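A toy sketch of the coordination idea, not of timeless decision theory itself: several agents never exchange a message, each sees the same options in a different order, and each applies the same salience rule that it expects the others to apply. The option names and the rule (shortest name, ties broken alphabetically) are made up for illustration; any rule would do as long as every agent can derive it from shared priors.

```python
from typing import List, Sequence


def focal_choice(options: Sequence[str]) -> str:
    """Pick the option an agent expects every other agent to pick.

    Hypothetical salience rule: shortest name, ties broken alphabetically.
    The rule depends only on the set of options, not on their order.
    """
    return min(options, key=lambda o: (len(o), o))


def run_agents(n_agents: int, options: Sequence[str]) -> List[str]:
    """Let each agent choose independently, with no communication."""
    choices = []
    for i in range(n_agents):
        # Each agent gets its own rotated view of the same options.
        shift = i % len(options)
        local_view = list(options[shift:]) + list(options[:shift])
        choices.append(focal_choice(local_view))
    return choices


choices = run_agents(5, ["noon", "dawn", "midnight", "dusk"])
print(choices)                 # every agent's independent pick
print(len(set(choices)) == 1)  # True when all agents converged
```

The agents here share a rule by construction, which is the easy case. The worry in the comment is about agents that have to infer a common focal point by modeling each other.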

Comment author: cousin_it 21 December 2010 12:34:14AM * 0 points

Right. This is very similar to some nightmare scenarios of AI escape that we discussed with Wei Dai earlier this autumn. Giving an AI a "wide" prior over the computational multiverse outside its box can make it do weird and scary things, like taking an action exactly when you're about to shut it off, or conspiring with other AIs as you outlined, or getting into acausal resource fights with Cthulhu and losing them... Part of me wants to see this stuff developed into a proper superweapon of math destruction, and another part of me goes "aieeeee" at the prospect.

Comment author: [deleted] 21 December 2010 02:24:42AM 1 point

"wide" prior over the computational multiverse outside its box

Can you explain what you mean?

Comment author: cousin_it 21 December 2010 10:18:57AM * 0 points

If the AI uses something like the Solomonoff prior, it can work out which worlds are most likely to contain such an AI. With a little intelligence it can probably figure out from its programming that humans are bipedal, that we run many AIs in boxes, that aliens on Alpha Centauri have built another AI that could turn out to be really helpful, etc.
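To make "something like the Solomonoff prior" a bit more concrete, here is a finite toy analogue. The real Solomonoff prior sums over all programs for a universal machine and is uncomputable. This sketch instead uses four hand-picked "worlds" with made-up description lengths, gives each a prior weight of 2^-length, discards the ones that contradict the observations, and renormalizes. All names and lengths are assumptions for illustration.

```python
from fractions import Fraction
from typing import Callable, Dict, List, Tuple

# Each hypothetical "world" is (description length in bits, rule giving bit i).
# The lengths are invented stand-ins for program length.
Hypothesis = Tuple[int, Callable[[int], int]]

HYPOTHESES: Dict[str, Hypothesis] = {
    "all_zeros": (2, lambda i: 0),
    "all_ones": (2, lambda i: 1),
    "alternating": (3, lambda i: (i + 1) % 2),            # 1, 0, 1, 0, ...
    "ones_every_third": (5, lambda i: 1 if i % 3 == 0 else 0),  # 1, 0, 0, 1, ...
}


def posterior(observed: List[int]) -> Dict[str, Fraction]:
    """Weight each world by 2^-length, keep those matching the data, normalize."""
    weights: Dict[str, Fraction] = {}
    for name, (length, rule) in HYPOTHESES.items():
        consistent = all(rule(i) == bit for i, bit in enumerate(observed))
        weights[name] = Fraction(1, 2 ** length) if consistent else Fraction(0)
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no hypothesis is consistent with the observations")
    return {name: w / total for name, w in weights.items()}


post = posterior([1, 0])
for name, p in post.items():
    print(name, p)
```

After observing `[1, 0]`, two worlds survive and the shorter one gets most of the mass (4/5 versus 1/5). One more observed `0` rules out the shorter world and leaves only the longer one.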

Comment author: [deleted] 21 December 2010 01:56:43PM 1 point

An AI with an uncomputable prior and an infinite amount of memory and time might be able to learn those things from its source code. But I think this is a bad way to approximate what a real superintelligence will be capable of.