Liso comments on Superintelligence 11: The treacherous turn - Less Wrong

Post author: KatjaGrace 25 November 2014 02:00AM




Comment author: William_S 25 November 2014 04:23:49AM 1 point

A (not very useful) AI filter: Suppose you have two AIs of the same architecture, one of which has friendly values and one of which has unfriendly values. You run them through a set of safety tests which you are confident only a friendly AI could pass, and both AIs pass them. You run them through a set of capability/intelligence tests and they both perform equally well. It seems that the unfriendly AI is in a slightly unfavourable position. First, it has to preserve the information content of its utility function or other value representation, in addition to the information content possessed by the friendly AI. The unfriendly AI would also need more preparation time at some stage in order to conceal its utility function. Thus, if you measure storage and time costs, whichever AI is smaller and faster is likely the friendly one. However, I don't think this directly yields anything good in practice, as the amount of extra information content could be very small, especially if the unfriendly AI simplifies its utility function. Also, you need a friendly AI...
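The filter described above can be sketched as a simple comparison: given candidates that have already passed identical safety and capability tests, pick whichever is smaller and faster. This is a toy illustration only, under the comment's own assumptions (that storage and latency are measurable and that the deceptive AI must carry and conceal extra information); the candidate records and field names here are hypothetical.

```python
# Toy sketch of the (admittedly impractical) filter from the comment above.
# Assumption: both candidates already passed the same safety and capability
# tests, so the only remaining signal is resource cost.

def likely_friendly(candidates):
    """Return the candidate with the smallest (storage, latency) cost.

    Rationale from the comment: an unfriendly AI must preserve its hidden
    utility function as extra information and spend extra time concealing
    it, so the smaller/faster candidate is *slightly* more likely to be
    the friendly one. As the comment notes, the difference could be tiny.
    """
    return min(candidates, key=lambda c: (c["storage_bytes"], c["answer_seconds"]))

# Hypothetical measurements for two candidates of the same architecture.
candidates = [
    {"name": "A", "storage_bytes": 10_000_000_128, "answer_seconds": 3.21},
    {"name": "B", "storage_bytes": 10_000_000_000, "answer_seconds": 3.17},
]
print(likely_friendly(candidates)["name"])  # B: smaller and faster
```

Note that the sketch also makes the comment's caveat concrete: the 128-byte gap between the candidates is negligible, so measurement noise would swamp the signal in practice.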

Comment author: Liso 27 November 2014 06:25:27AM * 1 point

It seems that the unfriendly AI is in a slightly unfavourable position. First, it has to preserve the information content of its utility function or other value representation, in addition to the information content possessed by the friendly AI.

There are two sorts of unsafe AI: one which cares and one which doesn't care.

The ignorant one is fastest: it only calculates the answer and doesn't care about anything else.

Both friend and enemy have to analyse additional things...

Comment author: wedrifid 27 November 2014 08:01:09AM -1 points

The ignorant one is fastest: it only calculates the answer and doesn't care about anything else.

Just don't accidentally give it a problem that is more complex than you expect. Only caring about solving such a problem means tiling the universe with computronium.