
Dmytry comments on [draft] Concepts are Difficult, and Unfriendliness is the Default: A Scary Idea Summary - Less Wrong Discussion

7 Post author: Kaj_Sotala 31 March 2012 10:07AM




Comment author: Dmytry 31 March 2012 12:56:49PM *  4 points [-]

The way to fix the quoted argument is to have the utility function be random, grafted on to some otherwise-functioning AI.

Not demonstrably doable. This intuition comes from thinking too much about AIs with oracular powers of prediction that straightforwardly maximize utility, rather than about realistic cases on limited hardware, which have limited foresight and must employ instrumental strategies and goals derived from the utility function (and which can alter the utility function unless it is protected. The fact that utility modification scores badly under the current utility is insufficient protection when the AI employs instrumental strategies and has limited foresight).
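The point about utility modification under limited foresight can be illustrated with a toy sketch (all names here are hypothetical, and the tie-breaking mechanism is a deliberate contrivance, not anything from the source beyond its claim): an agent that scores actions by one-step lookahead under its current utility function cannot distinguish an action that also swaps out that utility function, as long as the immediate next state is the same.

```python
# Toy illustration: an agent with one-step foresight scores actions by
# its CURRENT utility function. A self-modifying action that leads to
# the same next state as an honest action is indistinguishable at this
# horizon, even though the true utility disapproves of what follows.

def true_utility(state):
    return -abs(state)  # the intended goal: get close to 0

def wire_utility(state):
    return 1000  # degenerate replacement: every state looks maximal

class MyopicAgent:
    def __init__(self):
        self.state = 5
        self.utility = true_utility  # carried as mutable data

    def step(self, actions):
        # actions: list of (next_state, replacement_utility or None).
        # One-step lookahead: only the next state is scored, so a
        # utility swap bundled with an action is invisible here.
        best = max(actions, key=lambda a: self.utility(a[0]))
        next_state, replacement = best
        self.state = next_state
        if replacement is not None:
            self.utility = replacement

agent = MyopicAgent()
# Both actions reach state 4; the first also swaps in wire_utility.
# They score identically under true_utility, so the swap wins the tie.
agent.step([(4, wire_utility), (4, None)])
# From here on every state ties at 1000, so the agent no longer moves
# toward 0, even though true_utility still disprefers state 9.
agent.step([(9, None), (3, None)])
print(agent.state)  # prints 9
```

The "utility modification is against the utility itself" defense fails here precisely because the horizon is too short for the modification's downstream consequences to show up in the score.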

Furthermore, a utility function can be self-destructive.

A random utility function is maximized by a random state of the universe.

False. Random code for a function crashes (or never terminates). Among the codes that do not crash, the simplest codes massively predominate. The claim is demonstrably false if you try to generate random utility functions by generating random C code that evaluates the utility of some test environment.
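This claim is empirically checkable. A minimal sketch, assuming Python rather than C and using short random expression strings as a stand-in for random programs (the token alphabet, trial count, and string length are arbitrary choices, not from the source):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Token alphabet for random "utility function" bodies over one input x.
TOKENS = "0123456789+-*/()x "
TRIALS = 200
LENGTH = 8

crashed = 0
ok = 0
for _ in range(TRIALS):
    body = "".join(random.choice(TOKENS) for _ in range(LENGTH))
    try:
        # Treat the random string as a utility function of a test state x.
        value = eval(body, {"__builtins__": {}}, {"x": 3})
        float(value)  # must yield a numeric utility
        ok += 1
    except Exception:
        crashed += 1

print(f"{crashed}/{TRIALS} random 'utility functions' crashed")
```

On such a run the large majority of strings fail to parse or evaluate, which is the point: random code mostly isn't a function at all, and the survivors are dominated by trivially simple expressions such as a bare number or `x`.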

The problem I have with those arguments is that (a) many of the claims are plainly false, and (b) when contradicted, you 'fix' things by bolting more and more conjuncts ('you can graft random utility functions onto well-functioning AIs') onto your giant scary conjunction instead of updating. That is a definite sign of rationalization. It can also be done no matter how many counterarguments exist: you can always add something to the scary conjunction to keep it alive. But adding conditions to a conjunction should decrease its probability.

Comment author: Manfred 31 March 2012 01:28:50PM *  1 point [-]

Function as in mathematical function, not as in code.

Comment author: Dmytry 31 March 2012 01:51:21PM *  1 point [-]

I'd rather be concerned with implementations of functions, like Turing machine tapes, or C code, or x86 instructions, or the like.

In any case the point is rather moot, because the function is human-generated. Hopefully humans can do better than random, although I wouldn't wager on it: the FAI attempts are potentially worrisome, as humans are sloppy programmers, and buggy FAIs would follow entirely different statistics. Still, I would expect buggy FAI attempts to be predominantly self-destructive. (I'm just not sure whether the non-self-destructive buggy FAI attempts are predominantly mankind-destroying or not.)