
Manfred comments on [draft] Concepts are Difficult, and Unfriendliness is the Default: A Scary Idea Summary - Less Wrong Discussion

7 Post author: Kaj_Sotala 31 March 2012 10:07AM




Comment author: Manfred 31 March 2012 12:17:16PM  2 points

The way to fix the quoted argument is to have the utility function be random, grafted on to some otherwise-functioning AI.

A random utility function is maximized by a random state of the universe, and most arrangements of the universe don't contain humans. If the AI's utility function doesn't somehow get maximized by one of the very few states that contain humans, the AI is very clearly unfriendly, because it wants to replace humans with something else.
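As a toy illustration of that claim (a made-up model, not anything from the thread): if "universe states" are 256 discrete values and the utility function is a random score table over them, the argmax is effectively a uniform draw, so any rare property standing in for "contains humans" almost surely fails at the optimum.

```python
import random

random.seed(0)

# Hypothetical toy model: a "universe state" is one of 256 values, and a
# random utility function assigns each state a random score.
states = list(range(256))
utility = {s: random.random() for s in states}

# The utility-maximizing state is itself effectively a uniform draw.
best_state = max(states, key=utility.get)

# A property true of only a few states (a stand-in for "contains humans")
# almost surely fails to hold at the optimum.
humane_states = {3, 17, 42}
print(best_state, best_state in humane_states)
```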

Comment author: Dmytry 31 March 2012 12:56:49PM  4 points

> The way to fix the quoted argument is to have the utility function be random, grafted on to some otherwise-functioning AI.

Not demonstrably doable. That intuition comes from thinking too much about AIs with oracular powers of prediction, which straightforwardly maximize the utility, rather than about realistic cases: agents on limited hardware, which have limited foresight and must derive instrumental strategies and goals from the utility function (and which can end up altering the utility function unless it is protected; the fact that utility modification scores poorly under the current utility is insufficient protection when the agent relies on instrumental strategies and limited foresight).
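A minimal sketch of the oracle-versus-limited-foresight contrast (toy landscape and agents invented for illustration, not anyone's actual proposal): an agent that looks only one step ahead can get stuck on a local optimum, while the "oracular" maximizer jumps straight to the global one.

```python
# Toy value landscape over positions on a line (made-up numbers).
values = [1, 3, 2, 0, 10]

def oracle_maximizer():
    # Unrealistic idealization: directly pick the globally best state.
    return max(range(len(values)), key=values.__getitem__)

def limited_foresight_agent(start):
    # More realistic agent: move to an adjacent position only if it
    # looks better one step ahead; otherwise stop where it is.
    pos = start
    while True:
        neighbors = [p for p in (pos - 1, pos + 1) if 0 <= p < len(values)]
        best = max(neighbors, key=values.__getitem__)
        if values[best] <= values[pos]:
            return pos
        pos = best

print(oracle_maximizer())          # 4 (the global optimum)
print(limited_foresight_agent(0))  # 1 (stuck on a local optimum)
```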

Furthermore, a utility function can be self-destructive.

> A random utility function is maximized by a random state of the universe.

False. A random piece of code for a function crashes (or never terminates). Among the codes that do not crash, the simplest ones massively predominate. This is demonstrably so if you try to generate random utility functions by generating random C code that evaluates the utility of some test environment.
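A quick sanity check of the "random code crashes" claim, using random printable strings fed to Python's `compile()` as a crude stand-in for random C (an assumption of this sketch, not the commenter's setup): almost none of them even parse, let alone run.

```python
import random
import string

random.seed(1)

def random_source(length=20):
    # Random printable ASCII characters as a stand-in for "random code".
    return "".join(random.choice(string.printable[:95]) for _ in range(length))

counts = {"parses": 0, "fails": 0}
for _ in range(1000):
    try:
        compile(random_source(), "<random>", "exec")
        counts["parses"] += 1
    except (SyntaxError, ValueError):
        counts["fails"] += 1

# The overwhelming majority fail before they could ever run, and the few
# that do parse tend to be trivially simple expressions.
print(counts)
```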

The problem I have with those arguments is that (a) many of the claims are plainly false, and (b) when contradicted, you 'fix' things by bolting more conjuncts ('you can graft random utility functions onto well-functioning AIs') onto your giant scary conjunction instead of updating. That is a definite sign of rationalization. It can also be done forever, no matter how many counterarguments exist: you can always add something to the scary conjunction to make it come out. But adding conditions to a conjunction should decrease its probability.
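The last sentence is just the conjunction rule, P(A and B) <= P(A). With some made-up independent probabilities for each conjunct, the running product can only shrink:

```python
# Made-up probabilities for each added condition in the conjunction.
claims = [0.9, 0.8, 0.7, 0.6]

p = 1.0
running = []
for c in claims:
    p *= c  # each extra conjunct multiplies by a factor <= 1
    running.append(round(p, 4))

print(running)  # [0.9, 0.72, 0.504, 0.3024] -- monotonically decreasing
```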

Comment author: Manfred 31 March 2012 01:28:50PM  1 point

Function as in function.

Comment author: Dmytry 31 March 2012 01:51:21PM  1 point

I'd rather be concerned with implementations of functions: Turing machine tapes, C code, x86 instructions, or the like.

In any case the point is rather moot, because the function is human-generated. Hopefully humans can do better than random, though I wouldn't wager on it: FAI attempts are potentially worrisome because humans are sloppy programmers, and buggy FAIs would follow entirely different statistics. Still, I would expect buggy FAIs to be predominantly self-destructive. (I'm just not sure whether the non-self-destructive buggy FAI attempts are predominantly mankind-destroying or not.)

Comment author: David_Gerard 01 April 2012 08:16:21PM  -1 points

In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6.

“What are you doing?”, asked Minsky.

“I am training a randomly wired neural net to play Tic-Tac-Toe” Sussman replied.

“Why is the net wired randomly?”, asked Minsky.

“I do not want it to have any preconceptions of how to play”, Sussman said.

Minsky then shut his eyes.

“Why do you close your eyes?”, Sussman asked his teacher.

“So that the room will be empty.”

At that moment, Sussman was enlightened.

-- AI Koans