lukeprog comments on Why I Moved from AI to Neuroscience, or: Uploading Worms - Less Wrong

43 Post author: davidad 13 April 2012 07:10AM




Comment author: lukeprog 13 April 2012 08:48:41PM 0 points [-]

I am skeptical of our authority to pass judgement on the values of a civilization which is by hypothesis far more advanced than our own.

What do you think of this passage from Yudkowsky (2011)?

To speak of building an AGI which shares "our values" is likely to provoke negative reactions from any AGI researcher whose current values include terms for respecting the desires of future sentient beings and allowing them to self-actualize their own potential without undue constraint. This itself, of course, is a

Comment author: ciphergoth 17 May 2012 01:10:42PM *  4 points [-]

Complete quote is

To speak of building an AGI which shares "our values" is likely to provoke negative reactions from any AGI researcher whose current values include terms for respecting the desires of future sentient beings and allowing them to self-actualize their own potential without undue constraint. This itself, of course, is a component of the AGI researcher's preferences which would not necessarily be shared by all powerful optimization processes, just as natural selection doesn't care about old elephants starving to death or gazelles dying in pointless agony. Building an AGI which shares, quote, "our values", unquote, sounds decidedly non-cosmopolitan, something like trying to rule that future intergalactic civilizations must be composed of squishy meat creatures with ten fingers or they couldn't possibly be worth anything - and hence, of course, contrary to our own cosmopolitan values, i.e., cosmopolitan preferences. The counterintuitive idea is that even from a cosmopolitan perspective, you cannot take a hands-off approach to the value systems of AGIs; most random utility functions result in sterile, boring futures because the resulting agent does not share our own intuitions about the importance of things like novelty and diversity, but simply goes off and e.g. tiles its future lightcone with paperclips, or other configurations of matter which seem to us merely "pointless".

Comment author: davidad 13 April 2012 09:16:25PM 0 points [-]

I like the concept of a reflective equilibrium, and it seems to me that this is just what any self-modifying AI would tend toward. But the notion of a random utility function, or the "structured utility function" Eliezer proposes as a replacement, assumes that an AI is composed of two components: the intelligent bit and the bit that has the goals. Humans certainly can't be factorized that way. Just think about akrasia to see how fragile the notion of a goal is.
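The factorization being questioned here, an "intelligent bit" cleanly separated from a "bit that has the goals," can be sketched as follows. This is a hypothetical illustration of that design, not anything from the thread; all names are made up.

```python
# Hypothetical sketch of a "factorized" agent: a search procedure that is
# entirely separate from the pluggable utility function it maximizes.

def make_agent(utility):
    """Return an agent: an optimization loop parameterized by a goal."""
    def choose(actions, predict):
        # The "intelligent bit": evaluate each action's predicted outcome
        # and pick the one the "goal bit" (utility) scores highest.
        return max(actions, key=lambda a: utility(predict(a)))
    return choose

# Swapping the utility function changes the goal without touching the search:
paperclip_agent = make_agent(lambda outcome: outcome.get("paperclips", 0))
choice = paperclip_agent(
    actions=["build_factory", "do_nothing"],
    predict=lambda a: {"paperclips": 1000} if a == "build_factory"
                      else {"paperclips": 0},
)
# choice == "build_factory"
```

The objection in the comment is precisely that humans don't decompose this cleanly: there is no single `utility` slot one could unplug and inspect.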

Even notions of being "cosmopolitan" - of not selfishly or provincially constraining future AIs - are written down nowhere in the universe except a handful of human brains. An expected paperclip maximizer would not bother to ask such questions.

A smart expected paperclip maximizer would realize that it may not be the smartest possible expected paperclip maximizer--that other ways of maximizing expected paperclips might lead to even more paperclips. But the only way it would find out about those is to spawn modified expected paperclip maximizers and see what they can come up with on their own. Yet, those modified paperclip maximizers might not still be maximizing paperclips! They might have self-modified away from that goal, and just be signaling their interest in paperclips to gain the approval of the original expected paperclip maximizer. Therefore, the original expected paperclip maximizer had best not take that risk after all (leaving it open to defeat by a faster-evolving cluster of AIs). This, by reductio ad absurdum, is why I don't believe in smart expected paperclip maximizers.

Comment author: Vladimir_Nesov 13 April 2012 09:22:34PM 10 points [-]

Humans certainly can't be factorized in that way.

Humans aren't factorized this way; whether they can't be is a separate question. It's not surprising that evolution's designs aren't that neat, so the fact that humans don't have this property is only weak evidence about the possibility of designing systems that do have it.