JGWeissman comments on The Design Space of Minds-In-General - Less Wrong

19 Post author: Eliezer_Yudkowsky 25 June 2008 06:37AM

Comments (82)

Comment author: wnoise 07 January 2011 07:11:47PM * 2 points

No, it takes a lot of work to specify paperclips, and thus it's not as easy as just superintelligence.

Reference class confusion. "Paperclipper" refers to any "universe tiler", not just one that tiles with paperclips. Specifying paperclips in particular is hard. If you don't care about exactly what gets tiled, it's much easier.

goal stability

For well-understood goals, that's easy. Just hardcode the goal. It's making goals that can change in a useful way that's hard. Part of the hardness of FAI is that we don't understand friendliness, we don't know what humans want, and we don't know what's good for humans, and any simplistic fixed goal will cut off our evolution.

the most obvious is that paperclips have much higher Kolmogorov complexity than human values

I don't see how you could possibly believe that, except out of wishful thinking. Human values are contingent on our entire evolutionary history. My parochial values are contingent upon cultural history and my own personal history. Our values are not universal. Different types of creature will develop radically different values with only small points of contact and agreement.

Comment author: JGWeissman 07 January 2011 07:17:45PM 2 points

goal stability

For well-understood goals, that's easy. Just hardcode the goal.

Hardcoding is not necessarily stable in programs that can edit their own source code.

Comment author: wnoise 16 January 2011 11:38:41PM 0 points

Really? Isn't editing one's goal directly contrary to one's goal? If an AI self-edits in such a way that its goal changes, it will predictably no longer be working towards that goal, and will thus not consider it a good idea to edit its goal.

Comment author: Vaniver 16 January 2011 11:51:25PM * 0 points

It depends on how it decides whether or not changes are a good thing. If it is trying out two utility functions, Ub for utility before and Ua for utility after, you need to be careful to ensure it doesn't say "hey, Ua(x) > Ub(x), so I can make myself better off by switching to Ua!".

Ensuring that doesn't happen is not simple, because it requires stability throughout everything. There can't be a section that decides to try being goalless, or go about resolving the goal in a different way (which is troublesome if you want it to cleverly use instrumental goals).

[edit] To be clearer, you need to not just have the goals be fixed and well-understood, but every other part of the program also needs to have a fixed and well-understood relationship to the goals (and a fixed and well-understood sense of understanding, and ...). Most attempts to rewrite source code are not that well-planned.
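
A minimal Python sketch of the two ways of judging a goal edit discussed in this thread, offered as a toy illustration only. Every name here (`naive_prefers_switch`, `stable_prefers_switch`, `favored_outcome`, the two outcomes, and the utility numbers) is a hypothetical assumption made up for the example; nothing in it comes from an actual AI design. The naive check is the "Ua(x) > Ub(x)" comparison, which reads numbers off two different scales. The stable check scores what the successor would do using only the current utility function Ub.

```python
from typing import Callable, Iterable

# A utility function maps an outcome label to a score (toy assumption).
Utility = Callable[[str], float]


def favored_outcome(u: Utility, outcomes: Iterable[str]) -> str:
    """Outcome that an agent maximizing u would steer toward."""
    return max(outcomes, key=u)


def naive_prefers_switch(u_before: Utility, u_after: Utility, x: str) -> bool:
    """Flawed check: compares Ua(x) with Ub(x) as if they shared a scale."""
    return u_after(x) > u_before(x)


def stable_prefers_switch(
    u_before: Utility, u_after: Utility, outcomes: Iterable[str]
) -> bool:
    """Judge the edit by the current goal only.

    Predict what each version of the agent would bring about, then score
    both predictions with u_before.
    """
    outcomes = list(outcomes)
    kept = favored_outcome(u_before, outcomes)      # result if the goal is kept
    switched = favored_outcome(u_after, outcomes)   # result if the goal is edited
    return u_before(switched) > u_before(kept)


# Hypothetical toy numbers: Ub cares about paperclips, Ua mostly about staples.
OUTCOMES = ["paperclips", "staples"]
UB_TABLE = {"paperclips": 1.0, "staples": 0.0}
UA_TABLE = {"paperclips": 2.0, "staples": 10.0}


def u_b(x: str) -> float:
    return UB_TABLE[x]


def u_a(x: str) -> float:
    return UA_TABLE[x]


naive = naive_prefers_switch(u_b, u_a, "paperclips")   # 2.0 > 1.0
stable = stable_prefers_switch(u_b, u_a, OUTCOMES)     # is Ub(staples) > Ub(paperclips)?
print(naive, stable)  # True False
```

With these toy numbers the naive check endorses switching while the Ub-only check rejects it. The sketch covers a single decision point; the harder requirement raised above is that every part of a self-editing program that can influence a rewrite would need to use the second kind of check.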