shminux comments on By Which It May Be Judged - Less Wrong

35 Post author: Eliezer_Yudkowsky 10 December 2012 04:26AM


Comment author: shminux 11 December 2012 11:50:08PM 2 points

The other side of this is that I would expect my brain to NOTICE its actual goals. If my goal is to make paperclips, I will think "I should do this because it makes paperclips", instead of "I should do this because it makes people happy".

Secondary goals often feel like primary ones. Breathing and quenching thirst are means of achieving the primary goal of survival (and procreation), yet they themselves feel primary. Similarly, a paperclip maximizer may feel compelled to harvest iron without any awareness that it wants to do so in order to produce paperclips.

Comment author: Nornagest 12 December 2012 12:37:06AM 5 points

Survival and procreation aren't primary goals in any direct sense. We have urges that have been selected for because they contribute to inclusive genetic fitness, but at the implementation level they don't seem to be evaluated by their contributions to some sort of unitary probability-of-survival metric; similarly, some actions that do contribute greatly to inclusive genetic fitness (like donating eggs or sperm) are quite rare in practice and go almost wholly unrewarded by our biology. Because of this architecture, we end up with situations where we sate our psychological needs at the expense of the factors that originally selected for them: witness birth control or artificial sweeteners. This is basically the same point Eliezer was making here.

It might be meaningful to treat supergoals as intentional if we were discussing an AI, since in that case there would be a unifying intent behind each fitness metric that actually gets implemented, but even in that case I'd say it's more accurate to talk about the supergoal as a property not of the AI's mind but of its implementors. Humans, of course, don't have that excuse.

Comment author: shminux 12 December 2012 12:49:21AM 0 points

All good points. I was mostly thinking about an evolved paperclip maximizer, which may or may not be a result of a fooming paperclip-maximizing AI.

Comment author: Eugine_Nier 13 December 2012 04:46:24AM 1 point

An evolved agent wouldn't evolve to maximize paper clips.

Comment author: MugaSofer 14 December 2012 12:18:46PM 1 point

It could if the environment rewarded paperclips. Admittedly this would require an artificial environment, but that's hardly impossible.

</nitpick>

Comment author: [deleted] 18 December 2012 09:22:55PM 0 points

Evolved creatures as we know them (at least the ones with complex brains) are reward-center maximizers, which implicitly correlates with being offspring maximizers. (Non-brainy organisms are probably closer to pure offspring maximizers.)

Comment author: handoflixue 14 December 2012 06:53:07PM 3 points

Bull! I'm quite aware of why I eat, breathe, and drink. Why in the world would a paperclip maximizer not be aware of this?

Unless you assume Paperclippers are just rock-bottom stupid, I'd also expect them to eventually notice the correlation between mining iron, smelting it, and shaping it into a weird semi-spiral design... and the sudden rise in the number of paperclips in the world.

Comment author: shminux 14 December 2012 07:36:16PM 1 point

I'm not sure that awareness is needed for paperclip maximizing. For example, one might call fire a very good CO2 maximizer. Actually, I'm not even sure the word "awareness" applies to non-human-like optimizers.

Comment author: handoflixue 14 December 2012 10:23:40PM 0 points

"If we reprogrammed you to count paperclips instead"

This is a conversation about changing my core utility function / goals, and what you are discussing would be far more of an architectural change. I meant that, within my architecture (and, I assume, generalizing to most human architectures and most goals), we are, on some level, aware of the actual goal. There are occasional failure states (Alicorn mentioned that iron deficiencies register as a craving for ice o.o), but these tend to tie in to low-level failures, not high-order goals like "make a paperclip", and STILL we tend to manage to identify them and learn how to achieve our actual goals.