handoflixue comments on By Which It May Be Judged - LessWrong

35 Post author: Eliezer_Yudkowsky 10 December 2012 04:26AM

Comments (934)

Comment author: shminux 11 December 2012 11:50:08PM 2 points

The other side of this is that I would expect my brain to NOTICE its actual goals. If my goal is to make paperclips, I will think "I should do this because it makes paperclips", instead of "I should do this because it makes people happy".

Secondary goals often feel like primary ones. Breathing and quenching thirst are means of achieving the primary goal of survival (and procreation), yet they themselves feel primary. Similarly, a paperclip maximizer may feel compelled to harvest iron without any awareness that it wants to do so in order to produce paperclips.

Comment author: handoflixue 14 December 2012 06:53:07PM 3 points

Bull! I'm quite aware of why I eat, breathe, and drink. Why in the world would a paperclip maximizer not be aware of this?

Unless you assume Paperclippers are just rock-bottom stupid, I'd also expect them to eventually notice the correlation between mining iron, smelting it, and shaping it into a weird semi-spiral design... and the sudden rise in the number of paperclips in the world.

Comment author: shminux 14 December 2012 07:36:16PM * 1 point

I'm not sure that awareness is needed for paperclip maximizing. For example, one might call fire a very good CO2 maximizer. Actually, I'm not even sure you can apply the word awareness to non-human-like optimizers.

Comment author: handoflixue 14 December 2012 10:23:40PM 0 points

"If we reprogrammed you to count paperclips instead"

This is a conversation about changing my core utility function / goals, and what you are discussing would be far more of an architectural change. I meant, within my architecture (and, I assume, generalizing to most human architectures and most goals), we are, on some level, aware of the actual goal. There are occasional failure states (Alicorn mentioned iron deficiencies register as a craving for ice o.o), but these tend to tie in to low-level failures, not high-order goals like "make a paperclip", and STILL we tend to manage to identify these and learn how to achieve our actual goals.