handoflixue comments on By Which It May Be Judged - LessWrong
Secondary goals often feel like primary ones. Breathing and quenching thirst are means of achieving the primary goal of survival (and procreation), yet they themselves feel primary. Similarly, a paperclip maximizer may feel compelled to harvest iron without any awareness that it wants to do so in order to produce paperclips.
Bull! I'm quite aware of why I eat, breathe, and drink. Why in the world would a paperclip maximizer not be aware of this?
Unless you assume Paperclippers are just rock-bottom stupid, I'd also expect them to eventually notice the correlation between mining iron, smelting it, and shaping it into a weird semi-spiral design... and the sudden rise in the number of paperclips in the world.
I'm not sure that awareness is needed for paperclip maximizing. For example, one might call fire a very good CO2 maximizer. Actually, I'm not even sure you can apply the word awareness to non-human-like optimizers.
"If we reprogrammed you to count paperclips instead"
This is a conversation about changing my core utility function / goals, and what you are discussing would be far more of an architectural change. I meant, within my architecture (and, I assume, generalizing to most human architectures and most goals), we are, on some level, aware of the actual goal. There are occasional failure states (Alicorn mentioned iron deficiencies register as a craving for ice o.o), but these tend to tie in to low-level failures, not high-order goals like "make a paperclip", and STILL we tend to manage to identify these and learn how to achieve our actual goals.