wedrifid comments on To what degree do we have goals? - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (52)
The post cites being upset or angry as evidence of certain apparent preferences being closer to genuine preferences, but a paperclip maximizer wouldn't get upset or angry if a supernova destroyed some of its factories, for example. I think being upset or angry when one's consciously held goals have been frustrated is probably just a signaling mechanism, and not evidence of anything beyond the fact that those goals are consciously held (or "approved" or "endorsed").
I probably wouldn't either. It sounds like the sort of amortized risk I would have accounted for when spreading the factories out across thousands of star systems. The anger would come in only when the destruction was caused by another optimising entity, and more specifically by an entity that I have modelled as 'agenty' rather than one that I have intuitively objectified.