CCC comments on By Which It May Be Judged - LessWrong
Comments (934)
If a single agent has conflicting desires (each of which it values equally) then it should work to alter its desires, so that it ends up with consistent desires that are most likely to be fulfilled.
To your latter question though, I think that what you're asking is "If two agents have utility functions that clash, which one is to be preferred?" Is it that all we can say is "Whichever one has the most resources and the most optimisation power/intelligence will be able to put its goals into action and prevent the other one from fully acting upon its own"?
Well, I think the point Eliezer has made a few times before is that there is no ultimate morality written into the universe that will compel every agent to act it out. You can't reason with an agent which has a totally different utility function. The only reason that we can argue with humans is that they're only human, and thus we share many desires. Figuring out morality isn't going to give you the power to talk Clippy down from killing you for more paperclips. You aren't going to show that human 'morality', which actualises what humans prefer, is any more preferable than 'Clippy' ethics. He is just going to kill you.
So, let's now figure out exactly what we want most (what we would want if we had our own CEV) and then go out and do it. Nobody else is gonna do it for us.
EDIT: First sentence 'conflicting desires'; I meant to say 'in principle unresolvable' like 'x' and '~x'. Of course, for most situations, you have multiple desires that clash, and you just have to perform utility calculations to figure out what to do.
If you know (or correctly guess) the agent's utility function, and are able to communicate with the agent, then it may well be possible to reason with it.
Consider this situation: I am captured by a Paperclipper, which wishes to extract the iron from my blood and use it to make more paperclips (incidentally killing me in the process). I can attempt to escape by promising to send the Paperclipper a quantity of iron - substantially more than can be found in my blood, and easier to extract - as soon as I am safe. As long as I can convince Clippy that I will follow through on my promise, I have a chance of living.
I can't talk Clippy into adopting my own morality. But I can talk Clippy into performing individual actions that I would prefer Clippy to do (or into refraining from other actions) as long as I ensure that Clippy can get more paperclips by doing what I ask than by not doing what I ask.
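To make the bargaining condition concrete, here is a minimal sketch of the comparison Clippy would be making. Everything in it is an assumption for illustration: the function name `clippy_accepts`, the idea that Clippy weighs the promise by a single `trust` probability, and all the numbers are made up, not anything a real Paperclipper is known to do.

```python
def clippy_accepts(iron_in_blood_g: float, promised_iron_g: float, trust: float) -> bool:
    """Hypothetical decision rule: release the captive only if the expected
    iron from the promise exceeds the iron available by killing them.

    trust is Clippy's estimated probability (0 to 1) that the promise is kept.
    """
    expected_from_release = trust * promised_iron_g
    return expected_from_release > iron_in_blood_g


# Illustrative numbers only.
# Believable promise: 0.5 * 1000 g = 500 g expected, versus 4 g from the blood.
believable = clippy_accepts(iron_in_blood_g=4.0, promised_iron_g=1000.0, trust=0.5)

# Barely credible promise: 0.001 * 1000 g = 1 g expected, versus 4 g from the blood.
not_believable = clippy_accepts(iron_in_blood_g=4.0, promised_iron_g=1000.0, trust=0.001)
```

Note that my own desire to live appears nowhere in the function. The only inputs are things Clippy cares about, which is why the offer works only if I can make the promise credible.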
Of course - my mistake. I meant that you can't alter an agent's desires by reason alone. You can't appeal to desires you have. You can only appeal to its desires. So, when he's going to turn the iron in your blood into paperclips, and you want to live, you can't try "But I want to live a long and happy life!". If Clippy hasn't got empathy, and you have nothing to offer that will help fulfill his own desires, then there's nothing to be done, other than try to physically stop or kill him.
Maybe you'd be happier if you put him on a planet of his own, where a machine constantly destroyed paperclips, and he was happy making new ones. My point is just that, if you do decide to make him happy, it's not the optimal decision relative to a universal preference, or morality. It's just the optimal decision relative to your desires. Is that 'right'? Yes. That's what we refer to, when we say 'right'.