DanArmak comments on Holden's Objection 1: Friendliness is dangerous - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (428)
Lots of people honestly wish for the literal end of the universe to come, because they believe in an afterlife/prophecy/etc.
You might say they would change their minds given better or more knowledge (e.g. that there is no afterlife and the prophecy was false/fake/wrong). But such people are often exposed to such arguments and reject them; and they make great efforts to preserve their current beliefs in the face of evidence. And they say these beliefs are very important to them.
There may well be methods of "converting" them anyway, but how are these methods ethically or practically different from "forcibly changing their minds" or their values? And if you're OK with forcibly changing their minds, why do you think that's ethically better than just ignoring them and building a partial-CEV that only extrapolates your own wishes and those of people similar to yourself?
I (and CEV) do not propose changing their minds or their values. What happens is that their current values (as modeled within FAI) get corrected in the presence of truer knowledge and lots of intelligence, and these corrected values are used for guiding the FAI.
If someone's mind & values are so closed as to be unextrapolateable - completely incompatible with truth - then I'm ok with ignoring these particular persons. But I don't believe there are actually any such people.
So the future is built to optimize different values. And their original values aren't changed. Wouldn't they suffer living in such a future?
Even if they do, it will be the best possible thing for them, according to their own (extrapolated) values.
Who cares about their extrapolated values? Not them (they keep their original values). Not others (who have different actual and extrapolated values). Then why extrapolate their values at all? You could very easily build a much happier life for them just by allocating some resources (land, computronium, whatever) and going by their current values.
Well... ok, lets assume a happy life is their single terminal value. Then by definition of their extrapolated values, you couldn't build a happier life for them if you did anything else other than follow their extrapolated values!
This is completely wrong. People are happy, by definition, if their actual values are fulfilled; not if some conflicting extrapolated values are fulfilled. CEV was supposed to get around this by proposing (without saying how) that people would actually grow to become smarter etc. and thereby modify their actual values to match the extrapolated ones, and then they'd be happy in a universe optimized for the extrapolated (now actual) values. But you say you don't want to change other people's values to match the extrapolation. That makes CEV a very bad idea - most people will be miserable, probably including you!
Yes, but values depend on knowledge. There was an example by EY, I forgot where, in which someone values a blue box because they think the blue box contains a diamond. But if they're wrong, and it's actually the red box that contains the diamond, then what would actually make them happy - giving them the blue or the red box? And would you say giving them the red box is making them suffer?
Well, perhaps yes. Therefore, a good extrapolated wish would include constraints on the speed of its own fulfillment: allow the person to take the blue box, then convince them that it is the red box they actually want, and only then present it. But in cases where this is impossible (example: blue box contains horrible violent death), then it is wrong to say that following the extrapolated values (withholding the blue box) is making the person suffer. Following their extrapolated values is the only way to allow them to have a happy life.
What you are saying indeed applies only "in cases where this is impossible". I further suggest that these are extremely rare cases when a superhumanly-powerful AI is in charge. If the blue box contains horrible violent death, the AI would build a new (third) box, put a diamond inside, paint it blue, and give it to the person.
It the AI could do this, then this is exactly what the extrapolated values would tell it to do. [Assuming some natural constraints on the original values].