Two questions about CEV that worry me

cousin_it

Taken from some old comments of mine that never did get a satisfactory answer.

1) One of the justifications for CEV was that extrapolating from an American in the 21st century and from Archimedes of Syracuse should give similar results. This seems to assume that change in human values over time is mostly "progress" rather than drift. Do we have any evidence for that, except saying that our modern values are "good" according to themselves, so whatever historical process led to them must have been "progress"?

2) How can anyone sincerely want to build an AI that fulfills anything except their own current, personal volition? If Eliezer wants the AI to look at humanity and infer its best wishes for the future, why can't he task it with looking at himself and inferring his best idea to fulfill humanity's wishes? Why must this particular thing be spelled out in a document like CEV and not left to the mysterious magic of "intelligence", and what other such things are there?

If we find out that a value only exists because of a historical accident plus status quo bias, and that any society where it wasn't the status quo would reject it when it was explained to them, then we should reject that value.

This standard allows us to throw away all values not directly linked to inclusive genetic fitness, and maybe even those that are. There's no objective morality.

Huh? We must have a difference of definitions somewhere, because that's not what I think my argument says at all.

The fact that my volition might just consist of a pointer to CEV does not seem like much of an argument for choosing it over CEV, given that my volition also includes lots of poorly-understood other stuff, which I won't get a chance to inspect if there's no extrapolation, and which is more likely to make things worse than to make them better.

This argument works just as well for defending concrete wishes ("volcano lair with catgirls") over CEV.

No, it doesn't. This was a counterargument to the could-be-a-pointer argument, not a root-level argument; and if you expand it out, it actually favors CEV over concrete wishes, not the reverse.

The could-be-a-pointer argument is that, since one person's volition might just be the desire to have CEV implemented, that one person's volition is at least as good as CEV. But this is wrong, because that person's volition will also include lots of other stuff, which is substantially random, so at least some of it will be bad. So you need to filter (extrapolate) those desires to get only the good ones. One way to filter them is to throw out everything except a few concrete wishes, but that is not the best possible filter, because it will throw out many aspects of volition that are good (and probably also necessary for preventing disastrous misinterpretations of the concrete wishes).
