I think one of the problems with CEV is that we still have to say whose values we want to extrapolate, and that choice almost defines the outcome.
For example, the CEV of the values of C. elegans is not equal to human values.
The main problem here is how we define what it is to be human. This is where most value groups differ. Will we include unborn children? Neanderthals?
I mean that different value systems have different definitions of human beings, and "human CEV" is different for a Green Party member who includes all animals in "humans" than for a neo-something who may exclude some people from the definition of humans. So an AI could correctly calculate CEV, but for the wrong group.
I agree, with a twist: I think CEV will mostly be uninteresting, because it will have to integrate so many conflicting points of view that it will mostly come up with "do nothing at all".
Brainstorming a bit, I would say that value alignment is impossible unless an AI becomes an active part of the moral landscape: instead of being a slave to a hypothetical human uber-value, it will need to interact heavily with humans and force them to act so as to reveal their true preferences or collaborate to generate a utopia.
If it's worth saying, but not worth its own post, then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should start on Monday, and end on Sunday.
4. Unflag the two options "Notify me of new top level comments on this article" and "