nshepperd comments on Moral Error and Moral Disagreement - Less Wrong

Post author: Eliezer_Yudkowsky 10 August 2008 11:32PM


Comment author: cousin_it 18 November 2010 07:31:46AM 12 points

unilaterally label friendly

I love your turn of phrase, it has a Cold War ring to it.

The question of why anyone would ever sincerely want to build an AI that extrapolates anything other than their personal volition is still unclear to me. It hinges on the definition of "sincerely want". If Eliezer can task the AI with looking at humanity and inferring its best wishes, why can't he task it with looking at himself and inferring his best idea of how to infer humanity's wishes? How do we determine, in general, which things a document like CEV must spell out, and which things can or should be left to the mysterious magic of "intelligence"?

Comment author: nshepperd 20 November 2010 11:43:14AM 2 points

A CEV optimizer is less likely to do horrific things while its ability to extrapolate volition is "weak". If it can't extrapolate far from the unwise preferences people hold now, given the resources it has, it will notice that the extrapolated volition (EV) varies a lot across the population, and take no action. Likewise, if the extrapolation system has a bug in it, this will hopefully show up as incoherence. So coherence acts as a kind of "sanity test".

That's one reason that leaps to mind anyway.

Of course, the other reason is that there is no evidence any single human is Friendly, so cooperation would be impossible among EV-maximizing AI researchers. As such, an AI that maximizes EV<Eliezer> is out of the question already. CEV<humanity> is the next best thing.