From the CEV paper:
Different classes of satisfactory initial definitions may fall into different self-consistent attractors for optimal definitions of volition. Or they may all converge to essentially the same endpoint. A CEV might survey the “space” of initial dynamics and self-consistent final dynamics, looking to see if one alternative obviously stands out as best; extrapolating the opinions humane philosophers might have of that space. But if there are multiple, self-consistent, satisficing endpoints, each of them optimal under their own criterion—okay. Whatever. As long as we end up in a Nice Place to Live.
And yes, the programmers’ choices may have a huge impact on the ultimate destiny of the human species. Or a bird, chirping in the programmers’ window. Or a science fiction novel, or a few lines spoken by a character in an anime, or a webcomic. Life is chaotic, small things have large effects. So it goes.
Which you could sum up as: "CEV doesn't get around that problem, it treats it as irrelevant - the point isn't to find a particular good solution that's unique and totally non-arbitrary, it's just to find even one of the good solutions. If arbitrary reasons shift us from Good World #4 to Good World #36, who cares, as long as they both really are good worlds."
My recent paper touches upon preference aggregation a bit in section 8, BTW, though it's mostly focused on the question of figuring out a single individual's values. (Not sure how relevant that is for your comments, but thought maybe a little.)
Thanks, I'll look into it.
(And all my ranting still didn't address the fundamental difficulty: there is no rational way to choose among different methods of aggregating the values held by multiple agents, such as Rawlsianism and utilitarianism.)