Very basic question on CEV. Supposing humans have fundamentally disagreeing 'reflective equilibria,' does CEV attempt to find a game-theoretic equilibrium (presumably one that all humans would 'reflectively' agree to)?
Right, it depends on which extrapolation process is used. One of the open problems of CEV is the question of which extrapolation process to use, and why.
I've been working on metaethics/CEV research for a couple months now (publishing mostly prerequisite material) and figured I'd share some of the sources I've been using.
CEV sources.
Motivation. CEV extrapolates human motivations/desires/values/volition. As such, it will help to understand how human motivation works.
Extrapolation. Is it plausible to think that some kind of extrapolation of human motivations will converge on a single motivational set? How would extrapolation work, exactly?
Metaethics. Should we use CEV, or something else? What does 'should' mean?
Building the utility function. How can a seed AI be built? How can it read what to value?
Preserving the utility function. How can the motivations we put into a superintelligence be preserved over time and self-modification?
Reflective decision theory. Current decision theories tell us little about software agents that make decisions to modify their own decision-making mechanisms.
Additional suggestions welcome. I'll try to keep this page up-to-date.