A recent post at my blog may be of interest to LW. It is a high-level discussion of what a precisely defined scheme for value extrapolation might look like. I wrote most of the essay while visiting FHI.
The basic idea is that we can define extrapolated values by just taking an emulation of a human, putting it in a hypothetical environment with access to powerful resources, and then adopting whatever values it eventually decides on. You might want some philosophical insight before launching into such a definition, but since we are currently laboring under the threat of catastrophe, it seems that there is virtue in spending our effort on avoiding death and delegating whatever philosophical work we can to someone on a more relaxed schedule.
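To make the shape of that definition concrete, here is a minimal toy sketch in Python. Everything in it (`ToyEmulation`, `deliberate`, `settled_values`, the "patience" budget) is an illustrative stand-in of my own, not anything from the essay: the actual proposal concerns a hypothetical emulation with effectively unbounded resources, which a few lines of code obviously cannot model. The sketch only shows the structure of the definition: the extrapolated values are whatever the deliberation process eventually settles on.

```python
class ToyEmulation:
    """Stand-in for a human emulation that deliberates and eventually
    commits to a value specification. Purely illustrative."""

    def __init__(self):
        self.steps_taken = 0
        self.values = None

    def deliberate(self, resources):
        # In the real scheme this would be open-ended reflection drawing on
        # the hypothetical environment's resources; here it is a placeholder.
        self.steps_taken += 1
        if self.steps_taken >= resources["patience"]:
            self.values = "whatever the emulation finally endorses"

    def settled_values(self):
        return self.values


def extrapolated_values(emulation, resources, max_steps=1000):
    """Extrapolated values := whatever the emulation eventually decides on,
    given ample time and resources to reflect."""
    for _ in range(max_steps):
        if emulation.settled_values() is not None:
            return emulation.settled_values()
        emulation.deliberate(resources)
    raise RuntimeError("Emulation never settled on a value specification")


print(extrapolated_values(ToyEmulation(), {"patience": 3}))
```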
You wouldn't want to run an AI with the values I lay out, but at least they are pinned down precisely. That lets us articulate objections relatively concretely, and hopefully begin to understand and address the difficulties.
(Posted at the request of cousin_it.)
I disagree; reading Paul's description made it clear to me how superficial it is to want to solve a problem by creating an army of uploads to do it for you. You may as well just try to solve the problem here and now, rather than hoping to outsource it to a bunch of nonexistent human-simulations running on nonexistent hardware. The only reason to consider such a baroque way of solving a problem is if you expect to be very pressed for time and yet also to have access to superdupercomputing power. You know: the world is hurtling towards the singularity, no one has crossed the finish line but many people are getting close, your FAI research organization manages to get hold of a few petaflops on which to run a truncated AIXI problem-solver... and now you can finally go dig up that scrap of paper on which your team wrote down, years before, the perfectly optimal wish: "I want you, FAI-precursor, to do what the ethically stabilized members of our team would do, if they had hundreds of years to think about it, and if they...", etcetera.
It's a logically possible scenario, but is it remotely likely? This absolutely should not be the paradigm for a successful implementation of FAI or CEV. It's just a wacky contingency that you might want to spend a little time thinking about. The plan should be that un-uploaded people will figure out what to do. They will surely make intensive use of computers, and there may be some big final calculation in which the schematics of human genetic, neural and cultural architecture are the inputs to a reflective optimization process; but you shouldn't imagine that, like some bunch of Greg Egan characters, the researchers are going to successfully upload themselves and then figure out the logistics and the mathematics of a successful CEV process. It's like deciding to fix global warming by building a city on the moon that will be devoted to the task of solving global warming.
Could you express your objection more precisely than "it's wacky"?