How will the AI behave while it is still gathering information and computing the CEV (or any other meta-level solution)? For example, in the case of CEV, won't it pick the method that is most efficient, not the one that is most right, to scan brains, compute the CEV, etc.?
Do we (need to) know what mechanism or knowledge the AI would need to approximate ethical behavior when it still doesn't know exactly what friendliness means?
An excellent point.
CEV is our current proposal for what ought to be done once AGI is up and running. Many people have had bad feelings about this. While at the Singularity Institute, I decided to write a text to discuss CEV, from what it is for, to how likely it is to achieve its goals, to how much fine-grained detail needs to be added before it is an actual theory.
Here you will find a draft of the topics I'll be discussing in that text. The purpose of showing it is for you to look at the topics, spot something that is missing, and write a comment saying: "Hey, you forgot this problem, which, summarised, is bla bla bla bla" and also "be sure to mention paper X when discussing topic 2.a.i."
Please take a few minutes to help me add better discussions.
Do not worry about pointing to previous Less Wrong posts about it; I have them all.