I don't understand how you can have preferences that you use to decide what ought to count as a "moral justification" without already having a moral reference frame.
Well, consider an analogy from mathematical logic: when you write out a formal proof that 2+2 = 4, at some point in the process, you'll end up concatenating two symbols here and two symbols there to produce four symbols; but this doesn't mean you're appealing to the conclusion you're trying to prove in your proof; it just so happens that your ability to produce the proof depends on the truth of the proposition.
Similarly, when an AI with Morality programmed into it computes the correct action, it just follows the Morality algorithm directly, which doesn't necessarily refer explicitly to "humans" as such. But human programmers had to program the Morality algorithm into the AI in the first place; and the reason they did so is that they themselves were running something related to the Morality algorithm in their own brains. That, as you know, doesn't imply that the AI itself is appealing to "human values" in its actual computation (the Morality program need not make such a reference); but it does imply that the meta-ethical theory used by the programmers compelled them to (in an appropriate sense) look at their own brains to decide what to program into the AI.
On Wei_Dai's complexity of values post, Toby Ord writes:
The kind of moral realist positions that apply Occam's razor to moral beliefs are a lot more extreme than most philosophers in the cited survey would sign up to, methinks. One such position that I used to have some degree of belief in is:
Strong Moral Realism: All (or perhaps just almost all) beings, human, alien or AI, when given sufficient computing power and the ability to learn science and get an accurate map-territory morphism, will agree on what physical state the universe ought to be transformed into, and therefore they will assist you in transforming it into this state.
But most modern philosophers who call themselves "realists" don't mean anything nearly this strong. They mean that there are moral "facts", for varying definitions of "fact" that typically fade away into meaninglessness on closer examination, and actually make the same empirical predictions as antirealism.
Suppose you take up Eliezer's "realist" position. Arrangements of spacetime, matter and energy can be "good" in the sense that Eliezer has a "long-list" style definition of goodness up his sleeve, one that decides even contested object-level moral questions like whether abortion should be allowed or not, and then tests any arrangement of spacetime, matter and energy and notes to what extent it fits the criteria in Eliezer's long list, and then decrees goodness or not (possibly with a scalar rather than binary value).
This kind of "moral realism" behaves, to all intents and purposes, like antirealism.
I might compare the situation to Eliezer's blegg post: it may be that moral philosophers have a mental category for "fact" that seems to be allowed to have a value even once all of the empirically grounded surrounding concepts have been fixed. These might be concepts such as "would aliens also think this thing?", "Can it be discovered by an independent agent who hasn't communicated with you?", "Do we apply Occam's razor?", etc.
Moral beliefs might work better when they have a Grand Badge Of Authority attached to them. Once all the empirically falsifiable candidates for the Grand Badge Of Authority have been falsified, the only one left is the ungrounded category marker itself, and some people like to stick this on their object level morals and call themselves "realists".
Personally, I prefer to call a spade a spade, but I don't want to get into an argument about the value of an ungrounded category marker. Suffice it to say that for any practical matter, the only parts of the map we should argue about are parts that map onto a part of the territory.