What is Eliezer Yudkowsky's meta-ethical theory?

lukeprog

In You Provably Can't Trust Yourself, Eliezer tried to figure out why his audience didn't understand his meta-ethics sequence even after they had followed him through philosophy of language and quantum physics. Meta-ethics is my specialty, and I can't figure out what Eliezer's meta-ethical position is. And at least at this point, professionals like Robin Hanson and Toby Ord couldn't figure it out, either.

Part of the problem is that because Eliezer has gotten little value from professional philosophy, he writes about morality in a highly idiosyncratic way, using terms that would require reading hundreds of posts to understand. I might understand Eliezer's meta-ethics better if he would just cough up his positions on standard meta-ethical debates like cognitivism, motivation, the sources of normativity, moral epistemology, and so on. Nick Beckstead recently told me he thinks Eliezer's meta-ethical views are similar to those of Michael Smith, but I'm not seeing it.

If you think you can help me (and others) understand Eliezer's meta-ethical theory, please leave a comment!

Update: This comment by Richard Chappell made sense of Eliezer's meta-ethics for me.

But maybe I misunderstand how he arrives at the belief that "wrong" can be a one-place predicate.

Yeah. While I'm reasonably confident that he holds the belief, I have no confidence in any theory of how he arrives at that belief.

What I have gotten from his writing on the subject is a combination of "Well, it sure seems that way to me," and "Well, if that isn't true, then I don't see any way to build a superintelligence that does the right thing, and there has to be a way to build a superintelligence that does the right thing." Neither of which I find compelling.

But there's a lot of the metaethics sequence that doesn't make much sense to me at all, so I have little confidence that what I've gotten out of it is a good representation of what's there.

It's also possible that I'm completely mistaken and he simply insists on "right" as a one-place predicate as a rhetorical trick; a way of drawing the reader's attention away from the speaker's role in that computation.

If that is the case I don't see how different agents could arrive at the same perception of right and wrong, if their preferences are fundamentally opposing, given additional computation.

I am fairly sure EY would say (and I agree) that there's no reason to expect them to. Different agents with different preferences will have different beliefs about right and wrong, possibly incorrigibly different.

Humans and Babykillers as defined will simply never agree about how the universe would best be ordered, even if they come to agree (as a political exercise) on how to order the universe, without the exercise of force (as the SHFP propose to do, for example).

(if right and wrong designate future world states).

Um.

Certainly, this model says that you can order world-states in terms of their rightness and wrongness, and there might therefore be a single possible world-state that's most right within the set of possible world-states (though there might instead be several possible world-states that are equally right and better than all other possibilities).

If there's only one such state, then I guess "right" could designate a future world state; if there are several, it could designate a set of world states.

But this depends on interpreting "right" to mean maximally right, in the same sense that "cold" could be understood to designate absolute zero. These aren't the ways we actually use these words, though.

If you just argue that we don't have free will because what is right is logically implied by cause and effect,

I don't see what the concept of free will contributes to this discussion.

I'm fairly certain that EY would reject the idea that what's right is logically implied by cause and effect, if by that you mean that an intelligence that started out without the right values could somehow infer, by analyzing causality in the world, what the right values were.

My own jury is to some degree still out on this one. I'm enough of a consequentialist to believe that an adequate understanding of cause and effect lets you express all judgments about right and wrong action in terms of more and less preferable world-states, but I cannot imagine how you could derive "preferable" from such an understanding. That said, my failure of imagination does not constitute a fact about the world.

Humans and Babykillers are not talking about the same subject matter when they debate what-to-do-next, and their doing different things does not constitute disagreement.
