(Note: This comment contains positions which came from my mind without an origin tag attached. I don't remember reading anything by Eliezer which directly disagrees with this, but I don't represent this as anyone's position but my own.)
"Standard" utilitarianism works by defining a separate per-agent utility function to represent each person's preferences, and averaging (or summing) them to produce a composite utility function which every utilitarian is supposed to optimize. The exact details of what the per-agent utility functions look like, and how you combine them, differ from flavor to flavor. However, this structure - splitting the utility function up into per-agent utility functions plus an aggregation function - is wrong. I don't know what a utility function that fully captured human values would look like, but I do know that it can't be split and composed this way.
It breaks down most obviously when you start varying the number of agents; in the variant where you sum up utilities, an outcome where many people live lives just barely worth living seems better than an outcome where fewer people live amazingly good lives (but we actually prefer the latter); in the variant where you average utilities, an outcome where only one person exists but he lives an extra-awesome life is better than an outcome where many people lead merely-awesome lives.
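Both failure modes above are just arithmetic, so they can be made concrete with toy numbers. This is an illustrative sketch with made-up utility values, not anyone's actual utility function:

```python
# Toy illustration (made-up numbers): sum- and average-aggregation
# rank the same pairs of outcomes in opposite, counterintuitive ways.

def total(utils):
    """Sum-utilitarian aggregate."""
    return sum(utils)

def average(utils):
    """Average-utilitarian aggregate."""
    return sum(utils) / len(utils)

few_great = [90.0] * 10          # few people, amazingly good lives
many_marginal = [1.0] * 1000     # many people, lives just barely worth living

# Sum-aggregation prefers the huge marginal population (1000 > 900)...
assert total(many_marginal) > total(few_great)
# ...even though average-aggregation (and intuition) prefer the small happy one.
assert average(few_great) > average(many_marginal)

one_superb = [100.0]             # a single person with an extra-awesome life
many_awesome = [90.0] * 1000     # many people with merely-awesome lives

# Average-aggregation prefers the lone extra-awesome person...
assert average(one_superb) > average(many_awesome)
# ...even though sum-aggregation prefers the many awesome lives.
assert total(many_awesome) > total(one_superb)
```

The numbers are arbitrary; the point is that any choice of per-agent values with this rough shape produces the same reversals.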
Split-agent utility functions are also poorly equipped to deal with the problem of weighing agents against each other. If there's a scenario where one person's utility function diverges to infinity, then both sum- and average-utility aggregation claim that it's worth sacrificing everyone else to make sure that happens (the "utility monster" problem).
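The utility-monster case can be sketched the same way. Here the monster's utility is merely very large rather than infinite, which is already enough to dominate both aggregation rules (again, illustrative numbers only):

```python
# Toy "utility monster" (made-up numbers): one agent gains so much utility
# from resources that both sum- and average-aggregation favor giving it
# everything, at everyone else's expense.

def total(utils):
    return sum(utils)

def average(utils):
    return sum(utils) / len(utils)

# Option A: share resources evenly among 100 agents.
shared = [10.0] * 100
# Option B: give everything to the monster; the other 99 get nothing.
monster = [10.0**6] + [0.0] * 99

# Both aggregation rules prefer feeding the monster.
assert total(monster) > total(shared)      # 1,000,000 > 1,000
assert average(monster) > average(shared)  # 10,000 > 10
```

As the monster's utility grows without bound, no finite number of ordinary agents can outweigh it under either rule.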
And the thing is, writing a utility function that captures human values is a hard and unsolved problem, and splitting it up by agent doesn't actually bring us any closer; defining the single-agent function is just as hard as defining the whole thing.
in the variant where you sum up utilities, an outcome where many people live lives just barely worth living seems better than an outcome where fewer people live amazingly good lives (but we actually prefer the latter);
Are you sure of this? It sounds a lot like scope insensitivity. Remember, lives barely worth living are still worth living.
If there's a scenario where one person's utility function diverges to infinity, then both sum- and average-utility aggregation claim that it's worth sacrificing everyone else to make sure that happens (the "utility monster" problem).
Again, this seems like scope insensitivity.
In You Provably Can't Trust Yourself, Eliezer tried to figure out why his audience didn't understand his meta-ethics sequence even after they had followed him through philosophy of language and quantum physics. Meta-ethics is my specialty, and I can't figure out what Eliezer's meta-ethical position is. And at least at this point, professionals like Robin Hanson and Toby Ord couldn't figure it out, either.
Part of the problem is that because Eliezer has gotten little value from professional philosophy, he writes about morality in a highly idiosyncratic way, using terms that would require reading hundreds of posts to understand. I might understand Eliezer's meta-ethics better if he would just cough up his positions on standard meta-ethical debates like cognitivism, motivation, the sources of normativity, moral epistemology, and so on. Nick Beckstead recently told me he thinks Eliezer's meta-ethical views are similar to those of Michael Smith, but I'm not seeing it.
If you think you can help me (and others) understand Eliezer's meta-ethical theory, please leave a comment!
Update: This comment by Richard Chappell made sense of Eliezer's meta-ethics for me.