(2) The semantic tricks merely shift the lump under the rug; they don't get rid of it. Standard worries about relativism re-emerge, e.g. an agent can know a priori that their own fundamental values are right, given how the meaning of the word 'right' is determined. This kind of infallibility (even merely 'fundamental' infallibility) seems implausible.
EY bites this bullet in the abstract, but notes that it does not apply to humans. An AI with a simple utility function and full ability to analyze its own source code can be quite sure that maximizing that function is the meaning of "that-AI-right" in the sense EY is talking about.
But there is no analogue to that situation in human psychology, given how much we now know about self-deception, our conscious and unconscious mental machinery, and the increasing complexity of our values the more we reflect on them. We can, it's true, say that "the correct extrapolation of my fundamental values is what's right for me to do", but this doesn't settle whether any particular value X is a member of that set. The actual work of extrapolating human values (through moral arguments and other methods) still has to be done.
So practical objections to this sort of bullet-biting don't apply to this metaethics; are there any important theoretical objections?
EDIT: Changed "right" to "that-AI-right". Important clarification.
"An AI with a simple utility function and full ability to analyze its own source code can be quite sure that maximizing that function is the meaning of 'that-AI-right' in the sense EY is talking about."
I don't think that's right, or EY's position (I'd like evidence on that). Who's to say that maximization is precisely what's right? That might be a very good heuristic, but upon reflection the AI might decide to self-improve in a way that changes this subgoal (of the overall decision problem that includes all the other decision-making parts), by f...
In You Provably Can't Trust Yourself, Eliezer tried to figure out why his audience didn't understand his meta-ethics sequence even after they had followed him through philosophy of language and quantum physics. Meta-ethics is my specialty, and I can't figure out what Eliezer's meta-ethical position is. At least at this point, professionals like Robin Hanson and Toby Ord couldn't figure it out, either.
Part of the problem is that because Eliezer has gotten little value from professional philosophy, he writes about morality in a highly idiosyncratic way, using terms that would require reading hundreds of posts to understand. I might understand Eliezer's meta-ethics better if he would just cough up his positions on standard meta-ethical debates like cognitivism, motivation, the sources of normativity, moral epistemology, and so on. Nick Beckstead recently told me he thinks Eliezer's meta-ethical views are similar to those of Michael Smith, but I'm not seeing it.
If you think you can help me (and others) understand Eliezer's meta-ethical theory, please leave a comment!
Update: This comment by Richard Chappell made sense of Eliezer's meta-ethics for me.