Waser's 3 Goals of Morality

mwaser

In the spirit of Asimov’s 3 Laws of Robotics

You should not be selfish
You should not be short-sighted or over-optimize
You should maximize the progress towards and fulfillment of all conscious and willed goals, both in terms of numbers and diversity equally, both yours and those of others equally

It is my contention that Yudkowsky’s CEV converges to the following 3 points:

I want what I want
I recognize my obligatorily gregarious nature; realize that ethics and improving the community is the community’s most rational path towards maximizing the progress towards and fulfillment of everyone’s goals; and realize that to be rational and effective the community should punish anyone who is not being ethical or improving the community (even if the punishment is “merely” withholding help and cooperation)
I shall, therefore, be ethical and improve the community in order to obtain assistance, prevent interference, and most effectively achieve my goals

I further contend that, if this CEV is translated to the 3 Goals above and implemented in a Yudkowskian Benevolent Goal Architecture (BGA), that the result would be a Friendly AI.

It should be noted that evolution and history say that cooperation and ethics are stable attractors while submitting to slavery (when you don’t have to) is not. This formulation expands Singer’s Circles of Morality as far as they’ll go and tries to eliminate irrational Us-Them distinctions based on anything other than optimizing goals for everyone — the same direction that humanity seems headed in and exactly where current SIAI proposals come up short.

Once again, cross-posted here on my blog (unlike my last article, I have no idea whether this will be karma'd out of existence or not ;-)

This is too confused to follow as a human, and much too confused to program an AI with.

Also ambiguity aside, (2) is just bad. I'm having trouble imagining a concrete interpretation of "don't over-optimize" that doesn't reduce to "fail to improve things that should be improved". And while short-sightedness is a problem for humans who have trouble modelling the future, I don't think AIs have that problem, and there are some interesting failure modes (of the destroys-humanity variety) that arise when an AI takes too much of a long view.

Kingfisher's definition of clarity is actually not quite right. In order to be clear, you have to carve reality at the joints. That's what the problem was with the Intelligence vs. Wisdom post; there wasn't anything obviously false, at least that I noticed, but it seemed to be dividing up concept space in an unnatural way. Similarly with this post. For example, "selfish" is a natural concept for humans, who have a basic set of self-centered goals by default, which they balance against non-self-centered goals like improving their community. But if you take that definition and try to transfer it to AIs, you run into trouble, because they don't have those self-centered goals, so if you want to make sense of it you have to come up with a new definition. Is an AI that optimizes the happiness of its creator, at the expense of other humans, being selfish? How about the happiness of its creator's friends, at the expense of humanity in general? How about humanity's happiness, at the expense of other terrestrial animals?

Using fuzzy words in places where they don't belong hides a lot of complexity. One way that people respond to that is by coming up with things that the words could mean, and presenting them as counterexamples. You seem to have misinterpreted that as presenting straw-men; it's not saying that the best interpretation is wrong, but rather, saying that the phrasing was vague enough to admit some bad interpretations.

I would also like to add that detecting confusion, both in our own thoughts and in things we read, is one of the main skills of rationality. People here are, on average, much more sensitive to confusion than most people.

LESSWRONG
LW

LESSWRONG
LW

-16

Waser's 3 Goals of Morality

-16

-16