Yvain comments on A Brief Overview of Machine Ethics - Less Wrong

6 Post author: lukeprog 05 March 2011 06:09AM

Comments (90)

Comment author: Yvain 05 March 2011 04:34:21PM * 14 points

Allen - Prolegomena to Any Future Moral Agent places a lot of emphasis on figuring out if a machine can be truly moral, in various metaphysical senses like "has the capacity to disobey the law, but doesn't" and "deliberates in a certain way". Not only is it possible that these are meaningless, but in a superintelligence the metaphysical implications should really take second place to the not-getting-turned-into-paperclips implications.

He proposes a moral Turing Test, where we call a machine moral if it can answer moral questions indistinguishably from a human. But Clippy would also pass this test, if a consequence of passing was that the humans lowered their guard/let him out of the box. In fact, every unfriendly superintelligence with a basic knowledge of human culture and a motive would pass.

Utilitarianism is considered difficult to implement because it's computationally impossible to predict all consequences. Given that any AI worth its salt would have a module for predicting the consequences of its actions anyway, and that the potential danger of the AI is directly related to how good this module is, that seems like a non-problem. It wouldn't be perfect, but it would do better than humans, at least.

Deontology: same problem as the last one. Virtue ethics seems problematic depending on the AI's motivation - if it were motivated to turn the universe into paperclips, would it be completely honest about it, kill humans quickly and painlessly and with a flowery apology, and declare itself to have exercised the virtues of honesty, compassion, and politeness? Evolution would give us something at best as moral as humans and probably worse - see the Sequence post about the tanks in cloudy weather.

Still not impressed.

Comment author: Yvain 05 March 2011 04:46:11PM * 9 points

Mechanized Deontic Logic is pretty okay, despite the dread I had because of the name. I'm no good at formal systems, but as far as I can understand, it looks like a logic for proving some simple results about morality: the example they give is "If you should see to it that X, then you should see to it that you should see to it that X."
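
For readers who prefer symbols, the quoted result can be written roughly as follows. This is only a sketch: the notation is an assumption borrowed from stit-style ("sees to it that") deontic logics, not copied from the paper, with $\odot[\alpha\ \mathit{stit}\colon X]$ read as "agent $\alpha$ ought to see to it that $X$".

```latex
\[
  \odot[\alpha\ \mathit{stit}\colon X]
  \;\rightarrow\;
  \odot\bigl[\alpha\ \mathit{stit}\colon \odot[\alpha\ \mathit{stit}\colon X]\bigr]
\]
```

In words: an obligation to bring about $X$ implies an obligation to bring it about that one has that obligation.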

I can't immediately see a way this would destroy the human race, but that's only because it's nowhere near the point where it involves what humans actually think of as "morality" yet.

Comment author: Yvain 05 March 2011 05:03:18PM * 11 points

Utilibot Project is about creating a personal care robot that will avoid accidentally killing its owner by representing the goal of "owner health" in a utilitarian way. It sounds like it might work for a robot with a very small list of potential actions (like "turn on stove" and "administer glucose") and a very specific list of owner health indicators (like "hunger" and "blood glucose level"), but it's not very relevant to the broader Friendly AI program.
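
To make the "small list of actions, specific list of indicators" picture concrete, here is a minimal sketch of what such a utilitarian action chooser might look like. Everything in it is hypothetical and for illustration only: the indicator targets, the weights, the predicted effects, and the names `Action`, `utility`, `predict` and `choose` are assumptions, not anything taken from the Utilibot paper.

```python
from dataclasses import dataclass, field

# Hypothetical target values for each owner health indicator (made-up numbers).
TARGETS = {"hunger": 0.0, "blood_glucose": 100.0}
# Hypothetical weights: the cost of one unit of deviation from the target.
WEIGHTS = {"hunger": 1.0, "blood_glucose": 0.1}


@dataclass
class Action:
    name: str
    # Predicted change to each indicator if the action is taken (assumed known).
    effects: dict = field(default_factory=dict)


def utility(state):
    """Owner-health utility: negative weighted distance from the targets."""
    return -sum(WEIGHTS[k] * abs(state[k] - TARGETS[k]) for k in TARGETS)


def predict(state, action):
    """Return the predicted state after applying the action's effects."""
    return {k: v + action.effects.get(k, 0.0) for k, v in state.items()}


def choose(state, actions):
    """Pick the action whose predicted outcome has the highest utility."""
    return max(actions, key=lambda a: utility(predict(state, a)))


ACTIONS = [
    Action("do_nothing"),
    Action("turn_on_stove", {"hunger": -3.0}),
    Action("administer_glucose", {"blood_glucose": 40.0}),
]

if __name__ == "__main__":
    owner = {"hunger": 5.0, "blood_glucose": 55.0}
    print(choose(owner, ACTIONS).name)
```

The sketch only works because both lists are tiny and the effects are assumed to be known in advance, which is exactly why this kind of design does not obviously scale to an agent with an open-ended action space.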

Having read as many papers as I have time to before dinner, my provisional conclusion is that Vladimir Nesov hit the nail on the head.