A Brief Overview of Machine Ethics

lukeprog

Earlier, I lamented that even though Eliezer named scholarship as one of the Twelve Virtues of Rationality, there is surprisingly little interest in (or citing of) the academic literature on some of Less Wrong's central discussion topics.

Previously, I provided an overview of formal epistemology, that field of philosophy that deals with (1) mathematically formalizing concepts related to induction, belief, choice, and action, and (2) arguing about the foundations of probability, statistics, game theory, decision theory, and algorithmic learning theory.

Now, I've written Machine Ethics is the Future, an introduction to machine ethics, the academic field that studies the problem of how to design artificial moral agents that act ethically (along with a few related problems). There, you will find PDFs of a dozen papers on the subject.

Enjoy!

You seem to be under the impression that Eliezer is going to create an artificial general intelligence, and oversight is necessary to ensure that he doesn't create one which places his goals over humanity's interests. It is important, you say, that he is not allowed unchecked power. This is all fine, except for one very important fact that you've missed.

Eliezer Yudkowsky can't program. He's never published a nontrivial piece of software, and doesn't spend time coding. In the one way that matters, he's a muggle. Ineligible to write an AI. Eliezer has not positioned himself to be the hero, the one who writes the AI or implements its utility function. The hero, if there is to be one, has not yet appeared on stage. No, Eliezer has positioned himself to be the mysterious old wizard - to lay out a path, and let someone else follow it. You want there to be oversight over Eliezer, and Eliezer wants to be the oversight over someone else to be determined.

But maybe we shouldn't trust Eliezer to be the mysterious old wizard, either. If the hero/AI programmer comes to him with a seed AI, then he knows it exists, and finding out that a seed AI exists before it launches is the hardest part of any plan to steal it and rewrite its utility function to conquer the universe. That would be pretty evil, but would "transparency and oversight" make things turn out better, or worse? As far as I can tell, transparency would mean announcing the existence of a pre-launch AI to the world. This wouldn't stop Eliezer from making a play to conquer the universe, but it would present that option to everybody else, including at least some people and organizations who are definitely evil.

So that's a bad plan. A better plan would be to write a seed AI yourself, keep it secret from Eliezer, and when it's time to launch, ask for my input instead.

(For the record: I've programmed in C++, Python, Java, wrote some BASIC programs on a ZX80 when I was 5 or 6, and once very briefly when MacOS System 6 required it I wrote several lines of a program in 68K assembly. I admit I haven't done much coding recently, due to other comparative advantages beating that one out.)

XiXiDu

I disagree based on the following evidence: [...] You further write: [...] I'm not aware of any reason to believe that recursively self-improving artificial general intelligence is going to be something you can 'run away with'. Some people here seem to think that there will be some kind of algorithm for intelligence, simple in hindsight, that people can just run to get superhuman intelligence. Indeed, transparency could be very dangerous in that case. But that doesn't mean it is an all-or-nothing decision. There are many other reasons for transparency, including reassurance and the ability to tell a trickster or impotent individual from someone who deserves more money. But as I said, I don't expect that scenario anyway. It will more likely be a blue sheet of different achievements, none of them dangerous on its own. I further think it will be not just a software solution but also a conceptual and computational revolution. In those cases an open approach will allow public oversight. And even if someone is going to run with it, you want them to use your solution rather than one that will most certainly be unfriendly.
