The Urgent Meta-Ethics of Friendly Artificial Intelligence

lukeprog

Barring a major collapse of human civilization (due to nuclear war, asteroid impact, etc.), many experts expect the intelligence explosion Singularity to occur within 50-200 years.

That fact means that many philosophical problems, about which philosophers have argued for millennia, are suddenly very urgent.

Those concerned with the fate of the galaxy must say to the philosophers: "Too slow! Stop screwing around with transcendental ethics and qualitative epistemologies! Start thinking with the precision of an AI researcher and solve these problems!"

If a near-future AI will determine the fate of the galaxy, we need to figure out what values we ought to give it. Should it ensure animal welfare? Is growing the human population a good thing?

But those are questions of applied ethics. More fundamental are the questions about which normative ethics to give the AI: How would the AI decide if animal welfare or large human populations were good? What rulebook should it use to answer novel moral questions that arise in the future?

But even more fundamental are the questions of meta-ethics. What do moral terms mean? Do moral facts exist? What justifies one normative rulebook over another?

The answers to these meta-ethical questions will determine the answers to the questions of normative ethics, which, if we are successful in planning the intelligence explosion, will determine the fate of the galaxy.

Eliezer Yudkowsky has put forward one meta-ethical theory, which informs his plan for Friendly AI: Coherent Extrapolated Volition. But what if that meta-ethical theory is wrong? The galaxy is at stake.

Princeton philosopher Richard Chappell worries about how Eliezer's meta-ethical theory depends on rigid designation, which in this context may amount to something like a semantic "trick." Previously and independently, an Oxford philosopher expressed the same worry to me in private.

Eliezer's theory also employs something like the method of reflective equilibrium, about which there are many grave concerns from Eliezer's fellow naturalists, including Richard Brandt, Richard Hare, Robert Cummins, Stephen Stich, and others.

My point is not to beat up on Eliezer's meta-ethical views. I don't even know if they're wrong. Eliezer is wickedly smart. He is highly trained in the skills of overcoming biases and properly proportioning beliefs to the evidence. He thinks with the precision of an AI researcher. In my opinion, that gives him large advantages over most philosophers. When Eliezer states and defends a particular view, I take that as significant Bayesian evidence for reforming my beliefs.

Rather, my point is that we need lots of smart people working on these meta-ethical questions. We need to solve these problems, and quickly. The universe will not wait for the pace of traditional philosophy to catch up.

Such as? Where is this [deontological] machinery?

I was referring, for instance, to the point that there are evolutionary reasons why we'd expect to find (as we do) that an understanding of deontological injunctions is fairly universal among humans.

EY's theory, linked in the first post, that deontological injunctions evolved as some sort of additional defense against black swan events does not appear especially convincing to me. The cortex is intrinsically predictive and consequentialist at a low level, but simple deontological rules are vast computational shortcuts.

An animal brain learns the hard way, the way AIXI does: thoroughly consequentialist at first. But once predictable pattern matches are learned at higher levels, they can sometimes be reduced to simpler rules for quick decisions.

Even non-verbal animals find ways to pass down some knowledge to their offspring, but in humans this is vastly amplified through language.

Every time a parent tells a child what to do, the parent is transmitting complex consequentialist results down to the younger mind in the form of simpler cached deontological behaviors. For example, it would be painful for the child to learn firsthand the consequentialist account of why stealing is detrimental (the tribe will punish you).

Once this machinery was in place, it could extend over generations and develop into more complex cultural and religious deontologies. All of this can be accomplished through cortical reinforcement learning as the child develops.
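To make the "cached rule" idea concrete, here is a minimal Python sketch. It is only a toy under stated assumptions: the action names, the payoffs, and the helpers `tribe_simulation`, `distill_rules`, and `child_decides` are all invented for illustration and are not a model of cortical learning. It shows a slow consequence evaluation being distilled once into permit/forbid rules that a second agent can apply without ever running the evaluation.

```python
from typing import Callable, Dict, Iterable


def tribe_simulation(action: str) -> float:
    """Slow 'consequentialist' evaluation: net payoff of an action.

    The payoffs are assumed for illustration: stealing gives a small
    gain but a large punishment from the tribe.
    """
    payoffs = {"steal": 1.0 - 5.0, "share": 0.5, "work": 1.0}
    return payoffs[action]


def distill_rules(
    actions: Iterable[str],
    simulate: Callable[[str], float],
    threshold: float = 0.0,
) -> Dict[str, bool]:
    """Run the expensive evaluation once per action and cache only the
    verdict (permitted or forbidden), discarding the reasoning."""
    return {action: simulate(action) >= threshold for action in actions}


def child_decides(action: str, rules: Dict[str, bool]) -> bool:
    """Apply the inherited rule directly; no simulation is run here."""
    return rules[action]


# The parent pays the cost of learning the consequences...
parent_rules = distill_rules(["steal", "share", "work"], tribe_simulation)

# ...and the child receives only the cheap lookup table.
for action in ["steal", "share", "work"]:
    print(action, "permitted" if child_decides(action, parent_rules) else "forbidden")
```

The design point is that `child_decides` is a constant-time lookup, while `tribe_simulation` stands in for whatever costly experience produced the verdicts in the first place.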

Feral children, for all intents and purposes, act like feral animals. Human minds are cultural/linguistic software phenomena.

Not to mention that conveying a concept to a human carries no instructions; programming concepts into an AI is all instructions.

I'm not aware of any practical approach to AI which consists of programming concepts directly into an AI. All modern approaches program only the equivalent of an empty brain; the concepts and the resulting mind form through learning.

Human concepts are expressed in natural language, and for an AGI to compete with humans it will need to learn extant human knowledge. Learning natural language thus seems like the most practical approach.

"Expected utility maximisation" is, by definition, what actually represents our best outcome. To the extent that it doesn't, it is a failure of our ability to grasp and apply the concept, not a failure in the concept itself.

The problem is this: if we define an algorithm to represent our best outcome and use that as the standard of rationality, and the algorithm's predictions then differ significantly from actual human decisions, is it a problem with the algorithm or the human mind?

If we had an algorithm that represented a human mind perfectly, then that mind would always be rational by that definition.
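For concreteness, the standard being debated can be written down in a few lines. The following Python sketch assumes a finite set of outcomes with known probabilities. The utility numbers, the two actions, and the `observed_human_choice` value are hypothetical, chosen only to show how the algorithm's recommendation and a human decision can come apart.

```python
from typing import Dict


def expected_utility(lottery: Dict[str, float], utility: Dict[str, float]) -> float:
    """Sum of probability-weighted utilities over the outcomes of one action."""
    return sum(prob * utility[outcome] for outcome, prob in lottery.items())


def best_action(actions: Dict[str, Dict[str, float]], utility: Dict[str, float]) -> str:
    """Return the action whose lottery has the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name], utility))


# Hypothetical utilities and lotteries, invented for illustration.
utility = {"nothing": 0.0, "small": 1.0, "large": 3.0}
actions = {
    "safe": {"small": 1.0},                    # certain small gain
    "gamble": {"large": 0.5, "nothing": 0.5},  # coin flip for a large gain
}

recommended = best_action(actions, utility)

# Assumed observation: a risk-averse person picks the sure thing.
observed_human_choice = "safe"

print("algorithm recommends:", recommended)
print("human chose:", observed_human_choice)
print("disagreement:", recommended != observed_human_choice)
```

Nothing in the code settles which side is at fault when the two disagree. The mismatch could mean the human is irrational, or that the assumed `utility` table fails to capture what the human actually values.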

Even if deontological injunctions are only transmitted through language, they are based on human predispositions (read: brain wiring) to act morally and cooperate, which have evolved.

This applies somewhat to animals too; there has been research on altruism in animals.
