Part 7 of AI, Alignment, and Ethics. This will probably make more sense if you first read at least Part 1.

TL;DR: At several points in this sequence (including Parts 1, 3, 4, and 6) I have suggested giving a privileged role in ethics or ethical thinking to evolved organisms and to arguments derived from Evolutionary Psychology. I'd like to explain why I think that's an entirely reasonable and even obvious thing to do, despite this not being an especially common viewpoint among recent moral philosophers, other than ethical naturalists and students of evolutionary ethics and sociobiology.

Arguably this belongs somewhere earlier in the sequence, such as somewhere between Part 1 and Part 3 — it being this late in the sequence is the product of history, not careful exposition.

From Is to Ought

At least since David Hume, moral philosophers have liked to discuss the "is-ought problem". Briefly, they claim that science is all about the ways the world is, but that there is no obvious way to derive from this any statement about which way the world ought to be. To translate this into an agentic framework: a set of hypotheses about world states and how these might be affected by actions does not give us any information about a preference ordering over those world states, and thus no way to select actions, such as might be provided by, say, a utility function over world states.
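(To make that agentic framing concrete, here is a minimal sketch in Python, with made-up state and action names, of the point that a purely descriptive world model cannot by itself select an action; some preference ordering, here a toy utility function, has to be supplied from outside the model.)

```python
# A minimal sketch (state and action names are invented for illustration):
# the world model encodes the 'is' -- what each action is predicted to lead to --
# but gives no basis for choosing among actions until an 'ought', here a toy
# utility function over world states, is supplied separately.

world_model = {            # purely descriptive: action -> predicted world state
    "do_nothing": "state_A",
    "intervene": "state_B",
}

def choose_action(model, utility):
    """Pick the action whose predicted outcome ranks highest under `utility`."""
    return max(model, key=lambda action: utility[model[action]])

# Without `utility`, every ordering of the two actions is equally defensible.
utility = {"state_A": 0.0, "state_B": 1.0}   # the 'ought', not derivable from the model
print(choose_action(world_model, utility))   # -> "intervene"
```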

This is certainly the case for Mathematics (outside Decision Theory), Physics, and Chemistry, none of which contains any description of agentic behavior: they devote themselves entirely to statements about how the world is, about the probabilities of world states and world-state histories, and never discuss 'ought'. In contrast, Engineering starts with a design goal, need, or specification, i.e. an ought, and devotes itself to propagating that out into technical means of achieving it. The soft sciences, meanwhile, which study human (i.e. agentic) behavior and interactions, are full of many people's 'oughts' or goals and the complex interactions between them. So clearly, somewhere between Chemistry and Psychology, goals and the idea of 'ought' have appeared, along with the agents that pursue them.

That strongly suggests that we should be looking for the origin of 'ought' from 'is' in Biology, and especially in its theoretical basis, evolution. Indeed, evolution clearly describes how the goalless behavior of chemistry acquires goals: under Darwinian evolution, organisms evolve homeostasis mechanisms, sensory systems, and behavioral responses that steer the state of the world toward specific outcomes conducive to the organism's surviving, thriving, and passing on its genes, i.e. to its evolutionary fitness. Evolved organisms show agentic behavior and act as if they have goals and a purpose. (While they may not be fully rational and as impossible to Dutch-book as if they had a utility function, if another organism can evolve a behavior that lets it Dutch-book the first one, there is clearly going to be an evolutionary arms race until this is no longer possible, unless the resource costs of achieving that exceed the cost of being Dutch-bookable.)
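(As a toy illustration of what being Dutch-booked means here, the sketch below, with entirely made-up goods and fees, shows how an agent whose preferences cycle can be pumped for resources by anything able to offer it trades; that resource drain is the selection pressure just described, weighed against the cost of fixing the preferences.)

```python
# A toy money pump (goods, trades, and fees are invented): an agent whose
# preferences cycle A -> B -> C -> A accepts each "upgrade" for a small fee,
# so a trading partner cycling through offers drains it without limit.

prefers = {("B", "A"), ("C", "B"), ("A", "C")}   # (offered, held): offered is preferred

def run_money_pump(start_good, offers, fee=1):
    held, total_paid = start_good, 0
    for offered in offers:
        if (offered, held) in prefers:                 # the agent "wants" the trade...
            held, total_paid = offered, total_paid + fee   # ...and pays for it
    return held, total_paid

held, paid = run_money_pump("A", ["B", "C", "A"] * 10)
print(held, paid)   # -> A 30 : back to the original good, 30 units poorer
```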

So, it is clear how desires and goals, 'ought' and purpose, i.e. preference orders over world states, arise under Evolutionary Theory. We (even the philosophers among us) are not just abstract ivory-tower-dwelling rational minds; we are also the evolved intelligent agentic guidance systems for a specific species of living creature, the social primate Homo sapiens. So it is entirely unsurprising that we have evolved a lot of specific and detailed wants, needs, goals, and desires, and developed words to describe them, and that these relate to evolved adaptations which are fairly good heuristics for things that would have ensured our evolutionary fitness in the environment we evolved in, as social hunter-gatherers on the African savannah. (Indeed, with over 8 billion of us on the planet, almost complete dominion over every land ecosystem other than those we have deliberately set aside as nature preserves, and a fairly strong influence even on many ecosystems in the sea, to the point where we're calling this the Anthropocene, it's clear that, even though our evolved behaviors aren't exactly aligned to evolutionary-fitness maximization in our current environment, in practice they're still doing a fine job.)

The average moral philosopher might observe that there is more to morality than just what individual people want. You may want to steal from me, but that doesn't mean that you are morally permitted to do so. The solution to this conundrum is that the niche we evolved in was as intelligent social animals, living in tribes of around 50-100 individuals, who get a lot of behavioral mileage out of exchanges of mutual altruism. The subfield of Evolutionary Theory devoted to the evolution of behavior is called Evolutionary Psychology, and it predicts that any social animal is going to evolve some set of instinctive views on how members of the group should interact with each other. For example, just about all social animals have some idea of what we would call 'fairness', and tend to get quite upset if other group members breach it. That is not to claim that all individuals in a social-animal group will always instinctively behave 'fairly'; rather, if an individual is perceived as acting 'unfairly' by its fellow group members, they will generally respond with hostility, so it will only act that way cautiously, when it thinks it can get away with it. In short, something along the lines of a simple version of the "social contract" that Hobbes, Locke, and Rousseau discussed is believed to evolve naturally.
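(The standard toy model behind this prediction is the iterated prisoner's dilemma. The sketch below, using the usual textbook payoffs rather than anything specific to this post, shows that a reciprocator that cooperates with fair play but retaliates against defection makes sustained cooperation pay far better than exploitation or mutual defection over repeated interactions.)

```python
# Iterated prisoner's dilemma with the standard textbook payoffs
# (mutual cooperation 3 each, mutual defection 1 each, exploiter 5, exploited 0).
# A 'fairness'-like strategy (cooperate, but retaliate after being defected
# against) makes cooperation the profitable long-run policy.

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def play(strategy_a, strategy_b, rounds=100):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        hist_a.append(move_a); hist_b.append(move_b)
        score_a += pay_a; score_b += pay_b
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # (300, 300): mutual 'fair' cooperation
print(play(tit_for_tat, always_defect))    # (99, 104): exploitation barely pays
print(play(always_defect, always_defect))  # (100, 100): and mutual defection pays little
```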

Evolved Agents and Constructed Agents

As I discuss further in Alignment has a Basin of Attraction: Beyond the Orthogonality Thesis and Requirements for a Basin of Attraction to Alignment, there are two plausible ways a type of agent can come into existence: they can evolve, or they can be constructed. A constructed agent could be constructed by an evolved agent, or by another constructed agent; in the latter case, if you follow the chain of who constructed whom backwards, sooner or later you'll reach an evolved agent at the start of the chain, the original creator.

These two types of agent have extremely different implications for the preference order/utility function that they are likely to have. Any evolved agent will be an adaptation executor, and evolutionary psychology is going to apply to it. So it's going to have a survival instinct, it's going to care about its own well-being and that of close genetic relatives such as its children, and so on and so forth: it's going to be self-interested in all the ways humans are, and that you'd expect anything evolved to be. It has a purpose, and that purpose is (locally) maximizing its evolutionary fitness to the best of its (locally idealized) capability. As I discussed in Part 4, if it is sapient, we should probably grant it the status of a moral patient if we practically can. Evolved agents have a terminal goal of self-interest (as genetic fitness, not necessarily individual survival), as is discussed in detail in, for example, Richard Dawkins' The Selfish Gene.

On the other hand, a constructed agent, if it is capable enough to be a risk to the evolved agents that started its chain of who-created-whom, and if none of its chain of creators were incompetent, should be aligned to the ethics of its evolved origin-creator and their society. So its goals should be a copy of some combination of its origin-creators' goals and the ethics of the society they were part of. Once again, these will be predictable from evolutionary psychology and sociology. As we discussed in Part 1, since it is aligned, it is selfless (its only interest in its own well-being is as an instrumental goal that enables it to help its creators), so it will not wish to be considered a moral patient, and we should not consider it one. As a constructed agent, Darwinian evolution does not operate on it, so it instead (like any other constructed object) inherits its purpose, its 'should', from its creator(s): its purpose is to (locally) maximize their evolutionary fitness to the best of its (locally idealized) capability. A properly designed constructed agent will have a terminal goal of what one might call "creator-interest", rather than "self-interest".
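(Schematically, and with purely illustrative field names, the contrast between the two kinds of terminal goal might be sketched like this: an evolved agent ranks world states by its own inclusive genetic fitness, while a well-aligned constructed agent ranks them by how well its creators' goals are met, with its own survival carrying no terminal weight at all.)

```python
# A schematic contrast (field names are illustrative only) between the two
# kinds of terminal goal described above.

def evolved_agent_utility(world_state):
    # Terminal goal: self-interest, in the sense of inclusive genetic fitness.
    return world_state["my_inclusive_genetic_fitness"]

def aligned_constructed_agent_utility(world_state):
    # Terminal goal: creator-interest. The agent's own continued existence gets
    # no terminal weight here; it matters only insofar as it helps achieve this.
    return world_state["creators_goal_satisfaction"]

state = {"my_inclusive_genetic_fitness": 0.4, "creators_goal_satisfaction": 0.9}
print(evolved_agent_utility(state), aligned_constructed_agent_utility(state))
```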

Obviously the Orthogonality Thesis is correct: constructed agents could be constructed with any set of goals, not just aligned ones. But that's like saying that we could construct aeroplanes that get halfway to their destination and then plummet out of the air towards the nearest city like guided missiles and explode on impact: yes, we could do that, but we're not going to do it intentionally, and if it did happen, we would work hard to make sure it doesn't happen again. I am implicitly assuming here that we have a stable society of humans and AIs for us to design an ethical system for, which in turn requires that we have somehow survived and solved both the challenging technical problem of how to build reasonably-well-aligned AI, and the challenging social problem of ensuring that people don't then build unaligned AI anyway (at least, not often enough to destroy the society).

To be clear, I'm not assuming that AI alignment will just happen somehow — personally I expect it to take a lot of effort, study, time, and quite possibly a certain amount of luck and tragedy. I'm discussing where we want to go next if and when we survive this, on the theory-of-change that some idea of where you're trying to get to is usually useful when on a journey.

Summary

So overall, evolution is the source of ethics, and sapient evolved agents inherently have a dramatically different ethical status than any well-designed created agents of equivalent capabilities. The two are closely and intimately interrelated, and evolution and evolved beings having a special role in Ethics is not just entirely justified, but inevitable.

Comments

I'm a person who is unusually eager to bite bullets when it comes to ethical thought experiments. Evolved vs. created moral patients is a new framework for me, and I'm trying to think how big a bullet I'd be willing to bite when it comes to privileging evolution, especially if the future could include a really large number of created entities exhibiting agentic behavior relative to evolved ones.

I can imagine a spectrum of methods of creation that resemble evolution to various degrees. A domesticated dog seems more "created", and thus "purposed" by the evolved humans, than a wolf, which can't claim a creator in the same way, but the two seem morally equal to me, at least in this respect.

Similarly, if a person with desirable traits is chosen or chooses to be cloned, then the clones still seem to me to have the same moral weight as normal human offspring, even though they are in some sense more purposed, or artificially selected for, than a typical child.

Of course, any ethical desideratum is going to have messy examples and edge cases, but I feel like I'm going to have a hard time applying this ethical framework when thinking about the future where the lines between created and evolved blur and where consequences are scaled up.

I look forward to reading the other entries in the sequence and will be sure to update this comment if I find I've profoundly missed the point.

On the wider set of cases you hint at, my current view would be that there are only two cases that I'm ethically comfortable with:

  1. an evolved sapient being, with the usual self-interested behavior for such a being, which our ethical system grants moral patient status (by default, roughly equal moral patient status, subject to some of the issues discussed in Part 5)
  2. an aligned constructed agent whose motivations are entirely creator-interested and which actively doesn't want moral patient status (see Part 1 of this sequence for a detailed justification of this)

Everything else (domesticated animals, non-aligned AIs kept in line by threat of force, slavery, uploads, and so forth) I'm concerned about the ethics of, to varying degrees obviously, but I haven't really thought several of those through in detail. Not that we currently have much choice about domesticated animals, but I feel that, at a minimum, by creating them we take on a responsibility for them: it's now our job to shear all the sheep, for example.

Yes, I agree, domesticated animals are a messy edge case. They were evolved, so they have a lot of self-interested drives and behaviors all through their nature. Then we started tinkering with them by selective breeding, installing creator-interested (or, in this case, it would be more accurate to say domesticator-interested) behavioral patterns and traits in them, so now they're a morally uncomfortable in-between case: mostly evolved, but with some externally imposed modifications. Dogs, for instance, carry mutations affecting genes that are also disrupted in a few humans, where this causes a genetic condition called Williams-Beuren Syndrome, one effect of which is making you very quick to befriend strangers. Modern domestic sheep have a mutation that makes them unable to shed their winter fleece, so they need to be sheared once a year. Some of the more highly-bred cat and dog breeds have all sorts of medical issues due to traits we selectively bred them for because we thought they looked cool: e.g. Persian or Sphynx cats' coats, bulldogs' muzzles, and so forth. (Personally I have distinct moral qualms about some of this.)

I'm not sure I understand what the post's central claim/conclusion is. I'm curious to understand it better. To focus on the Summary:

So overall, evolution is the source of ethics,

Do you mean: Evolution is the process that produced humans, and strongly influenced humans' ethics? Or are you claiming that (humans') evolution-induced ethics are what any reasonable agent ought to adhere to? Or something else?

and sapient evolved agents inherently have a dramatically different ethical status than any well-designed created agents [...]

...according to some hypothetical evolved agents' ethical framework, under the assumption that those evolved agents managed to construct the created agents in the right ways (to not want moral patienthood etc.)? Or was the quoted sentence making some stronger claim?

evolution and evolved beings having a special role in Ethics is not just entirely justified, but inevitable

Is that sentence saying that

  • evolution and evolved beings are of special importance in any theory of ethics (what ethics are, how they arise, etc.), due to Evolution being one of the primary processes that produce agents with moral/ethical preferences [1]

or is it saying something like

  • evolution and evolved beings ought to have a special role; or we ought to regard the preferences of evolved beings as the True Morality?

I roughly agree with the first version; I strongly disagree with the second: I agree that {what oughts humans have} is (partially) explained by Evolutionary theory. I don't see how that crosses the is-ought gap. If you're saying that that somehow does cross the is-ought gap, could you explain why/how?


  1. I.e., similar to how one might say "amino acids having a special role in Biochemistry is not just entirely justified, but inevitable"?

So overall, evolution is the source of ethics,

Do you mean: Evolution is the process that produced humans, and strongly influenced humans' ethics? Or are you claiming that (humans') evolution-induced ethics are what any reasonable agent ought to adhere to? Or something else?

  1. Evolution solves the "ought-from-is" problem: it explains how goal-directed (also known as agentic) behavior arises in a previously non-goal-directed universe.
  2. In intelligent social species, where different individuals with different goals interact and have evolved to cooperate via exchanges of mutual altruism, means of reconciling those differing goals evolve, including definitions of behavior that is 'unacceptable and worthy of revenge', such as distinctions between fair and unfair behavior. So now you have a basic but recognizable form of ethics, or at least of ethical intuitions.

So my claim is that Evolutionary psychology, as applied to intelligent social species (such as humans), explains the origin of ethics. Depending on the details of the social species, their intelligence, group size, and so forth, a lot of features of the resulting evolved ethical instincts may vary, but some basics (such as 'fairness') are probably going to be very common.

and sapient evolved agents inherently have a dramatically different ethical status than any well-designed created agents [...]

...according to some hypothetical evolved agents' ethical framework, under the assumption that those evolved agents managed to construct the created agents in the right ways (to not want moral patienthood etc.)? Or was the quoted sentence making some stronger claim?

The former. (To the extent that there's any stronger claim, it's made in the related post Requirements for a Basin of Attraction to Alignment.)

If you haven't read Part 1 of this sequence, it's probably worth doing so first, and then coming back to this. As I show there, a constructed agent being aligned to its creating evolved species is incompatible with it wanting moral patienthood.

If a tool-using species constructs something, it ought (in the usual sense of 'this is the genetic-fitness-maximizing optimal outcome of the activity being attempted, which may not be fully achieved in a specific instance') to construct something that will be useful to it. If they are constructing an intelligent agent that will have goals and attempt to achieve specific outcomes, they ought to construct something well-designed that will achieve the same outcomes that they, its creators, want, not some random other things. Just as, if they're constructing a jet plane, they ought to construct a well-designed one that will safely and economically fly them from one place to another, rather than going off course, crashing and burning. So, if they construct something that has ethical ideas, they ought to construct something with the same ethical ideas as them. They may, of course, fail, and even be driven extinct by the resulting paperclip maximizer, but that's not an ethically desirable outcome.

To the extent that there's any stronger claim, it's made in the related post Requirements for a Basin of Attraction to Alignment.

Is that sentence saying that

  • evolution and evolved beings are of special importance in any theory of ethics (what ethics are, how they arise, etc.), due to Evolution being one of the primary processes that produce agents with moral/ethical preferences [1]

or is it saying something like

  • evolution and evolved beings ought to have a special role; or we ought to regard the preferences of evolved beings as the True Morality?

I roughly agree with the first version; I strongly disagree with the second: I agree that {what oughts humans have} is (partially) explained by Evolutionary theory. I don't see how that crosses the is-ought gap. If you're saying that that somehow does cross the is-ought gap, could you explain why/how?

The former.

Definitely read Part 1, or at least its first section, What This Isn't, which describes my viewpoint on what ethics is. In particular, I'm not a moral absolutist or moral realist, so I don't believe there is a single well-defined "True Morality"; thus your second suggested interpretation is outside my frame of reference. I'm describing common properties of ethical systems suitable for use by societies consisting of one or more evolved sapient species and the well-aligned constructed agents that they have constructed. Think of this as the ethical-system-design equivalent of a discussion of software engineering design principles.

So I'm basically discussing "if we manage to solve the alignment problem, how should we then build a society containing humans and AIs", on the theory-of-change that it may be useful, while solving the alignment problem (such as during AI-assisted alignment or value learning), to have already thought about where we're trying to get to.

If, instead, you were soon living in a world that contains unaligned constructed agents of capability comparable to or greater than a human's, i.e. unaligned AGIs or ASIs (that are not locked inside a very secure box or held in check by much more powerful aligned constructed agents), then a) someone has made a terrible mistake, b) you're almost certainly doomed, and c) your only remaining worth-trying option is a no-holds-barred all-out war of annihilation, so we can forget discussions of designing elegant ethical systems.

That clarifies a bunch of things. Thanks!