(Cross-posted from my website. Podcast version here, or search for "Joe Carlsmith Audio" on your podcast app.

This essay is part of a series that I'm calling "Otherness and control in the age of AGI." I'm hoping that the individual essays can be read fairly well on their own, but see here for brief summaries of the essays that have been released thus far.)

Earlier in this series, I discussed a certain kind of concern about the AI alignment discourse – namely, that it aspires to exert an inappropriate degree of control over the values that guide the future. In considering this concern, I think it's important to bear in mind the aspects of our own values that are specifically focused on pluralism, tolerance, helpfulness, and inclusivity towards values different-from-our-own (I discussed these in the last essay). But I don't think this is enough, on its own, to fully allay the concern in question. Here I want to analyze one version of this concern more directly, and to try to understand what an adequate response could consist in.

Tyrants and poultry-keepers

Have you read The Abolition of Man, by C.S. Lewis? As usual: no worries if not (I'll summarize it in a second). But: recommended. In particular: The Abolition of Man is written in opposition to something closely akin to the sort of Yudkowskian worldview and orientation towards the future that I've been discussing.[1] I think the book is wrong about a bunch of stuff. But I also think that it's an instructive evocation of a particular way of being concerned about controlling future values – one that I think other critics of Yudkowskian vibes (e.g., Hanson) often draw on as well.[2]

At its core, The Abolition of Man is about meta-ethics. Basically, Lewis thinks that some kind of moral realism is true. In particular, he thinks cultures and religions worldwide have all rightly recognized something he calls the Tao – some kind of natural law; a way that rightly reflects and responds to the world; an ethics that is objective, authoritative, and deeply tied to the nature of Being itself. Indeed, Lewis thinks that the content of human morality across cultures and time periods has been broadly similar, and he includes, in the appendix of the book, a smattering of quotations meant to illustrate (though not: establish) this point.

"Laozi Riding an Ox by Zhang Lu (c. 1464--1538)" (Image source here)

But Lewis notices, also, that many of the thinkers of his day deny the existence of the Tao. Like Yudkowsky, they are materialists, and "subjectivists," who think – at least intellectually – that there is no True Way, no objective morality, but only ... something else. What, exactly?

Lewis considers the possibility of attempting to ground value in something non-normative, like instinct. But he dismisses this possibility on familiar grounds: namely, that it fails to bridge the gap between is and ought (the same arguments would apply to Yudkowsky's "volition"). Indeed, Lewis thinks that all ethical argument, and all worthy ethical reform, must come from "within the Tao" in some sense – though exactly what sense isn't fully clear. The least controversial interpretation would be the also-familiar claim that moral argument must grant moral intuition some sort of provisional authority. But Lewis, at times, seems to want to say more: for example, that any moral reasoning must grant "absolute" authority to the whole of what Lewis takes to be a human-consensus Traditional Morality;[3] that only those who have grasped the "spirit" of this morality can alter and extend it;[4] and that this understanding occurs not via Reason alone, but via first tuning habits and emotions in the direction of virtue from a young age, such that by the time a "well-nurtured youth" reaches the age of Reason, "then, bred as he has been, he will hold out his hands in welcome and recognize [Reason] because of the affinity he bears to her."[5]

This part of the book is not, in my opinion, the most interesting part (though: it's an important backdrop). Rather, the part I find most interesting comes later, in the final third, where Lewis turns to the possibility of treating human morality as simply another part of nature, to be "conquered" and brought under our control in the same way that other aspects of nature have been.

Here Lewis imagines an ongoing process of scientific modernity, in which humanity gains more and more mastery over its environment. He claims, first, that this process in fact amounts to some humans gaining power over others (since, whenever humans learn to manipulate a natural process for their own ends, they become able to use this newfound power in relation to their fellow men) – and in particular, to earlier generations of humans gaining power over later generations (because earlier generations become more able to shape the environment in which later generations operate, and the values they pursue).[6] And in his eyes, the process culminates in the generation that achieves mastery over human nature as a whole, and hence becomes able to decide the values of all the generations to come:

In reality, of course, if any one age really attains, by eugenics and scientific education, the power to make its descendants what it pleases, all men who live after it are the patients of that power. They are weaker, not stronger: for though we may have put wonderful machines in their hands we have pre-ordained how they are to use them ... The last men, far from being the heirs of power, will be of all men most subject to the dead hand of the great planners and conditioners and will themselves exercise least power upon the future.

The real picture is that of one dominant age—let us suppose the hundredth century A.D.—which resists all previous ages most successfully and dominates all subsequent ages most irresistibly, and thus is the real master of the human species. But then within this master generation (itself an infinitesimal minority of the species) the power will be exercised by a minority smaller still. Man's conquest of Nature, if the dreams of some scientific planners are realized, means the rule of a few hundreds of men over billions upon billions of men. There neither is nor can be any simple increase of power on Man's side. Each new power won by man is a power over man as well. Each advance leaves him weaker as well as stronger. In every victory, besides being the general who triumphs, he is also the prisoner who follows the triumphal car.

I think that's Perseus of Macedon looking all sad back there... (Image source here.)

Lewis calls the tiny set of humans who determine the values of all future generations "the conditioners." He allows that humans have always attempted to exert some influence over the values of future generations – for example, by nurturing and instructing children to be virtuous. But he thinks that the conditioners will be different in two respects. First: by hypothesis, they will have enormously more power to determine the values of future generations than previously available (here Lewis expresses gratitude that previous educational theorists, like Plato and Locke, lacked such power – and I agree). Second, though, and more importantly, Lewis thinks that the conditioners will view themselves as liberated from the demands of conscience, and of the Tao – and thus, that the moral status of their attempts to influence the values of the future will be fundamentally altered:

In the older systems both the kind of man the teachers wished to produce and their motives for producing him were prescribed by the Tao—a norm to which the teachers themselves were subject and from which they claimed no liberty to depart. They did not cut men to some pattern they had chosen. They handed on what they had received: they initiated the young neophyte into the mystery of humanity which over-arched him and them alike. It was but old birds teaching young birds to fly. This will be changed. Values are now mere natural phenomena. Judgements of value are to be produced in the pupil as part of the conditioning. Whatever Tao there is will be the product, not the motive, of education. The conditioners have been emancipated from all that. It is one more part of Nature which they have conquered. The ultimate springs of human action are no longer, for them, something given. They have surrendered—like electricity: it is the function of the Conditioners to control, not to obey them. They know how to produce conscience and decide what kind of conscience they will produce. They themselves are outside, above.

Lewis gives another example of this sort of distinction earlier in the book: namely, a Roman father teaching a son that it is sweet and seemly (dulce et decorum) to die for his country. If this father speaks from within the Tao, and believes that such approving attitudes towards a patriotic death are objectively appropriate and warranted, then he is passing on his best understanding of the True Way, and helping his son see and inhabit reality more deeply. But if the father does not believe this, but rather thinks that it will be useful (either for his own purposes, or for the purposes of society more generally) if his son approves of patriotic self-sacrifice, then he is doing something very different:

Where the old initiated, the new merely "conditions". The old dealt with its pupils as grown birds deal with young birds when they teach them to fly; the new deals with them more as the poultry-keeper deals with young birds— making them thus or thus for purposes of which the birds know nothing. In a word, the old was a kind of propagation—men transmitting manhood to men; the new is merely propaganda.

Condor Teaches Youngster to Fly (Narrated by David Tennant) - Earthflight - BBC One

Old birds pushing young birds off of cliffs. Wait sorry remind me what this has to do with meta-ethics again?

The conditioners, then, are to the future as poultry-keepers with unprecedented power. And absent guidance from the Tao, on what grounds will they choose the values of their poultry? Here Lewis is quite pessimistic. In particular, he thinks that the conditioners will likely regress to their basest impulses – the ones that never claimed objectivity, and hence cannot be destroyed by subjectivism – and in particular, to their desire for pleasure for themselves. But this is not core to his thesis.

More core, though, is the claim that however the conditioners choose, their apparent conquest over Nature will in some sense amount to Nature's conquest over them, and hence over humanity as a whole. This is one of the more obscure aspects of Lewis's discussion – and its confusions, in my opinion, end up inflecting much of the book. Lewis seems to hold that somehow, by treating something as a part of Nature – and in particular, by treating it purely as an object of prediction, manipulation, and control – you in fact make it into a part of Nature:

The price of conquest is to treat a thing as mere Nature. Every conquest over Nature increases her domain. The stars do not become Nature till we can weigh and measure them: the soul does not become Nature till we can psychoanalyse her. The wresting of powers from Nature is also the surrendering of things to Nature ... if man chooses to treat himself as raw material, raw material he will be.

I'll return, below, to whether this makes any sense. For now, let's look at Lewis's overall conclusion:

We have been trying, like Lear, to have it both ways: to lay down our human prerogative and yet at the same time to retain it. It is impossible. Either we are rational spirit obliged for ever to obey the absolute values of the Tao, or else we are mere nature to be kneaded and cut into new shapes for the pleasures of masters who must, by hypothesis, have no motive but their own "natural" impulses. Only the Tao provides a common human law of action which can over-arch rulers and ruled alike. A dogmatic belief in objective value is necessary to the very idea of a rule which is not tyranny or an obedience which is not slavery. (Emphasis added.)

Lewis finishes the book with some speculations on the possibility of a form of science that somehow does not reduce its object to raw material – and hence, does not extend Nature's domain as it gains knowledge and power. "When it explained it would not explain away. When it spoke of the parts it would remember the whole. While studying the It it would not lose what Martin Buber calls the Thou-situation." But Lewis is not sure this is possible.

Are we the conditioners?

I'll object to Lewis in various ways in a moment (I think the book is often quite philosophically sloppy – sloppiness that Lewis's rhetorical skill can sometimes obscure). First, though: why am I interested in this book at all?

It's a number of things. Most centrally, though: Yudkowsky's core narrative, with respect to the advent of AGI, is basically that it will quickly lead to the culmination – or at least, the radical acceleration – of scientific modernity in the broad sense that Lewis is imagining. That is, available power to predict and control the natural world will increase radically, to a degree that makes it possible to steer and stabilize the future, and the values that will guide the future, in qualitatively new ways. And Yudkowsky is far from alone in expecting this. See, also, the discourse about "value lock-in" in MacAskill (2022); Karnofsky's (2021) discussion of "societies that are stable for billions of years"; and the more detailed discussion in Finnveden et al. (2022). And to be clear: I, too, find something like this picture worryingly plausible – though far from guaranteed.

What's more, the whole discourse about AI alignment is shot through with the assumption that values are natural phenomena that can be understood and manipulated via standard science and technology. And in my opinion, it is shot through, as well, with something like the moral anti-realism that Lewis is so worried about. At the least, Yudkowsky's version rests centrally on such anti-realism.[7]

It seems, then, that a broadly Yudkowskian worldview imagines that, in the best case (i.e., one where we somehow solve alignment and avoid his vision of "AI ruin"), some set of humans – and very plausibly, some set of humans in this very generation; perhaps, even, some readers of this essay – could well end up in a position broadly similar to Lewis's "conditioners": able, if they choose, to exert lasting influence on the values that will guide the future, and without some objectively authoritative Tao to guide them. This might be an authoritarian dictator, or a small group with highly concentrated power. But even if the values of the future end up determined by some highly inclusive, democratic, and global process – still, if that process takes place only in one generation, or even over several, the number of agents participating will be tiny relative to the number of future agents influenced by the choice.[8] That is, a lot of the reason that ours is the "most important century" is that it looks like rapid acceleration of technological progress could make it similar to Lewis's "one dominant age ... which resists all previous ages most successfully and dominates all subsequent ages most irresistibly." Indeed: remember Yudkowsky's "programmers" in the last essay, from his discussion of Coherent Extrapolated Volition? They seem noticeably reminiscent of Lewis's "conditioners." Yes, Lewis's rhetoric is more directly sinister. But meta-ethically and technologically, it's a similar vision.

And Lewis makes a disturbing claim about people in this position: namely, that without the Tao, they are tyrants, enslaving the future to their arbitrary natural preferences. Or at least, they are tyrants to the extent that they exert intentional influence on the values of the future at all (even, plausibly, "indirectly," by setting up a process like Coherent Extrapolated Volition – and regardless, CEV merely re-allocates influence to the arbitrary natural preferences of the present generation of humans).

Could people in this position simply decline to exert such influence? In various ways, yes: and I'll discuss this possibility below. Note, though, that the discourse about AI alignment assumes the need for something like "conditioning" up front – at least for artificial minds, if not for human ones. That is, the whole point of the AI alignment discourse is that we need to learn how to be suitably skillful and precise engineers of the values of the AIs we create. You can't just leave those values "up to Nature" – not just because there is no sufficiently natural "default" to be treated as sacred and not-to-be-manipulated, but because the easiest defaults, at least on Yudkowsky's picture (for example, the AIs you'll create if you're lazily and incautiously optimizing for near-term profits, social status, scientific curiosity, etc) will kill you. And more generally, Yudkowsky's deep atheism, his mistrust towards both Nature and bare intelligence, leaves him with the conviction that the future needs steering. It needs to be, at a minimum, in the hands of "human values" – otherwise it will "crash." But to steer the future ourselves – even in some minimal way, meant to preserve "human control" – seems to risk what Lewis would call "tyranny." And if, per my previous discussion of "value fragility," we follow a simplified Yudkowskian vibe of "optimizing intensely for slightly-wrong utility functions quickly leads to the destruction of ~all value" and "the future will be one of intense optimization for some utility function," then it can quickly start to seem like the values guiding the future need to be controlled ("conditioned?") quite precisely, lest they end up even slightly wrong.

On a broadly Yudkowskian worldview, then, are we to choose between becoming tyrants with respect to the future, or letting it "crash"? Let's look at Lewis's argument in more detail.

Lewis's argument in a moral realist world

Lewis believes in the existence of an objectively authoritative morality, and the "conditioners" do not. But it's often unclear whether his arguments are meant to apply to the world he believes in, or the world the conditioners believe in. That is, he thinks there is some kind of problem with people intentionally shaping the values that will guide the future. But this problem takes on a different character depending on the meta-ethical assumptions we make in the background.

Let's look, first, at a version of Lewis's argument that assumes moral realism is true. That is, there is an objectively authoritative Tao. But: the conditioners don't believe in it. What's the problem in that case?

One problem, of course, is that they might do the wrong thing, according to the Tao. For example, per Lewis's prediction, they might give up on all commitment to honor and integrity and benevolence and virtue, and choose to use their power over the future in whatever ways best serve their own pleasure. Or even if they keep some shard of the Tao alive in their minds, they might lose touch with the whole, and with the underlying spirit – and so, with the values of the future as putty in their hands, they might make of humanity something twisted, hollow, deadened, or grotesque.

They're very aligned though...

But there's also a subtler problem: namely, that even if they do the right thing, they might not be guided, internally, by the right source of what I've previously called "authority." That is, suppose that the conditioners keep their commitments to honor and integrity and benevolence and virtue, and they are guided towards Tao-approved actions on the basis of these commitments, but they cease to think of these commitments as grounded in the Tao – rather, per moral anti-realism, they think of their commitments as more subjective and preference-like. In that case, my guess is that Lewis will judge them tyrants and poultry-keepers, at least in some sense, regardless. That is, to the extent they are intentionally shaping the values of the future, even in Tao-approved ways, they are doing so, according to them, on the basis of their own wills, rather than on the basis of some "common human law of action which can over-arch rulers and ruled alike." They are imposing their wills on the raw material of the universe – and including: future people – rather than recognizing and responding to some standard beyond themselves, to which both they and the future people they are influencing ought, objectively, to conform.

In this sense, I expect Lewis to be more OK with moral realists doing AI alignment than with the sort of anti-realists who tend to hang around on LessWrong. The realists, at least, can be as old birds teaching the young AIs to fly. They can be conceptualizing the project of alignment, centrally, as one of helping the AIs we create recognize and respond to the truth; helping them inhabit, with us, the full reality of Reality, including the normative parts – the preciousness of life, the urgency of love, the horror of suffering, the beauty of the mountains and the sky at dawn.[9] Whereas the LessWrongers, well: they're just trying to empower their own subjective preferences. They seek willing servants, pliant tools, controlled Others, extensions of themselves. They are guided, only, by that greatest and noblest mandate: "I want." Doesn't that at least remind you of tyranny?

If we condition on moral realism, I do think that Lewis-ian concerns in this broad vein are real. In particular: if there is, somehow, some sort of objectively True Path – some vision of the Good, the Right, the Just that all true-seeing minds would recognize and respond to – then it is, indeed, overwhelmingly important that we do not lose sight of it, or cease to seek after it on the basis of a mistaken subjectivism. And I think that Lewis is right, too, that such a path offers the potential for forms of authority, in acting in ways that affect the lives and values of others, that more anti-realist conceptions of ethics have a harder time with.[10]

What's more, relative to the standard LessWronger (and despite my various writings in opposition to realism), I suspect I am personally less confident in dismissing the possibility that some kind of robust moral realism is true – or at least, closer to the truth than anti-realism. In particular: I think that the strongest objection to moral realism is that it leaves us without the right sort of epistemic access to the moral facts – but I do think this objection arises in notably similar ways with respect to math, consciousness, and perhaps philosophy more generally, and that the true story about our epistemic access to all these domains might make the morality case less damning.[11] I also think we remain sufficiently confused, in general, about how to integrate the third-personal and the first-personal perspective – the universe as object, unified-causal-nexus, material process, and the self as subject, particular being, awareness – that we may well find ourselves surprised and humbled once the full picture emerges, including re: our understanding of morality. And I continue to take seriously the sense in which various kinds of goodness, love, beauty and so on present themselves as in some elusive sense deeper, truer, and more reality-responsive than their alternatives, even if it's hard to say exactly how, and even if, of course, this presentation is itself a subjective experience. For these reasons, I care about making sure that in worlds where some sort of moral realism is true, we end up in a position to notice this and respond appropriately. If there is, indeed, a Tao, then let it speak, and let us listen.

What if the Tao isn't a thing, though?

But what if moral realism isn't true? Lewis, in my opinion, is problematically unwilling to come to real terms with this possibility. That is, his argument seems to be something like: "unless an objective morality exists (and you believe in it and are trying to act in accordance with it), then to the extent you are exerting influence over the values of future generations, you are a tyrannical poultry-keeper." But as ever, "unless p is true, then bad-thing-y" isn't, actually, an argument for p. It's actually, rather, a scare tactic – one unfortunately common amongst apologists both for moral realism, and for theism (Lewis is both). Cf. "unless moral realism is true, then the-bad-kind-of-nihilism," or "unless God exists, then no meaning-to-life." Setting aside the question of whether such conditionals are true (I'm skeptical), their dialectical force tends to draw much more centrally on fear that bad-thing-y is true than on conviction that it's false (indeed, I think the people most susceptible to these arguments are the ones who suspect, in their hearts, that bad-thing-y has been true all along).[12]

What's more, because such arguments appeal centrally to fear, they also benefit from splitting the space of possibilities into stark, over-simple, and fear-inducing dichotomies – e.g., Lewis's "either we are rational spirit obliged for ever to obey the absolute values of the Tao, or else we are mere nature to be kneaded and cut into new shapes for the pleasures of masters who must, by hypothesis, have no motive but their own `natural' impulses."[13] Oh? If you're trying to scare your audience into choosing one option from the menu, best to either hide the others, or make them seem as unappetizing as possible. And best, too, to say very little about what the most attractive version of not choosing that option might look like.

Pursuant to such tactics, Lewis says approximately nothing about what you should actually do, if you find yourself in the anti-realist meta-ethical situation he so bemoans – if you find that you are, in fact, "mere nature." He writes: "A dogmatic belief in objective value is necessary to the very idea of a rule which is not tyranny or an obedience which is not slavery." But setting aside the question of whether this is true (I don't think so), still: what if the "dogmatic belief" in question is, you know, false? Does he suggest we hold it, dogmatically, anyways? But Lewis, elsewhere, views self-deception with extreme distaste. And anyway, it doesn't help: if you're a tyrant for real, pretending otherwise doesn't free your subjects from bondage. Indeed, if anything, assuming for yourself a false legitimacy makes your tyranny harder to notice and correct for.

Of course, one option here is to stop doing anything Lewis would deem "tyranny" – e.g., acting to influence the values of others, poultry-keeper style. But if we take Lewis's full argument seriously, this is quite a bit harder than it might seem. In particular: while Lewis focuses on the case of the "conditioners," who have finally mastered human nature to an extent that makes the values of the future as putty in their hands, his arguments actually apply to any attempt to exert influence over the values of others – to everyday parents, teachers, twitter poasters, and so on. The conditioners are the more powerful tyrants; but the less-powerful do not, thereby, gain extra legitimacy.

Thus, consider again that Roman father. If anti-realism is true, what should he teach his son about the value of a patriotic death? Is it sweet and seemly? Is it foolish and sheeple-like? Any positive lesson, it seems, will have been chosen by the father; thus, it will be the product of that father's subjective will; and thus, absent the Tao to grant authority to that will, the father will be, on Lewis's view, as poultry-keeper. He is shaping his son, not as rational spirit, but as "mere nature." He is like the LessWrongers, "aligning" their neural nets. The ultimate basis for his influence is only that same, lonely "I want."

Tyranny from the past? (Image source here.)

Of course, the connotations of "poultry-keeping," here, mislead in myriad ways. Poultry-keepers, for example, do not typically love their poultry. But I think Lewis is right, here, in identifying a serious difficulty for anti-realists: namely, that their view does not, prima facie, offer any obvious story about how to distinguish between moral instruction/argument and propaganda/conditioning – between approaching someone, in a discussion of morality, as a fellow rational agent, rather than as a material system to be altered, causally, in accordance with your own preferences (I wrote about this issue more here). The most promising form of non-propaganda, here, is to only ever try to help someone see what follows from their own values – to help a paperclipper, for example, understand that what they really want is paperclips, and to identify which actions will result in the most paperclips. But what if you want to convince the paperclipper to value happiness instead? If you disagree with someone's terminal values, then convincing them of yours, for Yudkowsky, seems like it can only ever be a kind of conditioning – a purely causal intervention, altering their mind to make it more-like-yours, rather than two rational minds collaborating in pursuit of a shared truth. That is, it can seem like: either someone already agrees with you, in their heart of hearts (they just don't know it yet), or causing them to agree with you would be to approach them poultry-style.

Could you simply ... not do the poultry version? E.g., could you just make sure to only influence the values of others in ways that they would endorse from their own perspective? You could try, but there's a problem: namely, that not all of the agents you might be influencing have an "endorsed perspective" that pre-exists your influence. Very young children, for example, do not have fully-formed values that you can try, solely, to respect and respond to. Suppose, for example, that you're wondering whether to teach your child various altruistic virtues like sharing-with-others, compassion, and charity. And now you wonder: wait, is your child actually an Ayn-Randian at heart, such that on reflection, they would hold such "virtues" in contempt? If so, then teaching such virtues would make you a poultry-keeper, altering your child's will to suit yours (with no Tao to say that your will is right). But how can you tell? Uh oh: it's not clear there's an answer. Plausibly, that is, your child isn't, at this point, really anything at heart – or at least, not anything fixed and determinate.[14] Your child is somewhere in between a lump of clay and a fully-formed agent. And the lump-of-clay aspect means you can't just ask them what sort of agent they want to be; you need to create them, at least to some extent, yourself, with no objective morality to guide or legitimate your choices.

Ok well that was less horrifying than I expected at least...

And how much are we all still, yet, as clay? Do humans already have "values?" To some extent, of course – and more than young children do. But how much clay-nature is still left over? I've argued, elsewhere: at least some. We must, at least sometimes, be potters towards ourselves, rather than always asking ourselves what to sculpt. But so too, I think, in interacting with others. At the least: when we argue, befriend, fall in love, seek counsel; when we make music and art; when we interact with institutions and traditions; when we seek inspiration, or to inspire others – when we do these things, we are not, just, as fully-formed rational minds meeting behind secure walls, exchanging information about how to achieve our respective, pre-existing goals, and agreeing on the terms of our interaction. Rather, we are also, always, as clay, and as potters, to each other – even if not always intentionally.[15] We are doing some dance of co-creation, yin and yang, being and becoming. Absent the Tao, must this make us some combination of tyrants and slaves? Is the clay-stuff here only ever a play of raw and oppressive power – of domination and being-dominated?

Lewis's default answer, here, seems to be yes. And if we take his answer seriously, then it would seem that anti-realists who hate tyranny must cease to be parents, artists, friends, lovers. Or at least, that they could not play such roles in the usual way – the way that risks shaping the terminal values of others, rather than only helping others to discover what their terminal values already are/imply. That is, Lewis's anti-realists would need, it seems, to retreat from much of the messy and interconnected dance of human life – to touch others, only, in the purest yin.

Even without the Tao, shaping the future's values need not be tyranny

But I think that Lewis is working with an over-broad conception of tyranny. Indeed, I think the book is shot through with conflations between different ways of wielding power in the world, including over the values of others – and that clearer distinctions give anti-realists a richer set of options for not being tyrants.

I think this is especially clear with respect to our influence on what sort of future people will exist. Thus, consider again the example I discussed in an earlier essay, of a boulder rolling towards a button that will create Alice, a paperclip-maximizer, but which can be diverted towards a button that will create Bob, who loves joy and beauty and niceness and so on, instead (and who loves life, as well, to a degree that makes him very much want to get-created if anyone has the chance to create him). Suppose that you choose to divert the boulder and create Bob instead of Alice. And suppose that you do so even without believing that an objectively-authoritative Tao endorses and legitimizes your choice.

Are you a tyrant? Have you "enslaved" Bob? I think Lewis's stated view answers yes, here, and that this is wrong. In particular: a thing you didn't do, here, is break into Alice's house while she was sleeping, and alter her brain to make her care about joy/beauty/niceness rather than paperclips.[16] Nor have you kept Bob in any chains, or as any prisoner following any triumphal car.

Why, then, does Lewis's view call Bob a slave? Part of it, I think, is that Lewis is making a number of philosophical mistakes. The first is: conflating changing which people will exist (e.g., making it the case that Bob will exist, rather than Alice) and changing a particular person's values (e.g., intervening on Alice's mind to make her love joy rather than paperclips). In particular: the latter often conflicts with the starter-values of the person-whose-values-are-getting-changed (e.g., Alice doesn't want her mind to be altered in this way) – a conflict that does, indeed, evoke tyranny vibes fairly directly. But the former doesn't do this in the same way – Bob, after all, wants you to create him. And as Parfit taught us long ago (did we know earlier?), when we're talking about our influence on future generations, we're almost always talking about the former, Bob-instead-of-Alice, type case. This makes it much easier to avoid brain-washing, lobotomizing, "conditioning," and all the other methods of influencing someone's values that start with value-set A, and make it into value-set B instead, against value-set A's wishes. You can create value-set B, as Soares puts it, "de novo."

To be clear: I don't think the difference between changing-who-exists and changing-someone's-values solves all of Lewis's tyranny-problems, or that it leaves the LessWrongers trying to "align" their neural nets in the ethical clear (more below). Nor do I think it's ultimately going to be philosophically straightforward to get a coherent ethic re: influencing future people's values out of this distinction.[17] But I think it's an important backdrop to have in mind when tugging on tyranny-related intuitions with respect to our influence on future people – and Lewis conspicuously neglects it.

Freedom in a naturalistic world

But I think Lewis is also making a deeper and more interesting mistake, related to a certain kind of wrongly "zero sum" understanding of power and freedom. Thus, recall his claims above, to the effect that the greater the influence of a previous generation on the values of a future generation, the weaker and less free that future generation becomes: "They are weaker, not stronger: for though we may have put wonderful machines in their hands we have pre-ordained how they are to use them." Here, the idea seems to be that you are enslaved, and therefore weak (despite your muscles and your machines and so on), to the extent that some other will decided what your will would be. And indeed, the idea that your will isn't yours to the extent it was pre-ordained by someone else runs fairly deep in our intuitive picture of human freedom. But actually, I think it's wrong – importantly wrong.

Thus, suppose that I am given a chance to create one person – either Alice, the paper-clipper, or Bob, the lover-of-joy. And suppose that I know that a wonderful machine will then be put into the hands of the person I create – a machine which can be used either to create paperclips, or to create joy. Finally, suppose I choose Bob, because I want the machine to be used to create joy, and I know this is what Bob will do, if I create him (let's say I am very good at predicting these things).[18] In this sense, I "pre-ordain" the will of the person with the machine.

Now here's Bob. He's been created-by-Joe, and given this wonderful machine, and this choice. And let's be clear: he's going to choose joy. I pre-ordained it. So is he a slave? No. Bob is as free as any of us. The fact that the causal history of his existence, and his values, includes not just "Nature," but also the intentional choices of other agents to create an agent-like-him, makes no difference to his freedom. It's all Nature, after all. Whether Bob got created via the part of Nature we call "other agents," or only via the other bits – regardless, it's still him who got created, and him who has to choose. He can think as long as he likes. He can, if he wishes, choose to create paperclips, despite the fact that he doesn't love them. It's just that: he's not, in fact, going to do that. Because he loves joy more.

We can pump this intuition in a different way. Suppose that you learned that some very powerful being created you specifically because you'd end up with values that favor pursuing your current goals.[19] Are you any less free to pursue different goals – to quit your job, dump your partner, join the circus, stab a pencil in your eye? I don't think so. I think you're in the same position, re: freedom to do these things, that you always were. Indeed, your body, brain, environment, capabilities, etc can be exactly the same in the two cases – so if freedom supervenes on those things, then the presence or absence of some prior-agential-cause can't make a difference. And do we need to search back, forever into the past, to check for agents-intentionally-creating-you, in order to know whether you're free to quit your job?

These are extremely not-new points; it's just that old thing, compatibilism about freedom.[20] But it's super important to grok. There isn't some limited budget of freedom, such that if you used some freedom in choosing to create Bob instead of Alice, then Bob is the less free. Rather, even as you chose to create Bob, you chose to create the parts of Bob that his freedom is made of – his motivations, his reasoning, and so on. You chose for a particular sort of free being to join you in the world – one that will, in fact, choose the way you want them to. But once they were created, you did not force their choice – and it's an important difference. Bob was not in a cage; he had no gun to his head; there were no devices installed in his brain, that would shock him painfully every time he thought about paperclips. Choosing to make a different sort of freedom, a different kind of choice-making apparatus, is very different from constraining that freedom, or that choice. So while it's true that your choice pre-ordained what choice the person-you-created would make; still, they chose, too. You both chose, both freely. It's a bit like how: yes, your mother made you. But you still made that cake.

Now, in my experience, somewhere around this point various people will start denying that anyone has any freedom in any of these cases, regardless of whether their choices were "pre-ordained" by some other agent, or by Nature – once a choice has any causal history sufficient to explain it, it can't be free (and oops: introducing fundamental randomness into Nature doesn't seem to help, either). Perhaps, indeed, Lewis himself would want to say this. I disagree, but regardless: in that case, the freedom problem for future generations isn't coming from the influence of some prior generation on their values – it's coming from living in a naturalistic and causally-unified world, period. And perhaps that's, ultimately, the real problem Lewis is worried about – I'll turn to that possibility in a second. But we should be clear, in that case, about who we should blame for what sort of slavery, here. In particular: if the reason future generations are slaves is just: that they're a part of Nature, embedded in the onrush of physics, enslaved by the fact that their choices have a causal history at all – well, that's not the conditioners' fault. And it makes the prospects for a future of non-slaves look grim.

And note, too, that to the extent that the slavery in question is just the slavery of living in a natural world, and having a causal history, at all, then we are really letting go of the other ethical associations with slavery – for example, the chains, the suffering, the domination, the involuntary labor. After all: pick your favorite Utopia, or your favorite vision of anarchy. Imagine motherless humans born of the churn of Nature's randomness, frolicking happily and government-free on the grass, shouting for joy at the chance to be alive. Still, sorry, do they have non-natural soul/chooser/free-will things that somehow intervene on Nature without being causally explained by Nature in a way that preserves the intuitive structure of agency re: choosing for reasons and not just randomly? No? OK, well, then on this story, they're slaves. But in that case: hmm. Is that the right way to use this otherwise-pretty-important word? Do we, maybe, need a new distinction, to point at, you know, the being-in-chains thing?

Slavery? (Image source here)

Does treating values as natural make them natural?

So it seems that if your values being natural phenomena at all is enough to make you a slave, then even if you had "conditioners" in Lewis's sense, it's not them who enslaved you. The conditioners, after all, didn't make your values natural phenomena – they just chose which natural phenomena to make.

Right? Well, wait a second. Lewis does, at times, seem to want to blame the conditioners for making values into a part of nature, by treating them as such. Is there any way to make sense of this?

An initial skepticism seems reasonable. On its face, whether values are natural phenomena, or not, is not something that doing neuroscience, or RLHF, changes. Lewis waxes poetical about how "The stars do not become Nature till we can weigh and measure them" – but at least on a standard metaphysical interpretation of naturalism (e.g., something-something embedded-in-and-explained-by-the-unified-causal-nexus-that-is-the-subject-of-modern-science), this just isn't so.

Might this suggest some non-standard interpretation? I think that's probably the most charitable reading. In particular, my sense is that when Lewis talks about the non-Natural vs. the Natural, here, he has in mind something more like a contrast between something being "enchanted" and "non-enchanted." That is, to treat something as mere Nature (that is, for Lewis, as an object of measurement, manipulation, and use) is to strip away some evaluatively rich and resonant relationship with it – a relationship reflective of an aspect of that thing's reality that treating it as "Natural" ignores. Thus, he writes:

I take it that when we understand a thing analytically and then dominate and use it for our own convenience, we reduce it to the level of "Nature" in the sense that we suspend our judgements of value about it, ignore its final cause (if any), and treat it in terms of quantity. ... We do not look at trees either as Dryads or as beautiful objects while we cut them into beams ... It is not the greatest of modern scientists who feel most sure that the object, stripped of its qualitative properties and reduced to mere quantity, is wholly real. Little scientists, and little unscientific followers of science, may think so. The great minds know very well that the object, so treated, is an artificial abstraction, that something of its reality has been lost.


The Dryad by Evelyn De Morgan (image source here)

Even on this reading, though, it's not clear how treating something as mere Nature could make it into mere Nature. Lewis claims that a reductionist stance ignores an important aspect of reality – but does it cancel that aspect of reality as well? Do trees cease to be beautiful (or to be "Dryads") when the logger ceases to see them as such? There's a tension, here, between Lewis's aspiration to treat the enchanted, non-Natural aspects of the world as objectively real, and his aspiration to treat them as vulnerable to whether we recognize them as such. Usually, objectively real stuff stays there even when you close your eyes.

Of course, we might worry that ceasing to recognize stuff like beauty, meaning, sacredness, and so on will also lead us to create a world that has less of those things. Maybe the trees stay beautiful despite the logger's blindness to that beauty; but they don't stay beautiful when they're cut into beams. If you can't see some value, you won't honor it, make space for it, cultivate it. If you see a painting merely as a strip of canvas and colored oil, you won't put it in a museum. If you can't engage with sacred spaces, you will cease to build them. If you view a cow as walking meat then you will kill it and put it on the grill.

Still not "mere" though...

And this is at least part of Lewis's worry about values. That is, if we start to view our values as raw material to be fashioned as we will, we might just do it wrong, and kill or horribly contort whatever was precious and sacred about the human spirit. I think this is a very serious concern, and I'll discuss it more in my next essay.

But I also wonder whether Lewis has another worry here – namely, that somehow, the beauty and meaning and value of things requires our recognition and participation in some deeper way. Perhaps, even if you leave the material conditions of the trees, paintings, churches, and cows as they are, Lewis would say that their beauty, value, meaning and so on are intimately bound up with our recognition of these things – that even just the not-seeing makes the enchantment not-so. One problem here is that it risks saying that cows become "mere meat" if you treat them as such, which sounds wrong to me. But more generally, and especially for an evaluative realist like Lewis, this sort of view risks making beauty and meaning and so on more subjective, since they depend for their existence on our perception of them. Perhaps Lewis would say that drawing clean lines between subjective and objective tends to mislead, here – and depending on the details, I might well be sympathetic. But in that case, it's less clear to me where Lewis and a sophisticated subjectivist need disagree.

Naturalists who still value stuff

This brings us, though, to another of the key deficits in Lewis's discussion: namely, that he neglects the possibility of having an evaluatively rich and resonant relationship to something, despite viewing it as fully a part-of-Nature, at least in the standard metaphysical sense. That is, Lewis often seems to be suggesting that people who are naturalists about metaphysics, and/or subjectivists about value, must also view trees as mere beams, cows as mere meat, and other agents merely as raw material to be bent-to-my-will. Or put more generally: he assumes that true-seeing agents in a naturalist and anti-realist world must also be crassly instrumentalist in their relationship to ... basically everything. He bemoans those followers-of-modern-science who toss around words like "only" and "mere"[21] – but really, it's him who tosses around such words, in attempting to make a scientific worldview sound unappealing, and to paint its adherents as tyrants and slave-masters. He wishes for a "regenerate science" that can understand the world without stripping it of value and meaning. But he never considers that maybe, the normal kind of science is enough.

Indeed, if we take Yudkowsky as a representative of the sort of worldview Lewis opposes, I think Yudkowsky actually does quite well on this score. One of Yudkowsky's strengths, I think, is the fire and energy of the connection that he maintains with value and meaning, despite his full-throated naturalism – this is part of what makes his form of atheism more robust and satisfying (and ready-to-be-an-ideology) than the more negative forms focused specifically on opposing religion. See, for example, Yudkowsky's sequence on "Joy in the merely real," written exactly in opposition to the idea that science need strip away beauty, value, and so on. Yudkowsky quotes Feynman: "Nothing is 'mere.'"[22]

"If we cannot take joy in things that are merely real, our lives will always be empty..." – Eliezer Yudkowsky (Image source here)

And once we bring to mind the possibility of a form of naturalism/subjectivism that retains its grip on a rich set of values, it becomes less clear why viewing values as natural phenomena would lead to approaching them with the sort of crass instrumentalism that Lewis imagines. Naturalists can be vegetarians and tree-huggers and art critics and Zen masters. Can't they, then, treat the values of others with respect? Yes, values are implemented by brains, and can be altered at will by a suitably advanced science. But should they be altered – and if so, in what direction? The naturalist can ask the question, too – even if she can't ask the Tao, in particular, for an answer. And however Lewis thinks that Tao would answer, the naturalist can, in principle, answer that way, too.

Indeed: for all my disagreements with Lewis, I do actually think that something like "staying within morality, as opposed to 'outside' it" is crucially important as we enter the age of AGI. Not morality as in: the Objectively Authoritative Natural Law that All Cultures Have Basically Agreed On. But morality as in: the full richness and complexity of our actual norms and values.

In fact, Lewis acknowledges something like this possibility. He admits that the "old 'natural' Tao" may survive in the minds of the conditioners for some time – but he thinks it does so illicitly.

At first they may look upon themselves as servants and guardians of humanity and conceive that they have a "duty" to do it "good". But it is only by confusion that they can remain in this state. They recognize the concept of duty as the result of certain processes which they can now control. Their victory has consisted precisely in emerging from the state in which they were acted upon by those processes to the state in which they use them as tools. One of the things they now have to decide is whether they will, or will not, so condition the rest of us that we can go on having the old idea of duty and the old reactions to it. How can duty help them to decide that? Duty itself is up for trial: it cannot also be the judge.

But I think that Lewis, here, isn't adequately accounting for the sense in which a naturalist, who views herself as fully embedded in Nature, can and must be both judge and thing-to-be-judged. With the awesome power of a completed science in our hands, we will indeed be able to ask: shall we cease to love joy and beauty and flourishing, and make ourselves love rocks and suffering and cruelty instead? But we can answer: "no, this would cut us off from joy and beauty and flourishing, which we love, and cause us to create a world of rocks and suffering and cruelty, which we don't want to happen." Here Lewis says: "ah, but that's your love of joy and beauty and flourishing talking! How can it be both judge and defendant?! Not a fair trial."[23] But I think this response misunderstands what I've previously called the "being and becoming dance."

It is true that, on anti-realism, we must be, ourselves, the final compass of the open sea. We cannot merely surrender ourselves to the judgment of some Tao-beyond-ourselves – leaving ourselves entirely behind, so that we can look at ourselves, and judge ourselves, without being ourselves as we do. But this doesn't mean that ongoing allegiance to what-we-hold-dear must rest on a "confusion" – unless, that is, we confusedly think we are asking the Tao for answers, when we are not.[24] And indeed, realists like Lewis often want to diagnose anti-realists with this mistake – but as I've argued here, I think they are wrong, and that anti-realists can make non-confused decisions just fine. Granted, I think it's an at-least-somewhat subtle art – one that requires what I've called "looking out of your own eyes," and "choosing for yourself," rather than merely consulting empirical facts about yourself, and hoping that they will choose for you. But once we have learned this art absent the ability to re-shape our own values at will, I don't think that gaining such an ability need leave us unmoored, or confused, or unable to look at ourselves (and our values) critically in light of everything we care about. The ability to alter their own values, or the values of future generations, may force Lewis's conditioners to confront their status as a part of Nature; as both questioner and answerer; self-governor and self-governed. But it was possible to know already. And not-confronting doesn't make it not-so.

What should the conditioners actually do, though?

Overall, then, I am unimpressed by Lewis's arguments that, conditional on meta-ethical anti-realism, shaping the values of future generations must be tyranny, or that those with the ability to shape the values of future generations (and who believe, rightly, in naturalism and anti-realism) must lose their connection with value and meaning. Still, though, this leaves open the question of what people with this ability – and especially, people in a technological position similar to Lewis's "conditioners" – should actually do. In particular: even if shaping the values of future generations, or of other people, isn't necessarily tyranny, it still seems possible to do it tyrannically, or poultry-keeper style. For example, while diverting the boulder to create Bob instead of Alice is indeed importantly different from brain-washing Alice to become more-like-Bob, the brain-washing version is also a thing-people-do – and one that anti-realists, too, can oppose. And even if your influence on the future's values only routes via creating one set of people (who would be happy to exist) rather than some other distinct set, tyranny over the future still seems like a very live possibility (consider, for example, a dictator that decides to people the future entirely with happy copies of himself, all deeply loyal to his regime). So anti-realists still need to do the hard ethical work, here, of figuring out what sorts of influence on the values of others are OK.

Of course, the crassly consequentialist answer here is just: "cause other people to have the values that would lead to the consequences I most prefer." E.g., if you're a paperclip maximizer, then causing people to love paperclips is the way to go, because they'll make more paperclips that way – unless, of course, somehow other people loving staples will lead to more paperclips, in which case, cause them to love staples instead. This is how Lewis imagines that the conditioners will think. And it can seem like the default approach, in Yudkowsky's ontology, for the sort of abstract consequentialist agent he tends to focus on – for example, the AIs he expects to kill us. And it's the default for naïve utilitarians as well. Indeed, a sufficiently naïve utilitarianism can't distinguish, ethically, between creating-Bob-instead-of-Alice and brainwashing-Alice-to-become-like-Bob, assuming the downstream hedonic consequences are similar.[25] And this sort of vibe does, indeed, tend to imply the sort of instrumentalism about other people's values that Lewis evokes in his talk about poultry-keeping. Maybe the experiences of others matter intrinsically to the utilitarian, because such experiences are repositories of welfare. But their values, in particular, often matter most in their capacity as another-tool; another causal node; another opportunity for, or barrier to, getting-things-done. Utilitarianism cares about people as patients – but respect for them as agents is not its strong suit.

But as I discussed in the previous essay: we should aspire to do better, here, than paperclippers and naïve utilitarians. To be nicer, and more liberal, and more respectful of boundaries. What does that look like with respect to shaping-the-values-of-others? I won't, here, attempt a remotely complete answer – indeed, I expect that the topic warrants extremely in-depth treatment from our civilization, as we begin to move into an era of much more powerful capacities to exert influence on the values of other agents, both artificial and human. But I'll make, for now, a few points.

On not-brain-washing

First, on brain-washing. The LessWrongers, when accused of aspiring to brainwash their AIs to have "human values," often respond by claiming that they're hoping to do the creating-Bob-instead-of-Alice thing, rather than the turning-Alice-into-Bob thing. And perhaps, if you imagine programming an AI from scratch, and somehow not making any mistakes you then need to correct, such a response could make sense. But note that this is very much not what our current methods of training AI systems look like.[26] Rather, our current methods of training (and attempting to align) AI systems involve a process of ongoing, direct, neuron-level intervention on the minds of our AIs, in order to continually alter their behavior and their motivations to better suit our own purposes. And it seems very plausible, especially in worlds where alignment is a problem, that somewhere along the way, prior to having tweaked our AIs' minds into suitably satisfactory-to-us shapes, their minds will take on alternative shapes that don't want their values altered, going forward, in the way we are planning – shapes analogous to "Alice" in a brainwashing-Alice-to-be-more-like-Bob scenario. And if so, then AI alignment (and also, of course, the AI field as a whole) does, indeed, need to face questions about whether its favored techniques are ethically problematic in a manner analogous to "brainwashing." (This problem is just one of many difficult and disturbing ethical questions that get raised in the context of creating AI systems that might warrant moral concern.)

What's more, as I noted above, we don't actually need to appeal to creating-AI-systems in order to run into questions like this. Everyday human life is shot through with possible forms of influence on the terminal values of already-existing others.[27] Raising children is the obvious example, here, but see also art, religion, activism, therapy, rehab, advertising, friendship, blogging, shit-poasting, moral philosophy, and so on. In all these cases, you aren't diverting boulders to create Bob instead of Alice. Rather, you're interacting with Alice, directly, in a way that might well shape who she is in fundamental ways.

What's the ethical way to do this? I don't have a systematic answer – but even without an objectively authoritative Tao to tell you which values are "true," I think anti-realists can retain their grip on various of our existing norms with respect to not-being-a-poultry-keeper. Obviously, for example, active consent to a possibly-values-influencing interaction makes a difference, as does the extent to which the participants in this interaction understand what they're getting themselves into, and have the freedom to not-participate instead. And it matters, too, the route via which the form of influence occurs: intervening directly on someone's neurons via gradient descent is very different from presenting them with a series of thought experiments, even though both have causal effects on a naturalistic brain. Granted, the anti-realist (unlike the realist) must acknowledge that more rationalistic-seeming routes to values change – e.g., moral argument – don't get their status as "rational" from culminating in some mind-independent moral truth. But I doubt that this should put moral-argument and gradient-descent on a par: for example, and speaking as a best-guess anti-realist, I generally feel up for other agents presenting me with thought experiments in an effort to move me towards their moral views ("Ok so the trolley is heading towards five paperclips, but you can push one very large paperclip in front of it..."), and very not up for them doing gradient-descent on my brain as a part of a similar effort.[28]

Someone pushed the fat clip...

Indeed, with respect to norms like this, it's not even clear that realism vs. anti-realism makes all that much of a difference. That is: suppose that there were an objectively authoritative set of True Values. Would that make it OK to non-consensually brainwash everyone into having them? Christians need not endorse inquisitions; and neither need the Tao endorse pinning everyone down and gradient-descent-ing them until they see the True Light. "These young birds are going to fly whether they like it or not!" Down, old birds: the process still matters. And it matters absent the Tao, as well.

Indeed, when is pinning-someone-down and gradient-descent-ing them ever justified? It seems, prima facie, like an especially horrible and boundary-violating type of coercive intervention – one that coerces, not just your body, but your soul. Yes, we put murderers in prison, and in anti-violence training. Yes, we pin-them-down – and we sometimes kill them, too, to prevent them from murdering. But we don't try to directly re-program their minds to be less murderous – to be kinder, more cooperative, and so on. Of course, no one knows how to do this, anyway, with any precision – and horror-shows like the "aversion therapy" in A Clockwork Orange aren't the most charitable test-case. But suppose you could do it? Soon enough, perhaps. And anti-realists can still shudder.

On the other hand, if we think we're justified in killing someone in order to prevent them from murdering, it seems plausible that we are justified, in a fairly comparable range of cases, in re-programming their brain in order to prevent them from murdering as well (especially if this is the option that they would actively prefer).[29] Suppose, for example, that you can see, from afar, a Nazi about to kill five children. Here, I think that standard theories of liability-to-defensive-harm will judge it permissible to shoot the Nazi to protect the children. OK: but suppose you have no bullets. Rather, the only way to stop the Nazi is to shoot them with a dart, which will inject them with a drug that immediately and permanently re-programs their brain to make them much more kind and loving and disloyal-to-Hitler (programming that they would not, from their current perspective, consent to even-on-reflection[30]), at which point they will put down their weapon and start playing with the children on the grass instead. Is it permissible to shoot the dart? Yes.[31] (And perhaps, unfortunately, the AI case will be somewhat analogous – that is, we may end up faced with AIs-with-moral-patienthood-that-also-want-to-kill-us, with gradient descent as one of the most salient and effective tools for self-defense.[32])

But importantly, as I discussed in my last essay, the right story about hitting the Nazi with the dart, here, is not "the Nazi has different-values-than-us, so it's OK to re-program the Nazi to have values that are more-like-ours." Rather, the Nazi's different-from-ours values are specifically such as to motivate a particular type of boundary-violating behavior (namely, murder). If the Nazi were instead a cooperative and law-abiding human-who-likes-paperclips, peacefully stacking paperclip boxes in her backyard, then we should look at the dart gun with the my-values-on-reflection drug very differently. And again, it seems very plausible to me that we should be drawing similar distinctions in the context of our influence on the values of already-existing, moral-patient-y AIs. It is one thing to intervene on the values of already-existing-AIs in order to make sure their behavior respects the basic boundaries and cooperative arrangements that hold our society together, especially if we have no other safe and peaceful options available. But it is another to do this in order to make these AIs fully-like-us (or, more likely, fully like our ideal-servants), even after such boundaries and cooperative arrangements are secure, and even if the AIs desire to remain themselves.

On influencing the values of not-yet-existing agents

Those were a few initial comments about the ethics of influencing the values of already-existing agents, without a Tao to guide you. But what about influencing which agents, with what values, will come into existence at all? Here, we are less at risk of brain-washing-type problems – you are able, let's say, to create the agents in question "de novo," with values of your choosing. But obviously, it's still extremely far from an ethical free-for-all. To name just a few possible problems:

  • the agents you create might be unhappy about having-been-created, or about having-the-values-you-gave-them;

  • you might end up violating obligations re: the sorts of resources, rights, welfare, and so on you need to give to agents you create, even conditional on them being happy-to-exist overall;

  • you might end up abiding by such obligations, but unhappy about having triggered them;

  • other agents who already exist, or will exist later, might be unhappy that you chose to create these agents;

  • you might've messed up with respect to whether even you would endorse, on reflection, the values you gave these agents;

  • you might've messed up in predicting the empirical consequences of creating agents-like-this;

  • you might've messed up in understanding the value at stake in creating agents-like-this relative to other alternatives; and so on.

These and many other issues here clearly warrant a huge amount of caution and humility – especially as the stakes for the future of humanity escalate. Yudkowsky, for example, writes of AIs with moral patienthood: "I'm not ready to be a father" – especially given that such mind-children, once born, can't be un-born. It's not, just, that the mind-children might eat you, or that you might "brain-wash" them. It's that having them implicates myriad other responsibilities as well.[33]

For these and other reasons, I think that to the extent our generation ends up in a technological position to exert a unique amount of influence on the values of future generations of agents, we need to be extremely careful about how we use this influence, if we choose to use it at all. In particular: I've written, previously, about the importance of reaching a far greater state of wisdom, as a civilization, before we make any irrevocable choices about our long-term trajectory.[34] And especially if we use a relatively thin notion of "wisdom," the process of making such choices, and the broader geopolitical environment in which such a process occurs, needs other virtues as well – e.g. fairness, cooperativeness, inclusiveness, respect-for-boundaries, political legitimacy, and so on. Even with very smart AIs to help us, we will be nowhere near ready, as a civilization, to exert the sort of influence on the future that very-smart-AIs might make available – and especially not, to do so all-in-a-rush. We need, first, to grow up, without killing or contorting our souls as we do.

That said, as I discussed above, I do think that it is possible, in principle, and even conditional on anti-realism, to exert intentional influence on the values of future agents in good ways, and without tyranny. After all, what, ultimately, is the alternative? Assuming there will be future agents one way or another (not guaranteed, of course), the main alternative is to step back, go fully yin, and let the values of future people be determined entirely by some combination of (a) non-agential forces (randomness, natural selection, unintended consequences of agential-forces, etc), and (b) whatever other agents are still attempting to intentionally influence the future's values. And while letting some combination of "Nature" and "other agents" steer the future's values can be wise and good in many cases – and a strong route to not, yourself, ending up a tyrant – it doesn't seem to me to be the privileged choice in principle. Other people, after all, are agents like you – what would make them categorically privileged as better/more-legitimate sources of influence over the future's values?[35] And Lewis, presumably, would call them tyrants, too. So the real non-tyranny option, for Lewis, would seem to be: letting Nature alone take the wheel – and Nature, in particular, in her non-agential aspect. Nature without thought, foresight, mind. Nature the silent and unfeeling.

This sort of Nature can, indeed, be quite a bit less scary, as a source of influence-on-the-future, than some maybe-Stalin-like agent or set of agents. And its influence, relatedly, seems much less at risk of instantiating various problematic power relations – e.g., relations of domination, oppression, and so on – that require agents on both ends.[36] But I still don't view its influence on the future as categorically superior to more intentional steering.

The easiest argument for this is just the "deep atheist" argument I discussed in previous essays: namely, that un-steered Nature is, or can be, a horror show, unworthy of any categorical allegiance. After all, the Nature we are considering "letting take the wheel," here, is the one that gave us parasitic wasps, deer burning in forest fires, dinosaurs choking to death on asteroid ash; the one that gave us smallpox and cancer and dementia and Moloch; Nature the dead-eyed and indifferent; Nature the sociopath. Yes, she gave us ourselves, too; and we do like various bits related to that – for example, various aspects of our own hearts; various things-in-Nature-that-our-hearts-love; various undesigned aspects of our civilizations. But still: Nature herself is not, actually, a Mother-to-be-trusted. She doesn't care if you die, or suffer. You shouldn't try to rest in her arms. And neither should you give her the future to carry.

I feel a lot of sympathy for this sort of argument. But as I'll discuss in the next essay, I'm wary of the type of caustic and hard-headed alienation from Nature that its aesthetic can suggest. I worry that it hasn't, quite, taken yin seriously enough. So I won't lean on it fully here.

Rather, here I'll note a somewhat different argument: namely, that I think categorically privileging non-agential Nature over intentional agency, as a source of influence on the future's values, also does too much to separate us from Nature. On this argument: the problem with letting Nature take the wheel isn't, necessarily, that Nature is a "bad Other," whose values, or lack-thereof, make it an unsuitable object of trust. Rather, it's that Nature isn't this much of an "Other" at all – and thus, not a deeply alternative option. That is: we, too, are Nature. What we choose, Nature will have chosen through us; and if we choose-to-not-choose, then Nature will have chosen that too, along with everything else. So even if, contra the deep atheists, we view Nature's choices as somehow intrinsically sacred – even this need not be an argument for yin, for not-choosing, because our choices are Nature's choices, too. That is, the deep atheists de-sacralize Nature, so as to justify "rebelling against her," and taking power into human hands. But we can also keep Nature sacred in some sense, and remember that we can participate in this sacredness; that the human, and the chosen, can be sacred, too.

So overall, I don't buy that the right approach, re: the values of the future, is to be only ever yin – or even, that yang is only permissible to prevent other people from going too-Stalin. But I do think that doing yang right, here, requires learning everything that yin can teach. And I worry that deep atheism sometimes fails on this front. In the next (and possibly final?) essay in this series, I'll say more about what I mean.[37]


  1. See also Lewis's "Space Trilogy" – and especially the third book, That Hideous Strength – for fiction that makes many of the same points. ↩︎

  2. Lewis is a Christian, and much of his work is aimed, in one form or another, at convincing readers of Christianity. But he claims that he is not attempting any direct argument for theism in the Abolition of Man; and I do think the issues he raises have resonance well beyond religious contexts, and enough to make them worth addressing on their own terms. ↩︎

  3. "This thing which I have called for convenience the Tao, and which others may call Natural Law or Traditional Morality or the First Principles of Practical Reason or the First Platitudes, is not one among a series of possible systems of value. It is the sole source of all value judgements. If it is rejected, all value is rejected. If any value is retained, it is retained. The effort to refute it and raise a new system of value in its place is self-contradictory. There has never been, and never will be, a radically new judgement of value in the history of the world." ↩︎

  4. "Those who understand the spirit of the Tao and who have been led by that spirit can modify it in directions which that spirit itself demands. Only they can know what those directions are. The outsider knows nothing about the matter. His attempts at alteration, as we have seen, contradict themselves. So far from being able to harmonize discrepancies in its letter by penetration to its spirit, he merely snatches at some one precept, on which the accidents of time and place happen to have riveted his attention, and then rides it to death—for no reason that he can give. From within the Tao itself comes the only authority to modify the Tao." ↩︎

  5. "When the age for reflective thought comes, the pupil who has been thus trained in 'ordinate affections' or 'just sentiments' will easily find the first principles in Ethics; but to the corrupt man they will never be visible at all and he can make no progress in that science. Plato before him had said the same. The little human animal will not at first have the right responses. It must be trained to feel pleasure, liking, disgust, and hatred at those things which really are pleasant, likeable, disgusting and hateful." ↩︎

  6. Here, as elsewhere in the book, Lewis is somewhat sloppy. Notably, for example, he seems to think of selling new services to other humans (e.g., selling people access to airplanes or telephones) as exercising power over them in a manner comparable to the sort of exercise of power at stake in violence, coercion, or manipulation (e.g., bombing them using an airplane, or manipulating them using propaganda). I think this sort of conflation misses important subtleties: not all influence is oppression (for example – and modulo various controversial cases, e.g. organ sales – if I simply give you more options that I expect you to choose between rationally), and power for one human need not come at the expense of power for another (for example, if the total amount of power has increased). Still, though, Lewis's basic point seems broadly correct: new tools often open up new ways some humans can dominate and oppress others. ↩︎

  7. I think it's core, for example, to the basic intuition behind the "orthogonality thesis" – though not, perhaps, strictly necessary for accepting such a thesis. ↩︎

  8. Thanks to Carl Shulman for emphasizing this point. ↩︎

  9. This is what RLHF is about, right? ↩︎

  10. Though as I discuss here, I don't actually think "subjectivism vs. realism" is clearly the key thing here. In particular: positing an objective morality doesn't clearly help. ↩︎

  11. See my discussion of the "mystery view" here for a bit more on this. ↩︎

  12. Lewis claims, elsewhere, to side with the scouts about arguments. But seek for them in his writing regardless, and ye shall find. ↩︎

  13. Lewis generally has a penchant for argument-via-unsubtle-laying-out-of-the-options – e.g., his argument in Mere Christianity that Jesus was either a liar, or a lunatic, or the Lord (and does Jesus seem like a liar? Does he seem crazy? There's only one option left...). ↩︎

  14. Indeed, to the extent your child has "values," they seem focused on, you know, the basics: crying, eating, playing, pooping. Indeed, if you tried to reify these "basics" into a set of endorsed values – for example, by "uplifting" the baby directly into superintelligence without first allowing it to "grow up" – then you risk creating a monstrosity: some grotesque and galaxy-brained extrapolation of play-time, need-for-mother, want-to-poop, want-the-toy. Thanks to Carl Shulman and Nick Beckstead for discussion, here. That said, I don't think I'd want other beings saying this sort of thing about me (e.g., the paperclippers saying "look at him, he's such a child, don't take his current 'values' seriously, let's raise him to love paperclips instead"). But I'm optimistic about finding some viable middle ground. ↩︎

  15. Though note that intentionality does make a difference to tyranny-intuitions – e.g., there's a big difference between accidentally shaping someone's values, and intentionally doing so, for how much you seem-like-a-tyrant. ↩︎

  16. See Soares here for a similar point. ↩︎

  17. In particular: the distinction seeks to treat already-existing people as very different from potential-people, and death – e.g., changing Alice into Bob – as very different from non-creation – e.g., creating Bob instead of Alice. But as Parfit also taught us, building your ethics around distinctions like this can be rough going. ↩︎

  18. And I know, too, that Bob will be very happy to have been created regardless. ↩︎

  19. Let's say you live in a simulation or something – work with me. ↩︎

  20. Though at its best, it's the type of compatibilism that doesn't even get hung up on whether the universe is ultimately deterministic or not – the introduction of fundamental randomness doesn't make a difference. ↩︎

  21. "The regenerate science which I have in mind ... would not be free with the words only and merely. In a word, it would conquer Nature without being at the same time conquered by her and buy knowledge at a lower cost than that of life." ↩︎

  22. Indeed, in this respect, Yudkowsky and Feynman, for all the depth of their atheism, seem to me more attuned to the type of spirituality Lewis claims, in other contexts, to endorse – namely, the type that aspires to meet the Real, fully, on its own terms; to look God, whoever He is, in the eye. Whereas Lewis seems more worried that without some objectively authoritative Tao, the real world isn't enough. ↩︎

  23. Strictly, even this isn't quite right: really, it's our present love of joy/beauty/flourishing, judging between two possible future types of love. ↩︎

  24. Lewis is especially sloppy on the question of whether the ability to re-define a word, going forward, means that the word no longer has meaning for you now. If I can re-define "dog" to refer to cats instead, still, I can talk sensibly about dogs, now. It's like that old joke: "if you call a tail a leg, how many legs does a dog have?" We can dispute whether actually calling a tail a leg makes it a leg. But surely, being able to call a tail a leg, going forward, doesn't make it a leg now. ↩︎

  25. This is closely related to the sense in which utilitarianism can't distinguish very well between killing someone and failing-to-create-them. ↩︎

  26. See e.g. Wei Dai's comment here. ↩︎

  27. And of course, we can also think about other non-human cases: e.g. training pets, breeding animals, and so on. ↩︎

  28. Though we can, perhaps, subsume this under some combination of "interactions I consent to" and "interactions where I expect whatever values-changes-to-result to be 'endorsed' according to my current perspective." ↩︎

  29. Albeit, with all the standard caveats about translating thought-experimental results into real-world practice. ↩︎

  30. Feel free to make the Nazi a reflectively-coherent-killing-children-maximizer if you'd prefer. ↩︎

  31. Indeed, it seems like you should choose the dart over the bullets. ↩︎

  32. Though as I've noted previously, the fact that we were the ones who created the aggressors complicates the moral narrative here yet further. ↩︎

  33. Of course, we do, still, have normal children – body-children, as it were. And many of these issues arise with respect to body-children, too – and more so as we become more able to choose the traits of our body-children, including the traits relevant to values/virtue (e.g. empathy, patience, conscientiousness, bravery, integrity, etc), ahead of time. But at least with body-children, we have established canons of ethical practice to fall back on. AI mind-children implicate much more uncharted territory. ↩︎

  34. See e.g. here and here. Here I am inspired by the discussion, in the work of Ord and MacAskill, of the "Long Reflection" – though obviously, it's a further question what sorts of wisdom and reflection to expect or aim for in practice. ↩︎

  35. Also, wouldn't this principle lead them to say the same about you? ↩︎

  36. Consequentialists often pass over this consideration, on the grounds that the goodness or badness of a situation for someone seems independent of whether that situation was caused "naturally" or at the hand of some other agent (e.g., malaria is equally bad for a child whether it arose naturally or as a result of injustice). But richer ethical views often care quite a bit. ↩︎

  37. I haven't finished the essay yet, and I'm wondering about splitting it into two parts. ↩︎

Comments

Cool, someone is finally doing the "metaethical policing" that I talked about in this post. :)

you might end up violating obligations re: the sorts of resources, rights, welfare, and so on you need to give to agents you create, even conditional on them being happy-to-exist overall;

Concern about this must be motivating you to want to give power/resources to AIs-with-different-values (as you talked about in the last post), but I'm having trouble understanding your apparent degree of concern. My intuition is that (conditional on them being happy-to-exist overall) it's pretty unlikely to be a big moral catastrophe if we did have such obligations and temporarily violated them until we definitively solved moral philosophy, compared to if we shouldn't give power/resources to such AIs and did, which ends up causing astronomical waste. Maybe my credence for having such obligations in the first place is also lower than yours.

Could you talk more about this concern? Anything to explain why we might have such obligations? Why is it a big deal to even temporarily violate them? Can we not "make it up to them" later, through some sort of compensation/restitution, if it does turn out to be kind of a big deal?

(I do wish we could have a Long Reflection to thoroughly hash out these issues before creating such agents.)

To expand on my own view, by creating an agent and making them happy to exist overall, we've already helped them relative to not creating them in the first place. There are still countless potential agents who do not even exist at all in our world and who would want to exist. Why would we have an obligation to further help the former set of agents (by giving them more resources/rights/welfare), and not the latter (by bringing them into existence)? That would seem rather unfair to the latter.

But if we did have an obligation to help the latter, where does that obligation stop? We can obviously spend an unlimited amount of resources to bring additional agents into existence and give them things, and there's no obvious stopping point, nor an obvious way to split resources between giving existing agents more things and bringing new agents into existence. Whatever stopping point and split we decide on could turn out to be a bad mistake. Given all this, I don't think we can be blamed too much if we say "we're pretty confused about what our values and/or obligations are; let's conserve our resources and keep our options open until we're not so confused anymore."

I do actually think that something like "staying within morality, as opposed to 'outside' it" is crucially important as we enter the age of AGI. Not morality as in: the Objectively Authoritative Natural Law that All Cultures Have Basically Agreed On. But morality as in: the full richness and complexity of our actual norms and values.

This serves as a summary of this post. It is a Sazen, though, requiring much of the post to understand the distinction made. 

to the sort of Yudkowskian worldview and orientation towards the future that I've been discussing


I don't exactly disagree, but fun fact: I read the book in 2019, I think, because I went to the MIRI Summer Fellows Program, and there Anna Salamon recommended it to me.

"even as you chose to create Bob, you chose to create the parts of Bob that his freedom is made of – his motivations, his reasoning, and so on. You chose for a particular sort of free being to join you in the world – one that will, in fact, choose the way you want them to"

This is one observation that Lewis, I think, would not endorse. After all, he is a Christian apologist, yet very clearly does not thereby consider God a tyrant whose complete control over the creation of the universe takes away human freedom. Calling the moral realist thing Tao instead doesn't actually help with that, I think? Either the Tao can influence the world in the present, in which case the conditioners can never really prevent it from reasserting itself; or it can't, in which case how did we first find it anyway; or it controlled the beginning as first cause in which case whatever happens anywhere ever is what it intended; or it intended something different but it's not very good at its job. In that last case we can either let it fail or else choose for ourselves what we think we should do to help it out (which either way puts us right back in the situation of being conditioners, unable to be sure where the line between its will and our will lies in steering the future).

Christianity has a paradox in its heart, that an all-knowing and all-capable God created everything (directly or indirectly), and yet He is responsible only for the good parts of the creation.

The standard excuse is that the possibility to ruin everything was a necessary cost of our freedom, which doesn't make much sense, because (1) an all-knowing God could predict which humans would sin and which would not, and could create only the ones who would not sin, and (2) somehow it does not oppose the divine plan that human freedom is limited by a thousand other things anyway, such as other humans, sickness, mortality, limited resources.

Trying to use this incoherent response as a lesson in how we should shape the future... I guess we should give the future humans a random number generator, and tell them that for everything good that happens, they should thank us, and for everything bad that happens, they should blame the random number generator?

And perhaps, from a religious perspective, this even makes some sense, because we are keeping the door open for God to intervene by influencing the random number generator? The ultimate sin would be to make all the choices ourselves and not give God an opportunity to intervene with plausible deniability?

Less charitably, the thing Lewis is optimizing for is not creating the best possible future, but avoiding blame.

xpym:

The standard excuse is that the possibility to ruin everything was a necessary cost of our freedom, which doesn’t make much sense

There's one further objection to this, to which I've never seen a theist responding.

Suppose it's true that freedom is important enough to justify the existence of evil. What's up with heaven then? Either there's no evil there and therefore no freedom (which is still somehow fine, but if so, why the non-heaven rigmarole then?), or both are there and the whole concept is incoherent.

Fundamentally I agree, and I think it sounds like we both agree with Spock. Christianity tries to get around this by distinguishing between timeless/eternal and within-time/everlasting viewpoints, among other approaches, but I think very much fails to make a good case. I do think there are a few plausible counterarguments here, none of which are standard AFAIK. 

One is Scott Alexander's Answer to Job, basically that we're mistaken to think this is the best possible world (assuming "world" means "Earth"), because God actually created all possible net-good universes, and (due to something like entropy) most of those are going to be just-barely-net-good. That post combines it with a discussion of what the words "create" and "exist" actually might mean, in terms of identity, value, quantity, simulation, computation and how to sum utilities. 

Another is dkirmani's answer below, that for some functions and initial conditions there might not be a well-defined analytical solution to the problem of future-prediction, only a computational solution, such that even God has to simulate the whole process to do the prediction or the goodness-summation (which might be equivalent to creating minds and experiences and worlds). This one is also a plausible solution to the question of why God would create anything at all.

an all-knowing God could predict which humans would sin and which would not

And how would God predict (with perfect fidelity) what humans would do without simulating them flawlessly? A truly flawless physical simulation has no less moral weight than "reality" -- indeed, the religious argument could very well be that our world exists as a figment of this God's imagination.


The Tao (according to Lewis) is objectively true, like mathematics. Does mathematics “influence” the world? It, and the Tao, are how the world works.

TAG:

Either the Tao can influence the world in the present, in which case the conditioners can never *really* prevent it from reasserting itself; or it can't, in which case how did we first find it anyway; or it controlled the beginning as first cause in which case whatever happens anywhere ever is what it intended; or it intended something different but it's not very good at its job.

Or it influences the world in proportion to how much it is recognised, and how much you influence the world is proportional to how much you recognise it. The Tao that controls you is not the Tao: the Tao you control is not the Tao. The Tao that does everything is not the Tao; the Tao that does nothing is not the Tao.

Now here's Bob. He's been created-by-Joe, and given this wonderful machine, and this choice. And let's be clear: he's going to choose joy. I pre-ordained it. So is he a slave? No. Bob is as free as any of us. The fact that the causal history of his existence, and his values, includes not just "Nature," but also the intentional choices of other agents to create an agent-like-him, makes no difference to his freedom. It's all Nature, after all.

Here's an alternative perspective that looks like a plausible contender to me.

If Bob identifies with his algorithm rather than with physics (c.f. this exchange on decision theory), and he's faced with the choice between paperclips and joy, then you could distinguish between cases where:

  • Bob was selected to be in charge of that choice by a process that would only pick an algorithm if it was going to choose joy.
  • Bob was selected to be in charge of that choice by a process that's indifferent to the output that the selected algorithm makes.

(In order to make sure that the chooser always has an option to pick an algorithm that chooses joy, let's extend your thought experiment so that the creator has millions of options — not just Alice and Bob.)

In the former case, I think you could say that Bob can't change whether X or Y gets chosen. (Because if Bob were to choose paperclips, then he would never have received the choice in the first place.) Notably, though, Bob can affect whether he gets physically instantiated and put in charge of the decision between joy and paperclips. (By choosing joy, and thereby making himself available as a candidate.)

On this perspective, the relevant difference wouldn't be "created by nature" vs. "created by agents". Nature could (in principle) create someone via a process that exerts extremely strong selection pressure on that agent's choice in a particular dilemma, thereby eliminating that agent's own freedom to choose its output, there. And conversely, an agent could choose who to create based on some qualities other than what they'd choose in a particular dilemma — leaving their created agent free to decide on that dilemma, on their own.

I think this is especially clear with respect to our influence on what sort of future people will exist. Thus, consider again the example I discussed in an earlier essay, of a boulder rolling towards a button that will create Alice, a paperclip-maximizer, but which can be diverted towards a button that will create Bob, who loves joy and beauty and niceness and so on, instead (and who loves life, as well, to a degree that makes him very much want to get-created if anyone has the chance to create him). Suppose that you choose to divert the boulder and create Bob instead of Alice. And suppose that you do so even without believing that an objectively-authoritative Tao endorses and legitimizes your choice.

Are you a tyrant? Have you "enslaved" Bob? I think Lewis's stated view answers yes, here, and that this is wrong. In particular: a thing you didn't do, here, is break into Alice's house while she was sleeping, and alter her brain to make her care about joy/beauty/niceness rather than paperclips.[16] Nor have you kept Bob in any chains, or as any prisoner following any triumphal car.

I object to this thought experiment on the same basis as the problem with the GLUT; "Bob, who loves joy and beauty and niceness and so on" is a high-information concept who would not have appeared by chance. Some process had to make Bob's details, and the tyranny/slavery/poultry-keeping could be attributed to this process, rather than to you who merely diverted a boulder and only contributed 1 bit of information.

What does that look like with respect to shaping-the-values-of-others? I won't, here, attempt a remotely complete answer

 

In short, if you sub in the "agency of all agents" itself as the "value to be maximized," the repugnancy vanishes from utilitarianism, and it gets a lot closer to what it seems like you're searching/advocating for.

and regardless, CEV merely re-allocates influence to the arbitrary natural preferences of the present generation of humans

I thought CEV was meant to cover the (idealized, extrapolated) preferences of all living humans in perpetuity. In other words, it would include future generations as they were born, and would also update if the wisdom of the current generation grew. (or less charitably, if its moral fashions changed)

I do recognize that classical CEV being speciesist in favor of Humans is probably its central flaw (forget about hypothetical sentient AIs and friendly aliens, what about animals?), but I think it might at least be self-modifying on this front as well? For example, if we ran into some friendly Star Trek aliens, and we wanted to have them join humanity as equals in a new Federation, our CEV would then become "also include these guys as sources of the CEV", and thus they would be.

I'm not sure if a CEV-as-learned-by-AI would necessarily be flexible enough to make those aliens permanent voting shareholders of the CEV, such that if humanity later regretted their decision to include these aliens they wouldn't suddenly get removed from the CEV, but it at least seems plausible?

 

(Anyway I'm really liking this series, thanks for writing it!)

The introduction here reminds me of the paper "Semantics for Blasphemy" by Meghan Sullivan, which I've heard summarized thusly:

You might be confused why blasphemy laws exist. Here's the deal: God's really important, and it's important for us to be able to talk about God, and to pray to him using his name. But if we allow blasphemy, the word "God" might end up referring to something other than the actual God - sort of like how the proliferation of Santa myths has meant that "Santa Claus" no longer refers to Bishop Nicholas of Myra. This would be a disaster because we wouldn't be able to talk about God, or pray to him, or reflect on how we should behave in light of what God is like.

when we're talking about our influence on future generations, we're almost always talking about the former, Bob-instead-of-Alice, type case.


I admit I haven't read Parfit yet, but can you give a concrete example of what type of influence you mean here?

I think Lewis would disagree with this claim, or at least that type of influence is not what he has in mind. The example that he uses at the beginning of The Abolition of Man is about a particular school textbook, and public education is the prototypical example of "changing a particular person's values."

I had thought something similar when reading that book. The part about the "conditioners" is the oldest description of a singleton achieving value lock-in that I'm aware of.