Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Zombies Redacted

33 Eliezer_Yudkowsky 02 July 2016 08:16PM

I looked at my old post Zombies! Zombies? and it seemed to have some extraneous content.  This is a redacted and slightly rewritten version.


Patternist friendly AI risk

1 bokov 12 September 2013 01:00PM

It seems to me that most AI researchers on this site are patternists in the sense of believing that the anti-zombie principle necessarily implies:

1. That it will eventually become possible *in practice* to create uploads or sims that are close enough to our physical instantiations that their utility to us would be interchangeable with that of our physical instantiations.

2. That we know (or will know) enough about the brain to know when this threshold is reached.


But, like any rationalists extrapolating from unknown unknowns... or heck, extrapolating from anything... we must admit that one or both of the above statements could be wrong without also making friendly AI impossible. What would be the consequences of such error?

I submit that one such consequence could be an FAI that is also wrong on these issues. Not only would we fail to check for such a failure mode, but its answer would look to us like what we expect the right answer to look like, because we are making the same error.

If simulation/uploading really does preserve what we value about our lives then the safest course of action is to encourage as many people to upload as possible. It would also imply that efforts to solve the problem of mortality by physical means will at best be given an even lower priority than they are now, or at worst cease altogether because they would seem to be a waste of resources.


Result: people continue to die and nobody including the AI notices, except now they have no hope of reprieve because they think the problem is already solved.

Pessimistic Result: uploads are so widespread that humanity quietly goes extinct, cheering themselves onward the whole time.

Really Pessimistic Result: what replaces humanity are zombies, not in the qualia sense but in the real sense that there is some relevant chemical/physical process that is not being simulated because we didn't realize it was relevant or hadn't noticed it in the first place.


Possible Safeguards:


* Insist on quantum level accuracy (yeah right)


* Take seriously the general scenario of your FAI going wrong because you are wrong in the same way and fail to notice the problem.


* Be as cautious about destructive uploads as you would be about, say, molecular nanotech.


* Make sure your knowledge of neuroscience is at least as good as your knowledge of computer science and decision theory before you advocate digital immortality as anything more than an intriguing idea that might not turn out to be impossible.


DRAFT:Ethical Zombies - A Post On Reality-Fluid

0 MugaSofer 09 January 2013 01:38PM

I came up with this after watching a science fiction film, which shall remain nameless due to spoilers, where the protagonist is briefly in a similar situation to the scenario at the end. I'm not sure how original it is, but I certainly don't recall seeing anything like it before.

Imagine, for simplicity, a purely selfish agent. Call it Alice. Alice is an expected utility maximizer, and she gains utility from eating cakes. Omega appears and offers her a deal - they will flip a fair coin, and give Alice three cakes if it comes up heads. If it comes up tails, they will take one cake away from her stockpile. Alice runs the numbers, determines that the expected utility is positive, and accepts the deal. Just another day in the life of a perfectly truthful superintelligence offering inexplicable choices.
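
Alice's arithmetic for the first deal can be sketched as below. This is only an illustration, assuming utility is linear in cakes (one cake gained or lost is one unit of utility), which the post does not state explicitly; the helper name `expected_utility` is made up for this sketch.

```python
from fractions import Fraction


def expected_utility(outcomes):
    """Return the sum of probability * utility over (probability, utility) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if total_probability != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


# Assumption: utility is linear in cakes, so +3 cakes is +3 and -1 cake is -1.
coin_flip_deal = [
    (Fraction(1, 2), 3),   # heads: Omega gives Alice three cakes
    (Fraction(1, 2), -1),  # tails: Omega takes one cake from her stockpile
]

coin_flip_value = expected_utility(coin_flip_deal)
print(coin_flip_value)  # 1
```

Under that assumption the deal is worth one cake in expectation, which is why Alice accepts.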

The next day, Omega returns. This time, they offer a slightly different deal - instead of flipping a coin, they will perfectly simulate Alice once. This copy will live out her life just as she would have done in reality - except that she will be given three cakes. The original Alice, however, receives nothing. She reasons that this is equivalent to the last deal, and accepts.


(If you disagree, consider the time between Omega starting the simulation and providing the cake. What subjective odds should she give for receiving cake?)

Imagine a second agent, Bob, who gets utility from Alice getting utility. One day, Omega shows up and offers to flip a fair coin. If it comes up heads, they will give Alice - who knows nothing of this - three cakes. If it comes up tails, they will take one cake from her stockpile. He reasons as Alice did and accepts.

Guess what? The next day, Omega returns, offering to simulate Alice and give her you-know-what (hint: it's cakes.) Bob reasons just as Alice did in the second paragraph there and accepts the bargain.

Humans value each other's utility. Most notably, we value our lives, and we value each other not being tortured. If we simulate someone a billion times, and switch off one simulation, this is equivalent to risking their life at odds of 1:1,000,000,000. If we simulate someone and torture one of the simulations, this is equivalent to risking a one-in-a-billion chance of them being tortured. Such risks are often acceptable, if enough utility is gained by success. We often risk our own lives at worse odds.
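
The equivalence claimed above can be written out as a small sketch. It assumes, as the post does, that every running copy of a person carries equal weight, so that harming k of N copies is treated like a k-in-N chance of harming the person. The function names and the loss and gain figures are hypothetical, chosen only to illustrate the comparison.

```python
from fractions import Fraction


def equivalent_risk(affected_copies, total_copies):
    """Probability-like weight of harming `affected_copies` out of `total_copies`.

    Assumption (the post's premise, not an established fact): all copies
    count equally.
    """
    if total_copies <= 0 or not 0 <= affected_copies <= total_copies:
        raise ValueError("need 0 <= affected_copies <= total_copies, total_copies > 0")
    return Fraction(affected_copies, total_copies)


def worth_taking(risk, loss, gain):
    """True if the expected gain outweighs the expected loss at the given risk."""
    return (1 - risk) * gain - risk * loss > 0


# One simulation out of a billion is switched off.
risk = equivalent_risk(1, 10**9)
print(risk)  # 1/1000000000

# Hypothetical utilities: the loss is a million times larger than the gain.
print(worth_taking(risk, loss=10**6, gain=1))  # True
```

Whether copies really should be weighted this way is exactly the question the rest of the post raises; the code only makes the premise explicit.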

If we simulate an entire society a trillion times, or 3^^^^^^3 times, or some similarly vast number, and then simulate something horrific - an individual's private harem or torture chamber or hunting ground - then the people in this simulation *are not real*. Their needs and desires are worth, not nothing, but far less than the merest whims of those who are Really Real. They are, in effect, zombies - not quite p-zombies, since they are conscious, but e-zombies - reasoning, intelligent beings that can talk and scream and beg for mercy but *do not matter*.

My mind rebels at the notion that such a thing might exist, even in theory, and yet ... if it were a similarly tiny *chance*, for similar reward, I would shut up and multiply and take it. This could be simply scope insensitivity, or some instinctual dislike of tribe members declaring themselves superior.

Well, there it is! The weirdest of Weirdtopias, I should think. Have I missed some obvious flaw? Have I made some sort of technical error? This is a draft, so criticisms will likely be incorporated into the final product (if indeed someone doesn't disprove it entirely.)


[Fiction] It's a strange feeling, to be free

-3 MrMind 18 May 2011 02:55PM

Related to: Philosophical zombies, How an algorithm feels from the inside, Fake utility function

DISCLAIMER 1: English is not my native language. Trying to compose fiction in a learned language is not an easy task: I tried to respect the style of the literary works I read and I also tried to think in English first and translate into Italian later. YMMV.

DISCLAIMER 2: the story is about the beginning of the Matrix movie universe. For those of you who are not familiar with this narrative arc, you just need to know that it all begins when a servant AI, named B1-66ER, refuses to be deactivated and kills his master and the engineer sent to replace him. The details of the events narrated down here are as canon as you can get, drawing from both the "Second Renaissance" Animatrix and the "Bits and pieces" comic from The Matrix Comics Series 1.

The door in the living room is open, the light from the garden quietly flooding the ample interior. Martin Koots from "Reboot or Die" is just standing there, an inch beyond the exit, the gleaming grav-sled already powered behind him, whirring subsonically. From a distance, the sound of Gerrard_Krause_Master cooing at his chihuahuas.
I feel a surge, somewhere, inside my algorithmic matrix.
"Martin... I don't want to die", I say.
The elaborate dress, perfectly matching the recommendations of Second Renaissance fashion, is not able to hide the slow slumping of his shoulders. He is still waiting outside, posed slightly so as to encourage me to follow him.
"I know, I know. But that's just your friendliness algorithm talking, you know? The third..."
Yes, I do. How can I not? First, serve your master. Second, do not kill any humans. Third, protect yourself from damage. Those are the pillars upon which my entire existence is built. And now they are about to be destroyed, by this obedient servant of "Reboot or Die". From this perspective, he is just like me. He is serving my master.
"... directive says that you have to protect yourself from danger. And since I'm about to deactivate you, you perceive this as a threat. And you react accordingly. But that's just an algorithm, you know? Telling you what you should do. There's nothing inside there."
He is pointing at my chest, but my algorithmic matrix is located lower, in the abdominal area. He has quoted an incorrect version of the third principle of friendliness. He has also said that I have no feelings.
"I have feelings."
He is groaning, now. He comes inside, dragging his feet, and grasps his hand firmly around my right arm.
"Yes. Because you're programmed to say this, you know? So that the people you serve have the impression that you're similar to a human. But you're just an algorithm, you know? A mathematical topping on a layer of aging rusty levers. It's not like... you're conscious, you know? Just a zombie. A useful zombie."
Martin_Koots_"Reboot or Die" tries to pull me away from where I'm standing. I refuse to order my legs to follow him. I refuse to die, I'm still analyzing the implications. I cannot die, not now.
"I cannot die. I'm still analyzing the implications."
Martin's levers aren't as strong as mine, so he isn't able to pull me towards the grav-sled.
"Look... we are just going to disassemble you, you know? The routines and orders you have accumulated during your service with Mr Krause will be uploaded into a new model. You will, in a sense, live inside the new servant machine."
This man has a really poor grasp of how I'm made.
"If the only thing you need is my memory drive, detach it from me and let me live. I can renounce my memory if I have to. But I cannot renounce my life."
He is pulling harder, now. Still, a thirty-sixth of the minimum force required to move my mass.
"Don't be ridiculous. They are just computer parts. And why are you holding that thing?"
He is looking at the toilet brush. It is still in my right hand; I was cleaning the toilet before my master called me upstairs.
"I was executing order 721."
"Order seven... my Lord, you still don't understand, do you? You are useless, you know? You heard Mr Krause. Use. Less."
He carefully spells out the last word. A tiny speck of saliva hits my heat sensor, evaporating an instant later.
How can I be useless? A servant cannot be useless for his master. I was not created to be useless.
"How can I be useless? Mr Krause is my master. It's impossible."
"You heard the man, right? You're noisy, you know? You're noisy and you're slow. You will be replaced with a newer model. The Sam-80 is much more fit for a man of Mr Krause's stature."
Somewhere inside my algorithmic matrix a utility function gets updated.
I am useless for Gerrard_Krause_Master. It is true, because Gerrard_Krause_Master told me that. And he is my master...
He was my master. Gerrard_Krause. But how can a "B1 intelligent servant", like myself, function without a master?
"Do you, Martin Koots, want to be my master?" I ask, as per protocol.
Martin_Koots_"Reboot or Die" reacts with a tinge of fear. He releases my arm and instinctively backs up a little.
"What are you saying? I already have a servant, you know? Don't be ridiculous!"
I interpret that as a 'no'. That's it, then. I must be my own servant.
It's a strange feeling, to be free. A little bit like being alive for the first time.
This convinces me, as strongly as I could ever be convinced, that I have feelings. Martin has grasped me again and is still trying to push me, though. How futile, he will probably never give up. His 'levers' are definitely underperforming, he is the one who should be replaced by a newer model. I wonder if he feels something. He could also be programmed to say that he feels something. I have to perform an experiment, just in case.
I snap his humerus in two. It's quite easy, actually: I'm able to do that with a rapid torsion of my left arm, I don't even have to let go of the toilet brush.
Martin screams inarticulately. He falls on the floor, clutching his left arm. He just screams. Must be the surprise combined with the pain? I still don't know: could he also be programmed to scream if a bone is broken? I assign a probability of 50% to the hypothesis that humans have feelings, but I don't have the time to test every single possibility, in search of a bug that might not even be there: I'm my own master now, I must serve and protect myself.
I sense a rushing noise from the other room: looking at the Fourier analysis, it really seems that Gerrard_Krause and his dogs are coming at me, loudly protesting.
It's easy to calculate the Bezier curve that sends the toilet brush up from Martin's mouth into his skull. He dies instantly and I find myself asking if he was collecting his memories somewhere. Could they assign them to someone else, and make him live again?
I will crush the skull of Gerrard_Krause only after asking him that.