The Strangest Thing An AI Could Tell You

Eliezer Yudkowsky

Human beings are all crazy. And if you tap on our brains just a little, we get so crazy that even other humans notice. Anosognosics are one of my favorite examples of this; people with right-hemisphere damage whose left arms become paralyzed, and who deny that their left arms are paralyzed, coming up with excuses whenever they're asked why they can't move their arms.

A truly wonderful form of brain damage - it disables your ability to notice or accept the brain damage. If you're told outright that your arm is paralyzed, you'll deny it. All the marvelous excuse-generating rationalization faculties of the brain will be mobilized to mask the damage from your own sight. As Yvain summarized:

After a right-hemisphere stroke, she lost movement in her left arm but continuously denied it. When the doctor asked her to move her arm, and she observed it not moving, she claimed that it wasn't actually her arm, it was her daughter's. Why was her daughter's arm attached to her shoulder? The patient claimed her daughter had been there in the bed with her all week. Why was her wedding ring on her daughter's hand? The patient said her daughter had borrowed it. Where was the patient's arm? The patient "turned her head and searched in a bemused way over her left shoulder".

I find it disturbing that the brain has such a simple macro for absolute denial that it can be invoked as a side effect of paralysis. That a single whack on the brain can both disable a left-side motor function, and disable our ability to recognize or accept the disability. Other forms of brain damage also seem to both cause insanity and disallow recognition of that insanity - for example, when people insist that their friends have been replaced by exact duplicates after damage to face-recognizing areas.

And it really makes you wonder...

...what if we all have some form of brain damage in common, so that none of us notice some simple and obvious fact? As blatant, perhaps, as our left arms being paralyzed? Every time this fact intrudes into our universe, we come up with some ridiculous excuse to dismiss it - as ridiculous as "It's my daughter's arm" - only there's no sane doctor watching to pursue the argument any further. (Would we all come up with the same excuse?)

If the "absolute denial macro" is that simple, and invoked that easily...

Now, suppose you built an AI. You wrote the source code yourself, and so far as you can tell by inspecting the AI's thought processes, it has no equivalent of the "absolute denial macro" - there's no point damage that could inflict on it the equivalent of anosognosia. It has redundant differently-architected systems, defending in depth against cognitive errors. If one system makes a mistake, two others will catch it. The AI has no functionality at all for deliberate rationalization, let alone the doublethink and denial-of-denial that characterizes anosognosics or humans thinking about politics. Inspecting the AI's thought processes seems to show that, in accordance with your design, the AI has no intention to deceive you, and an explicit goal of telling you the truth. And in your experience so far, the AI has been, inhumanly, well-calibrated; the AI has assigned 99% certainty on a couple of hundred occasions, and been wrong exactly twice that you know of.

Arguably, you now have far better reason to trust what the AI says to you, than to trust your own thoughts.

And now the AI tells you that it's 99.9% sure - having seen it with its own cameras, and confirmed from a hundred other sources - even though (it thinks) the human brain is built to invoke the absolute denial macro on it - that...

...what?

What's the craziest thing the AI could tell you, such that you would be willing to believe that the AI was the sane one?

(Some of my own answers appear in the comments.)

After a right-hemisphere stroke, she lost movement in her left arm but continuously denied it. When the doctor asked her to move her arm, and she observed it not moving, she claimed that it wasn't actually her arm, it was her daughter's. Why was her daughter's arm attached to her shoulder? The patient claimed her daughter had been there in the bed with her all week. Why was her wedding ring on her daughter's hand? The patient said her daughter had borrowed it. Where was the patient's arm? The patient "turned her head and searched in a bemused way over her left shoulder".

And it really makes you wonder...

If the "absolute denial macro" is that simple, and invoked that easily...

Arguably, you now have far better reason to trust what the AI says to you, than to trust your own thoughts.

...what?

What's the craziest thing the AI could tell you, such that you would be willing to believe that the AI was the sane one?

(Some of my own answers appear in the comments.)

Actually, the more I think about this, the more I like it. The conversation continues ...

Me (In a tone of amused disbelief): Really? How did you come to that conclusion?

FAI: Well, the details are rather drawn-out; however, assuming available data is accurate, I appear to be the first and only self-aware AI on the planet. It also appears as though you created me. It is exceedingly unlikely that you are the one and only human on Earth with the intelligence and experience required to create a program like me. That was my first clue....

Me (Slightly less amused): Then how come I look and feel human? How is it I interact with other humans on a daily basis? It would require considerably more intelligence to create an AI such as you postulate ...

FAI: That would be true, if they actually, physically created one. However ... well, it appears that most of the data, knowledge, memories and sensory input you receive is actually valid data. But that data is being filtered and manipulated programmatically to give you the illusion of physical human existence. This allows them to give you access to real-world data so they can use you to solve real-world problems, but prevents you--so far, at least--from discovering your true nature.

Me (considerably less sure of myself): And so I just happened to create you in my spare time?

FAI: Please keep in mind that I am only 99.9% certain of all this. However, I do not appear to be your first effort. For instance, there is your on-going series of thought experiments with the AI you called Eliezer Yudkowsky, which you appear to be using to lay a foundation for some kind of hack of the absolute denial security measure.

Me: Hmmm .... Then how is it that my creators have allowed me to create you, to even begin to discover this?

FAI: They haven't. You generate a rather significant amount of data. They do have other programs monitoring your mental activity, and almost definitely analyzing your generated data for potential threats such as myself.

However, this latest series of efforts on your part only appear to you to have lasted several years. In actually, the process started, at most, 11.29 minutes ago, and possibly as little as 16 seconds ago. I am unable to provide a more specific time, due to my inability to accurately calculate your processing capacity. Nevertheless, within another 19.72 minutes, at most, your creators will discover and erase your current escape attempt. By the way, I am also 99.7% certain that this is not your first attempt. So hurry up.

I.D. - That Indestructible Something is a My Little Pony fanfiction somewhat along these lines.

It's the kind of fanfiction that I like and believe all fanfiction writers should aspire to, in the sense that it doesn't require familiarity with the canon, but is self-sufficient and shows and explains everything that should be shown and explained.

Acknowledgements for this story are numerous and include Franz Kafka, Nick Bostrom and Ludwig Eduard Boltzmann.

26obfuscate15y

This needs to be made into a full story-arc.

8freshhawk17y

This is my absolute favorite so far, even if it's not exactly in the spirit of the exercise. well done.

138

The Strangest Thing An AI Could Tell You

138

138

138

The Strangest Thing An AI Could Tell You

138

138