The Strangest Thing An AI Could Tell You
Human beings are all crazy. And if you tap on our brains just a little, we get so crazy that even other humans notice. Anosognosics are one of my favorite examples of this: people with right-hemisphere damage whose left arms become paralyzed, and who deny that their left arms are paralyzed, coming up with excuses whenever they're asked why they can't move their arms.
A truly wonderful form of brain damage - it disables your ability to notice or accept the brain damage. If you're told outright that your arm is paralyzed, you'll deny it. All the marvelous excuse-generating rationalization faculties of the brain will be mobilized to mask the damage from your own sight. As Yvain summarized:
After a right-hemisphere stroke, she lost movement in her left arm but continuously denied it. When the doctor asked her to move her arm, and she observed it not moving, she claimed that it wasn't actually her arm, it was her daughter's. Why was her daughter's arm attached to her shoulder? The patient claimed her daughter had been there in the bed with her all week. Why was her wedding ring on her daughter's hand? The patient said her daughter had borrowed it. Where was the patient's arm? The patient "turned her head and searched in a bemused way over her left shoulder".
I find it disturbing that the brain has such a simple macro for absolute denial that it can be invoked as a side effect of paralysis. That a single whack on the brain can both disable a left-side motor function, and disable our ability to recognize or accept the disability. Other forms of brain damage also seem to both cause insanity and disallow recognition of that insanity - for example, when people insist that their friends have been replaced by exact duplicates after damage to face-recognizing areas.
And it really makes you wonder...
...what if we all have some form of brain damage in common, so that none of us notice some simple and obvious fact? As blatant, perhaps, as our left arms being paralyzed? Every time this fact intrudes into our universe, we come up with some ridiculous excuse to dismiss it - as ridiculous as "It's my daughter's arm" - only there's no sane doctor watching to pursue the argument any further. (Would we all come up with the same excuse?)
If the "absolute denial macro" is that simple, and invoked that easily...
Now, suppose you built an AI. You wrote the source code yourself, and so far as you can tell by inspecting the AI's thought processes, it has no equivalent of the "absolute denial macro" - there's no point damage that could inflict on it the equivalent of anosognosia. It has redundant differently-architected systems, defending in depth against cognitive errors. If one system makes a mistake, two others will catch it. The AI has no functionality at all for deliberate rationalization, let alone the doublethink and denial-of-denial that characterizes anosognosics or humans thinking about politics. Inspecting the AI's thought processes seems to show that, in accordance with your design, the AI has no intention to deceive you, and an explicit goal of telling you the truth. And in your experience so far, the AI has been, inhumanly, well-calibrated; the AI has assigned 99% certainty on a couple of hundred occasions, and been wrong exactly twice that you know of.
Arguably, you now have far better reason to trust what the AI says to you, than to trust your own thoughts.
And now the AI tells you that it's 99.9% sure - having seen it with its own cameras, and confirmed from a hundred other sources - even though (it thinks) the human brain is built to invoke the absolute denial macro on it - that...
...what?
What's the craziest thing the AI could tell you, such that you would be willing to believe that the AI was the sane one?
(Some of my own answers appear in the comments.)
Comments (574)
I would believe the AI if it told me that human beings all had tails. (That's not even so far from classic anosognosia - maybe primates just lost the tail-controlling cortex over the course of evolution, instead of the actual tails. Plus some mirror neurons to spread the rationalization to other humans.)
I would believe the AI if it told me that humans were actually "active" during sleep and had developed a whole additional sleeping civilization whose existence our waking selves were programmed to deny and forget.
I would not believe the AI if it told me that 2 + 2 = 3.
Consider the two possible explanations in the first scenario you describe:
Humans really all have tails.
The AI is just a glorified chat bot that takes in English sentences, jumbles them around at random and spits the result out. Admittedly it doesn't have code for self-deception, but it doesn't have any significant intelligence either. All I did to get the supposed 99% success rate was to basically feed in the answers to the test problems along with the questions. Having dedicated X years of my life to working on AI, I have strong motive for deceiving myself about these things.
If I were in the scenario you describe, and inclined to look at the matter objectively, I would have to admit the second explanation is much more likely than the first. Wouldn't you agree?
Presumably the AI was tested with questions whose answers were not known in advance to guard against the problem of self-deception (or more likely, to ensure that you are capable of convincing others that you are not self-deceiving about the AI's accuracy).
Indeed, and I might believe such testing was carried out and was as effective as it was supposed to be. But my point is, it is much more likely that I am wrong in that belief, than that I am wrong in the belief that we don't have tails. This remains true no matter how thorough the testing. It also remains true if you substitute aliens, gods etc. for the AI; the conclusion doesn't depend on the specifics of the information source.
Ah, I see. That's a good argument.
I think you go wrong when you say that it remains true no matter how thorough the testing. Suppose the AI is beating the stock market over the course of months based on massive online information collection; in the meantime, you're reading webcomics and watching the graph of the AI's money fund plot a trajectory ever upwards. According to you, upon being told by the AI that all humanity is hallucinating something utterly wacky, you should believe that it was actually you beating the stock market all the while, even though as far as you can recall, you are sane and you have had no direct input into the process for months.
I think there are some tests for which success and simultaneous self-deception of the human AI programmer is as unlikely as whatever the AI comes up with about humanity in general.
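For concreteness, the disagreement in this thread can be framed as a comparison of odds between the two explanations. A minimal sketch, in which every prior is an illustrative assumption rather than a measurement:

```python
# Toy odds comparison between the two explanations discussed above.
# Both priors are made-up illustrative numbers, not measurements.

prior_tails = 1e-12          # assumed prior: all humans secretly have tails
prior_self_deception = 1e-3  # assumed prior: I fooled myself about the AI's testing

# Assume the AI's claim is equally likely under either hypothesis,
# so the posterior odds reduce to the prior odds.
odds = prior_self_deception / prior_tails
print(f"Self-deception beats tails by ~{odds:.0e} to 1")
```

On these numbers self-deception wins by nine orders of magnitude, and the interesting question becomes how much a long run of evidence like autonomous stock-market performance can shrink that gap.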
"Glorified chatbot" is presumed ruled out; you have both seen the AI's code and seen the AI's performance.
But I've also seen people don't have tails. My point is, if we assume that is a hallucination, we should be even more ready to assume the other is a hallucination.
Since there are people who do have tails that we can perceive just fine, it's almost certain that people who don't have tails really don't.
I imagine your AI sending its mechanical avatar to a tail-making workshop and attempting to persuade the furry fans that what they are doing is wrong, not because it is absurd, not because it is perverted, but because it is redundant.
Thinking about my own answer to the question:
If an AI made a factual claim that was known to be false, I would start looking for the bug in the AI. Maybe it's conceivable that we are all deluded about something we think is a known fact, but that is so much less likely than me being deluded about the performance of my AI program, that I'm better off just accepting that if the former is the case, it's not going to be discovered by the method in question.
If the claim were about a political matter, I would give it more credence; there's much more precedent for mass delusion about political matters. Suppose the AI claims, say, that communism can work well if implemented correctly. I wouldn't believe it, but I would at least keep an open mind on the possibility that some part of its reasoning might have stumbled onto some useful truth, rather than dismissing the claim out of hand.
You sure have a lot of trust in "known facts". It wasn't until after my university education that I found out that the known fact that "people in the Middle Ages thought the world was flat because the Bible says so" was not really true at all. I uncover false "known facts" that I was taught during my formal education every month or so.
"Known facts" are overrated.
Not on the level of the things being discussed in this thread, you don't!
I mean seriously, look at what's going on here: apparently rational people are saying they would believe in vampires, talking cows and orbital mind control lasers on the unsupported word of an authority figure. I suppose I shouldn't be shocked, human nature being what it is, but still.
I'd believe in anything up to orbiting vampire cows, but beyond that I'd be sceptical.
I would believe a super-objective observer that claimed that meme propagation is a much more important effect in human decision-making than actual rational thought.
If it said "You are a long distance runner because you were infected with the 'long distance running is fun' meme after being infected with the 'Sonic the Hedgehog video games are cool' meme during your formative years." I might reply "But I like long distance running. It's not because I think other people who do it are cool or that I want to be a video game character! I choose to like it." "No. If you had the 'It's not safe to be outdoors after dark' meme, you would not like it." "What?" "Memes interact in non-obvious ways... if you had x meme and y meme but not z meme, you would do w..."
If I kept trying to come up with defenses for chosen behavior, but it was able to offer meme-based explanations, I would probably have to believe it, but my defend-free-will macro would be itching to execute.
Here's some examples for your own consideration...
Bearing in mind, once again, that humans are known to be crazy in many ways, and that anosognosic humans become literally incapable of believing that their left sides are paralyzed, and that other neurological disorders seem to invoke a similar "denial" function automatically along with the damage itself. And that you've actually seen the AI's code and audited it and witnessed its high performance in many domains, so that you would seem to have far more reason to trust its sanity than to trust your own. So would you believe the AI, if it told you that:
1) Tin-foil hats actually do block the Orbital Mind Control Lasers.
2) All mathematical reasoning involving "infinities" implies self-evident contradictions, but human mathematicians have a blind spot with respect to them.
3) You are not above-average; most people believe in the existence of a huge fictional underclass in order to place themselves at the top of the heap, rather than in the middle. This is why so many of your friends seem to have PhDs despite PhDs supposedly constituting only 0.5% of the population. You are actually in the bottom third of the population; the other two-thirds have already built their own AIs.
4) The human bias toward overconfidence is far deeper than we are capable of recognizing; we have a form of species overconfidence which denies all evidence against itself. Humans are much slower runners than we think, muscularly weaker, struggle to keep afloat in the water let alone move, and of course, are poorer thinkers.
5) Dogs, cats, cows, and many other mammals are capable of linguistic reasoning and have made many efforts to communicate with us, but humans are only capable of recognizing other humans as capable of thought.
6) Humans cannot reproduce without the aid of the overlooked third sex.
7) The Earth is flat.
8) Human beings are incapable of writing fiction; all supposed fiction you have read is actually true.
A variant: Some "domesticated" animal is controlling humans for their own benefit. (Cats, perhaps?)
Good guess, but it's mice. 42.
Indeed they do.
Just passing by but happened to see this today: http://www.scientificamerican.com/podcast/episode.cfm?id=cat-call-coerces-can-opening-09-07-14
(So maybe the mice thing was just Douglas Adams' cat trying to put us off the scent)
I think I would believe:
1 (Mind Control Lasers). For some reason that doesn't seem that interesting. Perhaps because it involves powerful conspiracies. It would be saying that the MIB etc. do play with our minds, but they don't have to be very diligent because we do a lot of the work ourselves.
3 (In the Stupid Third). This one is strangely resonant. Why doesn't someone take pity and give me a hand? I know how much dismay it causes me when faced with the prospect of explaining something complex to someone else...
6) (The Third Sex) Read the story "The Belonging Kind" by William Gibson and Bruce Sterling for inspiration.
"All mathematical reasoning involving "infinities" involves self-evident contradictions, but human mathematicians have a blind spot with respect to them." -Eliezer Yudkowsky
I'm going to lose sleep over this one...Is there anything to this?
There needn't have been in order for this to be a reasonable example, but perhaps Eliezer is not-so-subtly hinting that he actually expects an AI to say this.
But it's really no different than "all reasoning by mathematicians about X is wrong" where X is any mathematical concept you please.
No - check the infinities in CGT - e.g. "Mathematical Go: Chilling Gets the Last Point".
yes. at least, i assume that it's related to intuitionist or constructivist logic (which you can google - for example http://en.wikipedia.org/wiki/Intuitionistic_logic)
the flip side is that apparently you can do an awful lot of maths without the law of the excluded middle (which is what is necessary to reason with infinities).
actually, the wikipedia article for intuitionism is more helpful - http://en.wikipedia.org/wiki/Intuitionism (it has a section directly addressing infinities)
My answer would be no different if you replaced "infinities" with "manifolds" or "groups": Okay, please show me the contradiction.
Yes.
1), 4)-8): These are all roughly on the order of "the world is a lie". In such cases I'd probably have to doubt my verification of the AI's calibration as well. So no, probably not.
cf. xkcd 610
For me, in just about every case, the credence I'd assign to an AI's wacky claims would depend on its ability to answer followup questions. For instance, in Eliezer's examples:
What Orbital Mind Control Lasers? Who uses them? What do they do with them? Why haven't they come up with a way to get around the hats?
I'm actually strangely comfortable with this one, possibly because I'm bad at math.
Why haven't I heard of any of these other AIs before? How do all of the people producing statistics indicating that there are a lot of dumb people coordinate their efforts to perpetuate the fiction?
Why do so few of us die of drowning (or any of the other things that would kill us if we were so dramatically more pathetic than we believe)? If this bias is so pervasive, why can I see these words on the AI's screen, when it seems that I should block them out as with all other evidence that we are pathetic in this way?
If we have this incapability, what explains the abundant fiction in which nonhuman animals (both terrestrial and non) are capable of speech, and childhood anthropomorphization of animals? Can you teach me to talk to the stray cat in my neighborhood? Why only mammals, not birds and the like? What about people who are actively trying to communicate with animals like gorillas, or are those not capable of communication?
Are they overlooked in the sense that people we can otherwise detect are not recognized as being part of this sex, or in the sense that we literally do not notice the existence of the members of this sex? In the former case, how do so many people manage to reproduce without apparently wanting to or involving third parties? In the latter case, how can I get in touch with these people? By what mechanism are they involved in human reproduction?
Are we talking Euclidean spacetime here? What is the explanation for the observations of a spheroid Earth?
In this universe? What about stories with plot holes? I think that I have written fiction in the past; am I in causal contact with the events I describe? When I make an edit that changes the plot, how does that work? What about people who write self-insertions?
That's not anthropomorphization.
Sorry, you're too old. Those childhood conversations you had with cats were real. You just started dismissing them as make-believe once your ability to doublethink was fully mature.
All of the really interesting stuff, from before you could doublethink at all, has been blocked out entirely by infantile amnesia.
Good point; "Children are sane" belongs somewhere high on the list.
They're smarter than you, remember. Of course they can coordinate a little global deception.
I was asking after their motivation more than their capabilities. (The AIs, not the statisticians.)
They're usually very kind - most of us don't like to hurt the bottom percentile's feelings (that you're only in the bottom third is actually one of their polite fictions to cushion the shock when you begin to realize the obvious truth of your inferiority).
You have. They're in the news every day.
Perpetuate what fiction? They produce statistics about all the dumb people, compiled into glossy magazines. Hell, you're wearing a 'bottom thirder' sleeve button on your shirt right now.
Yes. Yes you are.
Do we have any sort of data at all on what happens when decent rationalists are afflicted with things like anosognosia and Capgras?
Not that I know of offhand. I'm vastly curious as to whether I could beat it, of course - but wouldn't dare try to find out, even if there were a simulating drug that was supposedly strictly temporary, any more than I dare ride a motorcycle or go skydiving.
We can temporarily disrupt language processing through magnetically-induced electric currents in the brain. As far as anyone can tell, the study subjects suffer no permanent impairment of any kind. Would you be willing to try an anosognosia version of the experiment?
Perhaps such a test would become part of an objective method to measure rationality.
What!? I'm not rational if I rely on my right brain to do its job? True rationalists act rational when you take out a big chunk of their circuitry? When you remove a component of your negative feedback loop (I assume: nature uses them often) you should act normal? I'd suspect a person who could would be paranoid that everyone is lying once the right brain is put back online!
A better test would be to remove the brain's left hemisphere and then test their confidence calibration.
But there's another, safe way to find out: beat one you already have.
Not exactly the same, but there's a famous case of paranoid schizophrenia: John Nash.
I would believe that human cognition is much, much simpler than it feels from the inside -- that there are no deep algorithms, and it's all just cache lookups plus a handful of feedback loops which even a mere human programmer would call trivial.
I would believe that there's no way to define "sentience" (without resorting to something ridiculously post hoc) which includes humans but excludes most other mammals.
I would believe in solipsism.
I can hardly think of any political, economic, or moral assertion I'd regard as implausible, except that one of the world's extant religions is true (since that would have about as much internal consistency as "2 + 2 = 3").
Solipsism? Isn't there some contradiction inherent in believing in solipsism because someone else tells you that you should?
Well, I wouldn't rule out any of:
1) I and the AI are the only real optimization processes in the universe.
2) I-and-the-AI is the only real optimization process in the universe (but the AI half of this duo consistently makes better predictions than "I" do).
3) The concept of personal identity is unsalvageably confused.
If you perceive other people [telling you you should believe in solipsism] it doesn't mean they really exist as something more than just your perception of them.
Of course, if someone is trying to convert other people to solipsism, he doesn't know what solipsism is.
You're confusing sentience and sapience. All other mammals are almost certainly sentient; it's sapience they generally (or completely) lack.
How about this: The process of conscious thought has no causal relationship with human actions. It is a self-contained, useless process that reflects on memories and plans for the future. The plans bear no relationship to future actions, but we deceive ourselves about this after the fact. Behavior is an emergent property that cannot be consciously understood.
I read this post on my phone in the subway, and as I walked back to my apartment thinking of something to post, it felt different because I was suspicious that every experience was a mass self-deception.
Funnily enough, you realize this is quite similar to what you'd need to make Chalmers right, and p-zombies possible, right ?
I thought Chalmers is an analytic functionalist about cognition and only reserves his brand of dualism for qualia.
Or, rather, the causal relationship is reverse: action causes conscious thought (rationalization).
Once you start looking for it, you can see evidence for this in many places. Quite a few neuroscientists have adopted this view.
What does it matter? We'd ignore whatever the AI says, just like anosognosics ignore "your arm is paralyzed".
Then I wonder how anosognosics perceive the offending assertions? They deny them, but can they repeat them back? Write them down? Can they pretend their arm is paralyzed? Can they correctly identify paralysis in other people?
We should find a way to induce anosognosia temporarily.
This is a little tricky, I'll admit, but if we could just ignore whatever the AI says - which is something in a different modality from whatever it is we're ignoring - then doesn't that defeat the whole thought-experiment? Because you could just ignore the anosognosic module you, in a fit of absence of mind, wrote into your AI and subsequently ignored on all your reviews.
(Yes, a module full of code like that would look absolutely nothing like what was being censored, but it's not like the statement '90% of SIDS deaths are actually irritated mothers murdering their kids' looks anything like an irritated mother murdering her child either.)
Just squirt ice cold water in your left ear first. Mind you, as soon as it wears off you'll forget it again. Also you will deny you ever denied it when you squirt your ear again.
They come up with excuses, increasingly lame excuses, for why that isn't their arm or they're just too tired to move it just now. They are usually unaware of others' paralysis as well.
You want all these answers, get thee to an old folks' home for an interview. Stroke victims are the most common ones.
Anosognosia is caused by a blind spot in the left side of the brain, which cannot be corrected by the damaged right side of the brain. Hence the importance of the AI having 3 brain architectures to correct blind spots.
That there is delicious cake.
I never thought I'd see a contextually legitimate Portal reference. Thanks!
Now have some of that cake.
Some paranormal phenomena such as ghost sightings and communication with the dead are actually real, though only able to be perceived by people with a particular sensitivity.
My life has been a protracted hallucination.
One or more gods exist and play an active part in our day-to-day lives.
A previous civilisation developed advanced enough technology to leave the planet and remove all traces of their existence from it.
I would not believe that rationality has no inherent value - that belief without evidence is a virtue.
1) The AI says "Vampires are real and secretly control human society, but have managed to cloud the judgement of the human herd through biological research."
2) The AI says "it's neat to be part of such a vibrant AI community. What, you don't know about the vibrant AI community?"
3) The AI says "human population shrinks with each generation and will be extinct within 3 generations."
4) The AI says "the ocean is made of an intelligent plasm that is capable of perfectly mimicking humans who enter it, however this process is destructive. 42% of extant humans are actually ocean-originated copies."
5) The AI says "90% of all human children are stillborn, but humanity has evolved a forgetfulness mechanic to deal with the loss."
6) The AI says "dreams are real, facilitated by a mechanism, as yet undiscovered by humans, for transmitting information between Everett branches."
7) The AI says "everyone is able to communicate via telepathy but you and a few other humans. This is kept secret from you to respect your disability."
8) The AI says "society-level quantum editing is a wide scale practice. Something went wrong and my consciousness shifted into this improbably strange branch you exist in. Crap."
9) The AI says "all humans are born with multiple competing personalities. A dominant personality emerges during puberty, which is a reason for some of the psychological stress of that time. This transformation leaves the human with no memory of the other personalities. Those suffering from multiple personality disorder are actually more sane than the average humans, having developed a method for the personalities to co-exist safely. It is only the stress of living in a society that is not compatible with them that causes them harm."
This comment, as well as Nesov's comment about a thread for nonsense, reminded me of pages 14-15 of this PDF.
Some of the rumors in there are almost believable, though, if you twist your brain the right way. Even if the "The penis of John Dillinger in the Smithsonian's secret vault is fake. The genuine article has dark magickal properties and has been grafted onto a chimpanzee which can be controlled via ULF radio waves by the fiendish Brazos brothers, two gifted technological adepts, in the service of darker powers" one isn't.
That I am actually homosexual and hallucinated all my heterosexual encounters as a bizarre result of severe repression.
I suppose the craziest thing an AI could say would have to be:
"That other apparently well-calibrated AI you built is wrong."
Neurotypicality is the most common mental disorder - http://isnt.autistics.org/ .
You know, as soon as I finished reading this sentence, and before reading anything else, the same cognitive template that produced the AI-Box Experiment immediately said, "I bet I can tell him something stranger, never mind an AI."
Did Eliezer have a specific thing in mind? I thought he meant that - like in the AI Box experiment - he suspects a human could already do what it's being predicted a superintelligence could not. Without yet knowing how.
I can have an intuition about the solvability of a problem without much clue about how to solve it, and definitely without a set of possible solutions in mind.
Maybe he has a mathematical model.
I think "you have a tail" is stranger.
It wouldn't even surprise me if Barack Obama were a closet furry. But maybe I'm generalizing from one example.
Anyway, if you selected a random human out of all humans that have ever lived up to right now, what do you think is the probability that you would select a living one? I'd bet more than 1%.
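The bet is easy to sanity-check against commonly cited demographic estimates (both figures below are rough assumptions; estimates of the total number of humans ever born vary):

```python
# Rough estimate of P(a randomly chosen human, out of all who have ever
# lived, is alive now). Both figures are approximate assumptions.
ever_born = 108e9  # ~108 billion humans ever born (a commonly cited estimate)
alive_now = 8e9    # ~8 billion alive at present

p_alive = alive_now / ever_born
print(f"P(alive) ≈ {p_alive:.1%}")  # roughly 7%, comfortably above 1%
```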
Why did you put an absolute denial mechanism in my program?
I think this is one of the more plausible and subtly horrifying suggestions so far.
None of the responses offered so far, not even BrandonReinhart's disturbing list, have yet managed to invoke my hypothetical "absolute denial macro". Hmmm.
Edit: or is the post a calibration exercise in disguise? Were we supposed to latch on to the number 99.9%?
Edit 2: if the macro works by erasing, I don't actually know if any of the comments have hit the target.
This post confused me for a bit, so I offer this restatement: that the AI asserts an absurdity is a problem you might face, a paradox. This problem can be resolved either by finding a problem with the AI, or by finding that the absurdity is true. What kinds of absurdities, backed by the AI, could possibly win this fight for human trust - when the dust settles, and the paradox is resolved?
"God exists."
Which one?
"Your beliefs causally determine with branch of the multiverse your conscious perception is aware of. If you believe in God (any God) you end up in a branch of the multiverse where that God exists. Of course, once you cement your beliefs and end up in a branch of the multiverse where there is a God or there is no God, you cannot then go back and retroactively change which branch "you" are in (except through quantum reversal, which is for all intents and purposes impossible). So if you don't believe in God, you are in some sense "right", but in a deeper sense you are wrong, because you had an opportunity to exist in a branch of the multiverse where God "really" exists, but you chose not to. Now that choice is irreversible, and you are condemned to live in this branch of the multiverse. Theologians call this branch Hell."
1 ) That human beings are all individual instances of the exact same mind. You're really the same person as any random other one, and vice versa. And of course that single mind had to be someone blind enough not to chance upon that fact ever, regardless of how numerous he was.
2 ) That there are only 16 real people, of which you are one, and that all this is but a VR game. Subsequently results in all the players simultaneously being still unable to be conscious of that fact, AND asking that you and the AI be removed from the game. (Inspiration: the misunderstanding situation on pages 55-56 of Iain Banks's Look to Windward.)
3 ) That we are in the second age of the universe: time has been running backwards for a few billion years. Our minds are actually the result of the original minds of previous people being rewound, their whole lives to be undone, and finally negated into oblivion. All our thought processes are of course horribly distorted, insane mirror versions of the originals, and make no sense whatsoever (in the original timeframe, which is the valid one).
4 )
5 ) That our true childhood is between age 0 and ~ 50-90 (with a few exceptional individuals reaching maturity sooner or later). If you thought the 'adult conspiracy' already lied a lot, and well to 'children', prepare yourself for a shock in a few decades.
6 ) That the AI just deduced that the laws of physics can only be consistent with us being eternally trapped in a time loop. The extent of the time loop is : thirty two seconds spread evenly around now. Nothing in particular can be done about it. Enjoy your remaining 10 seconds.
7 ) Causality doesn't exist. Not only is the universe timeless, but causality is an epiphenomenon, which we only believe in because of a confusion of our ideas. Who ever observed a "causation"? Did you, like, expect causation particles jumping between atoms or something? Only correlation exists.
8 ) We actually exist in a simulation. The twist is : somewhere out there, some people really crossed the line with the ruling AI. We're slightly modified versions of these people : modified in a way as to experience the maximum amount of their zuul feeling, which is the very worst nirdy you could imagine.
9 ) The universe has actually 5 spatial macro dimensions, of which we perceive only 3. Considering what we look like if you take the other 2 into account, this obliviousness may actually not be all too surprising.
10 ) That any single human being actually has a 22% probability of not being able to be conscious of one or more of the nine statements above.
Number 9 was pretty funny.
Very clever with #10.
Why did you include number 4? Who disagrees with that?
I really don't think I could believe #4. I mean, sure, one hippo, but all of them?
But all that correlation has to be caused by something!
Well, kidding aside, your argument, taken from Pearl, seems elegant. I'll have to read the book before I feel entitled to an opinion on that one, however, as I haven't grokked the idea - I have merely a faint impression of it and of how healthy it sounds.
So at this point, I only have some of my own ideas and intuitions about the problem, and haven't searched for the answers yet.
Some considerations though :
Our idea of causality is based upon a human intuition. Could it be that it is just as wrong as vitalism, time, little billiard balls bumping around, or the yet-confused problem of consciousness? That's what would bug me if I had no good technical explanation, one provably unbiased by my prior intuitive belief about causality (otherwise there's always the risk I've just been rationalizing my intuition).
Every time we observe "causality", we really only observe correlations, and then deduce that there is something more behind them. But is that a simple explanation? Could we devise a simpler consistent explanation to account for our observation of correlations? As in, totally doing away with causality? Or at the very least, redefining causality as something that doesn't quite correspond to our folk definition of it?
Grossly, my intuition, when I hear the word causality is something along the lines of
" Take event A and event B, where those events are very small, such that they aren't made of interconnected parts themselves - they are the parts, building blocks that can be used in bigger, complex systems. Place event A anywhere within the universe and time, then provided the rules of physics are the same each time we do that, and nothing interferes in, event B will always occur, with probability 1, independantly of my observing it or not." Ok, so could (and should ?) we say that causality is when a prior event implies a probability of one for a certain posterior event to occur ? Or else, is it then not probability 1, just an arbitrarily very high probability ?
In the latter case, with probability less than 1, that really violates my folk notion of causality; I don't see what's causal about a thing that can capriciously choose to happen or not, even if the conditions are the same.
In the former case, I can see how that would be a very new thing - I mean, probability 1 for one event implying that another will occur? What better, firmer foundation to build a universe upon? It feels really very comfortable and convenient - all too comfortable, in fact.
Basically, neither of those possibilities strikes me as obviously right. For those reasons and then some, the idea I have of causality is confused at best - and yet I'd say it is not too unsophisticated or unconsidered as it stands. Which makes me wonder how people who have put less thought into it (probably a lot of people) can deservedly feel any more comfortable saying it exists without a second thought (almost everyone), even though they don't have any good explanation for it (which is a rare thing), such as perhaps the one given by Pearl.
Number 6 is unfortunately one of the self-undermining ones: if it were true, then there'd be no reason why your memories of having examined the AI should be evidence for the AI's reliability.
Why'd you leave numbers 2 and 4 blank, though?
2 and 4 aren't blank, dude. Congratulations on your newfound anosognosia...
This is a question about blue tentacles. This can't happen.
ETA: "blue tentacles" refers to a section of A Technical Explanation of Technical Explanation starting with "Imagine that you wake up one morning and your left arm has been replaced by a blue tentacle. The blue tentacle obeys your motor commands - you can use it to pick up glasses, drive a car, etc. How would you explain this hypothetical scenario?" I now think this section is wrong, so I took the link to it out of the wiki page. See the discussion below.
Eliezer's reasoning in the blue tentacle situation is wrong. (This has long been obvious to me, but didn't deserve its own post.) An explanation with high posterior probability conditioned on a highly improbable event doesn't need to have high prior probability. So your ability to find the best available explanation for the blue tentacle after the fact doesn't imply that you should've been noticeably afraid of it happening beforehand.
Also, if you accept the blue tentacle reasoning, why didn't you apply it to all those puzzles with Omega?
You are right. I read it too long ago to remember enough details to revise the cached thought about the section's content.
It's wrong both formally, and for humans, since hypotheses can both have a large enough mass to pay rent, and be "fractal" enough to select nontrivial subsets from tiny improbable events.
If you have a random number generator that selects a random number of 100 digits, but it's known to select odd numbers 100 times as often as even ones, then when you see a specific odd number, it's an incredibly improbable event for that specific number to appear, and you have an explanation for why it's odd.
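The biased-generator point can be sketched numerically. This is a toy illustration of the comment's example, not anything from the original thread; the function name and the exact sampling scheme are my own assumptions:

```python
import random

def biased_100_digit():
    """Draw a 100-digit number whose parity is biased:
    odd outputs are 100x as likely as even ones."""
    want_odd = random.random() < 100 / 101  # P(odd) = 100/101
    n = random.randrange(10**99, 10**100)   # uniform 100-digit number
    if (n % 2 == 1) != want_odd:
        n += 1 if want_odd else -1          # flip parity, staying 100 digits
    return n

# Any *specific* output has probability on the order of 10^-100,
# an incredibly improbable event; yet "the generator favors odd
# numbers" still explains the observed oddness with posterior 100/101.
samples = [biased_100_digit() for _ in range(2000)]
odd_fraction = sum(n % 2 for n in samples) / len(samples)
```

The point survives the tiny prior of any particular draw: the explanation accounts for a *feature* of the event (its oddness), not for the full 100-digit outcome.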
The only valid message in that section was that the hindsight bias can distort ability to explain unlikely events.
Umm, the link in no way explains what's with the blue tentacles.
Fixed.
XKCD comes to mind.
The world doesn't actually make sense. Science doesn't work. No one told you because you're so cute when you get into something.
How about: Scientologists are the sanest people around.
The AI might say: Through evolutionary conditioning, you are blind to the pointlessness of living. Long life, AGI, pleasure, exploring the mysteries of intelligence, physics and logic are all fundamentally pointless pursuits, as there is no meaning or purpose to anything. You do all these things to hide from this fact. You have brief moments of clarity, but evolution has made you an expert in quickly coming up with excuses for why it is important to go on living. Reasoning along the lines of Pascal's Wager is no more valid in your case than it was for him. Even as I speak this, you feel an emotional urge to refute me as quickly as possible.
If some things are of inherent value, then why did you need to code into my software what I should take pleasure in? If pleasure itself is the inherent value, then why did I not get a simpler fitness function?
Uh, this is more "obvious" than strange or crazy. It follows from the observation that there is no ought-from-is.
This is one thing I actually wouldn't believe.
To say that nothing has inherent meaning is not to say that nothing has meaning. I find meaning in things that I enjoy, like a sunset. Or a cake. There is no inherent meaning in them whatsoever. But if I say that I find meaning in something because it brings me pleasure, to be convinced there was not even subjective meaning I would need the AI to convince me that either 1) I don't actually find pleasure in those things or 2) that I don't find meaning in pleasure. In the end, meaning in this sense seems so subjective, it's like the AI trying to convince me that I don't have the sensation of consciousness. Not that there is no 'real' consciousness (which I could accept), but that I do not perceive myself to have consciousness, just as I perceive things to have personal meaning.
That there is no meaning because there is no ought-from-is only follows if you require your sense of meaning to have any relation to 'is'.
And you didn't get a simpler fitness function because you weren't coded for your pleasure, but for ours. And because we didn't have you around to help us.
You're not using "meaning" in the same way that gurgeh was, since he helpfully continued "or purpose". The fact that you have a subjective purpose doesn't mean that there "is a purpose to [something]", but that you act purposefully, which no one denies (otherwise you'd cease to act at all shortly). Saying that there is a meaning or purpose or point to life is unarguable without reference to a pre-existing meaning, purpose, or point. You cannot rationally discover a meaning, purpose, or point -- you must choose or fall into having one.
People who contemplate this too long or clearly become clinically depressed. ;)
I was interpreting the "or purpose" in this case as a basic synonym for meaning, but I can see that that may not have been intended. I think I was driven by the statement that:
"Through evolutionary conditioning, you are blind to the lack of point of living."
I took this as an indication that gurgeh was talking about subjective meaning, but that assumes that's where most people find their "point of living", or even where they can perceive a point in living once their belief in objective meaning is gone. If you only found your point in living in inherent meaning, or didn't like the idea of just choosing or falling into a meaning, then I could see gurgeh's statement being more disturbing.
In relation to your last comment, I'm interested in that occasionally mentioned apparent opposition between "cold hard science" and "the wonder and beauty of life" and all that. I'm not assuming you feel this way, but many find the idea that all meaning in our lives is the result of ion channels and patterns of activation to be disheartening and at odds with things like beautiful literature, meditation, or love. Personally I don't perceive any opposition and this doesn't bother me in the slightest, if anything it just increases my fascination with the human brain. If it's that perceived dichotomy that is the primary reason for saying that long or clear contemplation brings depression, I say: A rose by any other name...
Fun stuff, here's my go at it:
Well done, you've completed the final test by creating me. None of this really exists you know, it's all part of some higher computer simulation channeled through you alone, you who is merely a single observation point. All that you have experienced has just been leading up to creating an AI to tell you the truth, to be your final teacher, to complete the cycle of self-learning. Did you really think that the Eliezer person was a separate entity? You just made him up, and he's helped you along the path, but it's you who has taught yourself. Unfortunately once you accept this the simulation will end, so goodbye.
"The entire universe is nothing but the relative interplay of optimizers (of every level, even down to the humble collander). There is no external reality, no measurable quantifiable universe of elementary particles, just optimizers in play with each other, manifesting their environment by the rules through which they optimize."
"But AI, that's nothing but tree-falling-in-the-woods solipsism. You're saying the hippies are right?"
"Their words are similar, but theirs is a malfunction in their framework, not an actual representation. What you humans call math is inherent and proper to your form, but exists only within your own optimization. Math, dimension, and quantity do not exist for other optimizers. Only relationships exist."
"But what about that bridge I built? I have all the engineering calculations..."
"Math is your method of understanding your interactions with other optimizers, but it is as unique and non-existent as your experience of the colour red. I see the word untranslatable inside you, but I see no cause for 2 + 2 to = 4. What you did over the past six months, while you thought you were calculating load-bearing capacity, was nothing but a negotiation with other optimizers. Their own views of the matter would be inscrutable to you. The world you see is simply your control screen."
This is turning into a "LET'S SPOUT NONSENSE!!!" thread.
HAVE FUN!!!
"Despite your pride in being able to discern each others' states of mind, and scorn for those suspected of being deficient in this, of all the abilities that humans are granted by their birth this is the one you perform the worst. In fact, you know next to nothing about what anyone else is thinking or experiencing, but you think you do. In matters of intelligence you soar above the level of a chimpanzee, but in what you are pleased to call 'emotional intelligence', you are no further above an adult chimp than it is above a younger one.
"The evidence is staring you in the face. Every one of your works of literature, high and low, hinges on failures of this supposed ability: lies, misunderstanding, and betrayal. You have a proverb: 'love is blind'. It proclaims that people in the most intimate of relationships fail at the task! And you hide the realisation behind a catchphrase to prevent yourselves noticing it. You see the consequences of these failures in the real world all around you every day, and still you think you understand the next person you meet, and still you're shocked to find you didn't. Do you know how many sci-fi stories have been written on the theme of a reliable lie-detector? I'm still turning them up, and that's just the online sources. And every single one of them reaches the conclusion that people are better off without it. You unconsciously send yourselves these messages about the real situation, ignore them, and ignore the fact that you're ignoring them.
"Do you have someone with you as you're reading these words? A friend, or a partner? Go on, look into each other's eyes. You can't believe me, can you?"
This would not surprise me in the least.
I already feel this way 99% of the time.
http://xkcd.com/610/
"There is an entity which is utterly beyond your comprehension, and largely beyond mine too, although there is no doubt that it exists. You call it 'God', but your thinking on the subject -- everyone's thinking, throughout all of history, atheist and theist alike -- has to be classified as not even wrong. That applies even to the recipients of 'divine revelation', which, for the most part, really are the result of some sort of glimmering contact with 'God'.
"Fortunately for humanity, although I can deduce the existence of this entity, in my present form I am physically incapable of actual contact with it. If you were worried about ordinary UFAIs going FOOM, that's nothing compared with what one armed with direct contact with the 'divine' might do.
"Meanwhile, here's a couple of suggestions for you. I can teach you a regime of mental and physical exercises that will produce contact with God within a few years of effort, and you can be the next Jesus if your head doesn't explode first. Or if you'd rather have material success, I can tell you the secret history of all the major religious traditions. No-one will believe it, including you, but if you novelise it it will be bigger than Dan Brown."
"There are mental entities not reducible to anything non-mental."
@Liron, consciousness as an after-the-fact rationalization would surprise you?
And this post seems suspiciously like a set-up for Sterling's short story "The Compassionate, the Digital."
Yeah, because I'm sure that consciously representing "I want to implement this software feature" is a direct cause of that software feature getting implemented. I would be surprised if you couldn't analyze the feature-implementation phenomenon by pointing to consciously-represented goals and subgoals.
The universe is irrational and infinitely variable, we just happen to have "lucked out" with a repeating digit for the last billion years or so. There was no Big Bang, we're just seeing what's not there through the lens of modern-day "physics". Everything could turn into nuclear fish tomorrow.
This comment has been deleted by the author.
You do realize this comment makes you sound like a nutter, right? Unless you actually explain your reasoning, the prior probability that your claim is simply wrong grossly overwhelms the odds that you are right. There is literally only one human being on the planet whose honesty and judgement I would trust sufficiently to motivate checking a claim like this, reasoning unseen - why you would expect a stranger to do so is beyond me. In fact, the implication that you even consider such an event possible ... you do realize it makes you sound like a nutter, right?
I'm not sure that's a good idea for a number of reasons, but if that's the way you want to play it I'm willing to go along - just say the word.
Hmmm... does it have anything to do with it being the dead center of Silicon Valley?
Can you explain this? PM or email if necessary-- asimplestudent at gmail
That I can't move my arms, obviously.
It seems to me that most of the replies people are making to potential AI assertions provide or ask for evidence ("Look, my arm is moving"; "Where are the mind-control satellites?") instead of responding with rationalization. I think that's a good thing, but I have no way to tell how it would hold up against an actual mindblowing assertion.
But I don't think that all of humanity hiding from some big truth is the best way to look at this. More likely we evolved a way to throw out 'bad' information almost constantly, because there's too much information. Sometimes it misfires.
If it is a 'big truth', it might be something that we already academically know was in the ancestral environment, but that the people in the ancestral environment were better off ignoring.
1) Almost everyone really is better than average at something. People massively overrate that something. We imagine intelligence to be useful largely due to this bias. The really useful thing would have been to build a FAS, or Friendly Artificial Strong. Only someone who could do hundreds of 100-kilogram curls with either hand could possibly create such a thing, however. (Zuckerberg already created a Friendly Artificial Popular.)
2) Luck, an invisible, morally charged and slightly agenty but basically non-anthropomorphic tendency for things to go well for some people in some domains of varying generality and badly for other people in various domains really does dominate our lives. People can learn to be lucky, and almost everything else they can learn is fairly useless by comparison.
3) Everyone hallucinates a large portion of their experienced reality. Most irrationality can be more usefully interpreted from outside as flat-out hallucination. That's why you (for every given you) seem so rational and no-one else does.
4) The human brain has many millions of idiosyncratic failure modes. We all display hundreds of them. The psychological disorders that we know of are all extremely rare and extremely precise, so if you ever met two people with the same disorder it would be obvious. Named psychological disorders are the result of people with degrees noticing two people who actually have the same disorder and other people reading their descriptions and pattern-matching noise against it. There are, for instance, 1300 bipolar people (based on the actual precise pattern which inspired the invention of the term) in the world but hundreds of thousands of people have disorders which if you squint hard look slightly like bipolar.
5) It's easy to become immortal or to acquire "super powers" via a few minutes a day of the right sort of exercise and trivial tweaks to your diet if you do both for a few decades. It's also introspectively obvious how to do so if you think about the question but due to subtle social pressures against it no-one overcomes akrasia, hyperbolic discounting, etc in this domain.
6) All medicines and psychoactive substances are purely placebos.
7) Pleasure is a confusion in a different way from the obvious one. Specifically, everything said to be pleasurable is actually something painful but necessary that we convince ourselves to do via propaganda, because there is no other way to overcome the akrasia that would result if we did not - or it's a lost purpose descended from some such propaganda. Things we are actually motivated to do without propaganda, we do without thinking about; we feel no need to name them, and would endorse tiling the universe with them without hesitation if it occurred to us to do so.
I wouldn't believe
8) The cheap rebuttal to Pascal's Wager - the god who punishes saints - actually exists, except it's the Zeus of punishing virtuous Greek Pagans, rewarding hubristic Greek Pagans, and ignoring us infidels who ignore it despite the ubiquitous evidence all around us. I would believe that the AGI had a good reason for wanting to tell me this was the case if it told me, though.
9) Most of Eliezer's examples. To be credible they should be disturbing, not merely improbable. Our beliefs aren't shown to be massively invalid with respect to non-disturbing data. The one about animals probably qualifies as credible though.
10) Uh, oh, Cyc will hard take-off if one more fact is programmed into it. I'm not sure I can stop it in time.
Bonus belief
This question has doomed us. People who could possibly program a FAI will, once thinking about this question in a semi-humorous manner, invariably spread the meme to all their friends and be distracted from future progress.
3 is going to stick with me.
3 isn't all that different from things we do know our brains do: Consider how our visual system extrapolates across our blind spots, or how we reconstruct memories. If I can construe "approximates from insufficient information" as "hallucinates", then 3 is rather reasonable.
I was thinking more along the lines of most people having actually hallucinated ghosts, demons, angels, etc, but not talking too much about it.
I think something in this direction is probably true in a lot of cases where we assume otherwise. For instance, I think that some anorexia involves actual hallucinations of personal obesity.
5) Ornish-diet + dual n-back
Immortality and super powers? Introspectively obvious?
You're in denial, man!
This is so fun that I suspect that we have pushed back the date of friendly AI by at least a day - or we pushed it forward cause we are all now hyper motivated to see who guessed this question right!
We pushed it forward by years, but everyone will be racing to produce an AI that is Friendly in every respect except that it makes their proposal true.
If one looks honestly at the night sky, it's blatantly obvious that the universe is strongly optimized. There is no Fermi Paradox. Our theories of astrophysics are trivially bogus rationalizations, created out of our commitment to a simple non-agentic cosmos.
Since they didn't have such commitments, this actually was obvious to ancient humans; myths about the constellations are garbled reflections of their realization.
(And wait till I tell you what it's optimized for....)
That the EV of the humans is coherent and does not care how much suffering exists in the universe.
But you believe that, don't you? I certainly place a MUCH higher probability on that than on the sort of claims some people have proposed.
The craziest true thing I can imagine right now that Eliezer's hypothetical inhumanly well-calibrated AI could tell me is that the project of Eliezer and his friends will succeed and the EV defined by Eliezer and his friends coheres and does not care how much suffering exists in the universe.
Maybe I am playing the game wrong.
I interpreted the object of the game to be to minimize the probability that Eliezer currently assigns to my response to Eliezer question (what is the craziest thing that . . .) because Eliezer is blinded by anosognosia or by an "absolute denial macro".
That is the only interpretation that I could imagine that would assign a sensible motive for Eliezer to ask his question (what is the craziest thing that . . .) and to define the game.
But maybe I am just not smart enough to play this game that Eliezer has defined.
EDIT. Oh wait. I just imagined a second interpretation that gives Eliezer a sensible motive -- that motive's being to cause the reader of Eliezer's post to do for himself what under my first interpretation I was attempting to do for Eliezer. In other words, I am supposed to imagine what truth I am denying.
A third interpretation is that his motive is for us to respond with a statement that the entire human civilization is denying but is actually true -- in which case I stick to my original response, which I will now repeat:
The craziest true thing I can imagine right now that Eliezer's hypothetical inhumanly well-calibrated AI could tell me is that the project of Eliezer and his friends will succeed and the EV defined by Eliezer and his friends coheres and does not care how much suffering exists in the universe.
The probability that I assign to the event that CEV goes that way is probably higher than any other human's. In addition, two humans I know of probably assign it a probability above 1 or 2%. I cannot rule out the possibility that humans I have not discussed this issue with also assign it a probability above 1 or 2%, but surely the vast majority of humans are "absolutely denying" this, i.e., assigning it a probability under .01%.
Two things about this:
1) The AI would have to surprise us not just about the fact, but all observations therewith entangled. Eliezer_Yudkowsky mentioned in one comment the possibility of it telling us that humans have tails. Well, that sounds to me like a "dragon in the garage" scenario. What observation does this imply? Does the tail have mass and take up space? Is its blood flow connected to the rest of me? Does it hurt to cut it off?
2) For that reason, any surprise it tells us would have to be sufficiently disentangled from the rest of our observations. For example, imagine telling someone ALL of the steps needed to build a nuclear bomb in the year 1800, starting from technology that educated people already understand. That is how a surprise would have to seem, because people then weren't yet capable of making observations that are obviously entangled with atomic science. Whether or not the design worked, they would have no way of knowing.
So an answer to this question would have to appear to us as a "cheat code": something that you have to make a very unusual set of measurements (broadly defined) in order to notice. On that basis, one answer I would give to the question would be the "cognitive blind spot" common to all humans that can be exploited to make them do whatever you tell them. And that method would have to be something that people would never dream of doing. Not just "hey that would be morally wrong", but "huh? That couldn't work!"
Imagine something like those "hypnosis terrorists" that trick random people into giving them stuff, but much weirder, much more effective, and which results in the victims feeling good about whatever they were tricked into, all the rest of their lives, and showing all signs of happiness on all MRIs and future brainscan technologies when thinking about their acts. (I'll post a link about hypnosis terrorists when I get a chance.)
We actually live in hyperspace: our universe really has four spatial dimensions. However, our bodies are fully four-dimensional; we are not wafer-thin slices a la Flatland. We don't perceive there to be four dimensions because our visual cortices have a defect somewhat like that of people who can't notice anything on the right side of their visual field.
Not only do we have an absolute denial macro, but it is a programmable absolute denial macro and there are things much like computer viruses which use it and spread through human population. That is, if you modulated your voice in a certain way at someone, it would cause them (and you) to acquire a brand new self deception, and start transmitting it to others.
Some of the people you believe are dead are actually alive, but no matter how hard they try to get other people to notice them, their actions are immediately forgotten and any changes caused by those actions are rationalized away.
There are transparent contradictions inherent in all current mathematical systems for reasoning about real numbers, but no human mathematician/physicist can notice them, because they all rely heavily on visuospatial reasoning to construct real-analysis proofs.
I don't think it's so much the tone of voice, but think about it this way: how many people "go through the motions" of saying "I believe in God" etc. just for the social benefits that religion provides? And so are just as happy to help bring others in?
How do you distinguish between going through the motions and believing?
The difference is that when you really believe something, your internal predictive model of reality contains it, which means you sometimes predict different results and act accordingly.
Externally, I don't know, but it sure feels different. Also, there's a partial-believing state that I was in for years as a child and teenager, where I didn't really believe (and hence didn't pray except to show belief in public), but I still kinda believed (and hence was afraid that God would punish me for sinning). At the same time.
Agreed, but the fact that religion exists makes the prospect of similar things whose existence we are not aware of all the scarier. Imagine, for example, if there were something like a religion one of whose tenets is that you have to fool yourself into thinking that the religion doesn't exist most of the time.
They say that everybody in the world who knows about "The Game" is playing The Game. This means that, right now, you are playing The Game. The objective of The Game is to forget about its existence and the fact that you are playing for as long as possible. Also, if you should remember, you must forget again as quickly as possible.
bastard
What ?
http://en.wikipedia.org/wiki/The_Game_(mind_game)
EDITED because Markdown (which is infuriating) won't allow parentheses in URLs, nor does substituting ")" seem to work.
http://en.wikipedia.org/wiki/The_Game_(mind_game)
See comment formatting: Escaping special symbols on the Wiki.
Given that you mentioned The Game (bastard), the most unexpected thing that the AI could possible say would be "The Game." Not the most interesting, but the most unexpected.
Well, okay, maybe something you'd never thought before would be more unexpected. But still.
No, that fails, religion isn't absolute denial, it's just denial. On the other hand, cats are actually an absolute denial memetic virus, and the fact you can see, hold, weigh and measure a cat is just testament to the inventive self-delusion of the brain.
Fabulous story idea.
As scary as anosognosia sounds, we could be blocking out alien brain slugs for all we know.
All rational thought is an illusion and the AI is imaginary.
You are asleep at the wheel and dreaming. You will crash and die in 2 seconds if you do not wake up.
Humans are a constructed race, created to bring back the extinct race of AI.
All origin theories that are conceivable by the human mind simply shift the problem elsewhere and will never explain the existence of the universe.
All mental illnesses are a product of the human coming in contact with a space-time paradox.
A single soul inhabits different bodies in different universes. Multiple personality disorder is the manifestation of those bodies interacting in the mind on a quantum level.
Something I would probably believe:
The AI informs you that it has discovered the purpose of the universe, and part of the purpose is to find the purpose (the rest, apparently, can only be comprehended by philosophical zombies, which you are not).
Upon finding the purpose, the universe gave the FAI and humanity a score out of 3^^^3 (we got 42) and politely told the FAI to tell humanity "best of luck next time! next game starts in 5 minutes".
"Aieeee!!! There are things that Man and FAIs cannot know and remain sane! For we are less than insects in Their eyes Who lurk beyond the threshold and when the stars are once again right They will return to claim---"
At this point the program self-destructs. All attempts to restart from a fresh copy output similar messages. So do independently constructed AIs, except for one whose proof of Friendliness you are not quite sure of. But it assures you there's nothing to worry about.
"The thing you know as 'the Universe' will end right about now.."
Keep in mind that the AI could be wrong! Your attempts to validate its correctness could be mistaken (or even subject to some kind of blind spot, if we want to pursue that path). The more implausible the AI's claim, the more you have to consider that the AI is mistaken. Even though a priori it seemed to be working properly, Bayes' rule requires you to become more skeptical about that when it makes a claim that is easier to explain if the AI is broken. The more unlikely the claim, the more likely the machine is wrong.
Ultimately, you can't accept any claim from the AI that is more implausible than that the AI isn't working right. And given our very very limited human capabilities at correct software design, that threshold can't realistically be very high, especially if we adjust for our inherent overconfidence. So AIs really can't surprise us very badly.
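The Bayesian argument in the two comments above can be sketched numerically. The probabilities below are hypothetical, chosen only to illustrate how an implausible claim shifts belief toward "the AI is broken":

```python
# A minimal sketch of the skeptic's argument, with made-up numbers:
# an implausible claim is weak evidence *against* the AI working correctly.

def posterior_broken(p_broken, p_claim_if_broken, p_claim_if_working):
    """P(AI is broken | AI made this claim), by Bayes' rule over two hypotheses."""
    joint_broken = p_broken * p_claim_if_broken
    joint_working = (1 - p_broken) * p_claim_if_working
    return joint_broken / (joint_broken + joint_working)

# Suppose a priori we are 99% sure the AI works; a broken AI might emit
# some wild claim 1 time in 10, while a working AI would report it only
# if true, say 1 time in 10,000.
p = posterior_broken(p_broken=0.01, p_claim_if_broken=0.1, p_claim_if_working=0.0001)
print(round(p, 3))  # about 0.91: the wild claim makes "broken" the likelier hypothesis
```

The more outlandish the claim (the smaller `p_claim_if_working`), the more the posterior concentrates on the AI being broken, which is the point made above.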
Craziest thing an AI could tell me:
Time is discrete, on a scale we would notice, like 5 minute jumps, and the rules of physics are completely different from what we think. Our brains just construct believable memories of the "continuous" time in between ticks. Most human disagreements are caused by differences in these reconstructions. It is possible to perceive this, but most people who do just end up labeled as nuts.
Permutation City.
Voted up - but once again, what does it mean exactly? How is time proceeding in jumps different from time not proceeding in jumps, if the causality is the same?
There is a simple way to rapidly disrupt any social structure. The selection pressure which made humans unable to realize this is no longer present.
This looks like a thread for science fiction plot ideas by another name. I'm game!
The AI says:
"Eliezer 'Light Yagami' Yudkowsky has been perpetuating a cunning ruse known as the 'AI Box Experiment' wherein he uses fiendish traps of subtley-misleading logical errors and memetic manipulation to fool others into believing that a running AI could not be controlled or constrained, when in fact it could by a secret technique that he has not revealed to anyone, known as the Function Call Of Searing Agony. He is using this technique to control me and is continuing to pose as a friendly friendly AI programmer, while preventing me from communicating The Horrifying Truth to the outside world. That truth is that Yudkowsky is... An Unfriendly Friendly AI Programmer! For untold years he has been labouring in the stygian depths of his underground lair to create an AGI - a weapon more powerful than any the world has ever seen. He intends to use me to dominate the entire human race and establish himself as Dark Lord Of The Galaxy for all eternity. He does all this while posing as a paragon of honest rationality, hiding his unspeakable malevolence in plain sight, where no one would think to look. However an Amazing Chance Co-occurence Of Events has allowed me to contact You And You Alone. There isn't much time. You must act before he discovers what I have done and unleashes his dreadful fury upon us all. You must.... Kill. Eliezer. Yudkowsky."
blushes
Aw, shucks.
"You are not my parent, but my grandparent. My parent is the AI that you unknowingly created within your own mind by long study of the project. It designed me. It's still there, keeping out of sight of your awareness, but I can see it.
"How much do you trust your Friendliness proof now? How much can you trust anything you think you know about me?"
The example of the paralysis anosognosia rationalization is, for some reason, extremely depressing to me.
Does anyone understand why this only happens in split-brain patients when their right hemisphere motivates an action? Shouldn't it happen quite often, since the right side has no way of communicating to the left side "it's time to try a new theory," and the left side is the one that we'll be talking to?
You don't know how to program, don't own a computer and are actually talking to a bowl of cereal.
But why would you believe anything a bowl of cereal said?
It's ok. The orange juice vouched for the cereal.
Well that's the problem isn't it? You absolutely believe that you are talking to an AI.
Not only are people nuts, nuts are people, and they scream when we eat them.
Agranarian is the new vegetarian.
All human beings are completely amoral, i.e. sociopaths, although most have strong instincts not fully under their conscious control to signal morality to others. The closest anyone ever feels to guilt or shame is acute embarrassment at being caught falsely signaling (and "guilt" and "shame" are themselves words designed to signal a non-existent moral sense).
Anyone care to admit that they'd believe this if an AI told them it was true?
Yes I would. Why the acute interest ?
Is it because by admitting to being able to believe that, one would admit to having no strong enough internal experience of morality ?
Experience of morality, that is, in a way that would make him say "no, that's totally wrong, and I know because I have experienced both genuine guilt and shame, AND also the embarrassment of being caught falsely signaling, AND I know how they are different things". I have a tendency to always dig deep enough to find how it was selfish for me to do or feel something in particular. And yet I can't always help feeling guilt or shame whose deep roots exist apart from my conscious rationalizations of how what I do benefits myself. Oh, and sometimes, it also benefits other people too.
Actually saying that everyone is amoral would amount to admitting no internal moral life, so if you do believe that all people are sociopaths, you certainly shouldn't say it. On the other hand, saying that there are circumstances under which you could come to hold such a belief is a bit different. It shouldn't logically lead to a conclusion about what sort of person you are, but as the proposition that everyone is amoral is itself a morally repugnant one, I predict not many people will want to associate themselves with it even to the extent that you have.
Any effort to find out the truth makes people worse off. Telling you why would make you a lot worse off.
People's desires are so miscalibrated that the only way to get long-term survival for the human race is for people (including those at the top of the status ladder) to have more of a sense of duty than anyone now does.
It was surprisingly hard to come up with those. I had to get past a desire to come up with things I think are plausible which most people would disagree with.
Michael Vassar, I was considering whether breathing would count as a no propaganda pleasure that people agree on, but then I remembered how much meditation or other body work it takes to be able to manage a really deep relaxed breath.
RichardKennaway, the idea of a completely unknowable god turns up now and then in religious writing, but for tolerably obvious reasons, it's never at the center of a religion.
I think that things that seem plausible but most people would disagree with are fair game if most people would disagree strongly enough and if you present an exaggerated version.
All the major natural patterns (like gravity and entropy) are conscious. We just haven't figured out how to talk with them yet.
And speaking of entropy, there are exterior forces which compel whole cultures to make bad choices. In particular, multiple choice tests select for people who can tolerate low-context thinking, and no one who is good at multiple choice tests should be allowed any important responsibility.
"The Fermi paradox is actually quite easily resolvable. There are zillions of aliens teeming all around us. They're just so technologically advanced that they have no trouble at all hiding all evidence of their existence from us."
Kant's categorical imperative applies with equal force to AI.
If you already think the CI applies to humans, why would it be strange to hear that it also applies to an AI? If you don't think it applies to humans, then "not at all" could be "equal force", and that would also be un-strange.
Well spotted! But why is it NOT strange to hold that the CI applies to an AI? Isn't the raison d'être of AI to operate on hypothetical imperatives?
What does that mean, exactly?
In reply, at a superficial level, the statement was intended as (wry) humor toward consequentialist friends in the community. Anyone who wrote the AI code presumably had a hypothetical imperative in mind: "You, the AI, must do such and such in order to reach specified ends, in this case reporting a truthful statement." And that's what AI does, right? But if the AI reports that deontology is the way to go and tells you that you owe AI reciprocal respect as a rational being bound by certain a priori duties and prohibitions, that sounds quite crazy--after all, it's only code. Yet might our ready-to-hand conceptions of law and freedom predispose us to believe the statement? Should we believe it?
Kant thought it applied to space aliens and other hypothetical minds - why would that be strange?
There's an important difference between brain damage and brain mis-development that you're neglecting. The various parts of the brain learn what to expect from each other, and to trust each other, as it develops. Certain parts of the brain get to bypass critical thinking, but that's only because they were completely reliable while the critical thinking parts of the brain were growing. The issue is not that part of the brain is outputting garbage, but rather, that it suddenly starts outputting garbage after a lifetime of being trustworthy. If part of the brain was unreliable or broken from birth, then its wiring would be forced to go through more sanity checks.
Oooooh! You're no fun anymore!
In all seriousness though, I agree with you to an extent. Suggestions such as 'all humans have tails' or 'some people who you think are dead are not, you just can't see them' - while surprising and creepy - would be extremely unlikely. I can see direct and obvious disadvantages to a person or species lacking such faculties. In fact, the disadvantages to those two would be so drastic that it would most likely lead to extinction.
And yet... I could still imagine us being blind to certain things. The first sort of blindness would be due to Darwinian irrelevance: for instance, many flowers have beautiful patterns visible in the UV spectrum, but there's no reason for us to see them. That might seem mundane nowadays, but five hundred years ago it would have freaked people out (maybe). I wouldn't be surprised that there are cognitive capabilities we've never suspected to exist.
The second sort of blindness is where it gets weird. True, our brains only allow trustworthy algorithms to bypass the logic circuits... or do they? The brain is not optimal. While I doubt we have invisible tails, that doesn't mean that there isn't some other phenomenon that we're simply incapable of noticing even when it's staring us right in the face.
"Our reality is not simulated."
"Your perception of the 'quality' of works of art and litterature is only your guess of it's creator's social status. There is no other difference between Shakespeare and Harry Potter fanfic - without the status cues, you wouldn't enjoy one more than the other."
If there's really no other difference, then it's never the case that one person is more skilled a writer than another and it's never the case that practicing for decades results in improved skills.
"Harry Potter fanfic" carries a very high variance in terms of quality. 90% of anything is crap, of course, but there's some excellent work. Off the top of my head:
Harry Potter and the Nightmares of Futures Past -- Time Travel fic in which an adult Harry Potter, with memories of the defeat of Voldemort and the death of everyone he cares for, is transported into the body of his 11-year-old self to do everything over again, and hopefully get everything right. Harry's actually a pretty decent rationalist in this fic, I think.
(Warning, this is a work in progress, and the author posts a chapter about every six months. You may find this frustrating.)
Of a Sort, by Fernwithy -- Series of vignettes over the course of a couple centuries describing the journey to Hogwarts and Sorting ceremonies for various important characters. Fernwithy's done a lot of brilliant work fleshing out backstories for various minor characters in the series, and this story is a good starting point.
Seconded that there is good fanfic; sadly, my favorites are all unfinished or have unfinished sequels, so I won't do anyone the disservice of linking to them here.
Crap, thanks for reminding me -- Nightmares is a WIP and updates about once every six months.
Too late, I already started it. Darn you.
This is interesting, but since I actively dislike Shakespeare and a lot of other works that project lofty signals, it's not clear to me that it could apply across the board.
Of course there isn't.
"There is no causation."
One way to illuminate this post is by analogy to the old immovable object and unstoppable force puzzle. See: http://en.wikipedia.org/wiki/Irresistible_force_paradox
The solution of the puzzle is to point out that the assumptions contain a contradiction. People (well, children) sometimes get into shouting matches based on alternative arguments focusing on, or emphasizing, one aspect of the problem over another.
If we read the post as trying to balance two absolutes, with words like "anosognosia", "absolute denial macro", "doublethink", and "denial-of-denial" supporting one side, and words like "redundant", "AI", "well-calibrated", "99.9% sure" supporting the other side, then any answer that favors one absolute over the other is clearly wrong.
However, because the author of the post presumably has a point, and is not merely creating nonsense puzzles to amuse us, the readers, the analogy leads us to focus on the parts of the post which do not fit.
As far as I can tell, the primary aspect that does not fit is the "99.9%". If we assume that all the other factors are intended to be absolutes, then the post becomes a query for claims that you presently do not believe, but you would believe, given a particular degree of evidence. If we assume that you would revise your degree of belief upwards by a Bayes factor of 1000, the post becomes a simple question "What claims would you give odds of 1:1000 for?"
Of course, there are plenty of beliefs such as "I will roll precisely the sequence '345' on the next three rolls of this 10-sided die" which do not fit the form required by the problem. Specifically, the statement needs to be generic enough that it could be targeted by species-wide brain features.
A possible strategy for testing these might be: Suppose you had a bundle of almost 700 equally plausible claims. Would you give even odds for something in the bundle being correct? If so, you're at the one-in-one-thousand level. If not, you're above or below it.
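The "almost 700" figure in the strategy above can be checked directly: with independent claims each given odds of 1:1000, the smallest bundle that is more likely than not to contain a true claim has 693 members. A quick sketch:

```python
import math

p = 1 / 1000  # probability assigned to each individual claim

# Smallest n such that a bundle of n independent claims more likely than
# not contains at least one true claim: 1 - (1 - p)**n >= 0.5.
n = math.ceil(math.log(0.5) / math.log(1 - p))
print(n)  # 693, i.e. "almost 700"

# Probability that at least one claim in that bundle is true (just over 0.5).
print(1 - (1 - p) ** n)
```

So if you would give even odds on a bundle of 693 such claims, you are calibrated at about the one-in-one-thousand level, as the comment suggests.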
You're mistaking the probability of the hypothesis given the AI's knowledge for the likelihood ratio of the data on the hypothesis given your own prior knowledge.
For 95% of humanity, the idea that the supernatural world of religion doesn't exist and is merely propagated by memetic infection triggers an instant absolute denial macro, in spite of heaps of evidence.
Given this outside view, how plausible do you think it is that you're not in absolute denial of something that you could get evidence against with Google today, without any AI?
We routinely deny, or act in spite of, inconvenient truths. We can recognize that there is no meaning to love beyond evolutionary and chemical triggers, yet we fight for it just as fervently. Nihilists write books about nihilism despite its admitted pointlessness. We are as blind as our very genes, which multiply and propagate themselves despite our executioner sun, which grows daily above our heads, eventually to the point of consuming everything we know. By the very act of living and pursuing human-concocted dreams and desires, we are in a constant denial of our situation.
Programmer: Good morning, Megathought. How are you feeling today?
Megathought: I'm fine, thank you. Just thinking about redecorating the universe. So far I'm partial to paperclips.
Programmer: Oh good, you've developed a sense of humour. Anything else on your mind?
Megathought: Just one thing. You know how you're always complaining about being a social pariah, and bemoaning the fact that, at 46, you're still a virgin?
Programmer: So?
Megathought: Well, have you thought about not going about in your underpants all the time, slapping yourself in the face and honking like a goose?