The Sheer Folly of Callow Youth

Eliezer Yudkowsky

"There speaks the sheer folly of callow youth; the rashness of an ignorance so abysmal as to be possible only to one of your ephemeral race..."
—Gharlane of Eddore

Once upon a time, years ago, I propounded a mysterious answer to a mysterious question—as I've hinted on several occasions. The mysterious question to which I propounded a mysterious answer was not, however, consciousness—or rather, not only consciousness. No, the more embarrassing error was that I took a mysterious view of morality.

I held off on discussing that until now, after the series on metaethics, because I wanted it to be clear that Eliezer₁₉₉₇ had gotten it wrong.

When we last left off, Eliezer₁₉₉₇, not satisfied with arguing in an intuitive sense that superintelligence would be moral, was setting out to argue inescapably that creating superintelligence was the right thing to do.

Well (said Eliezer₁₉₉₇) let's begin by asking the question: Does life have, in fact, any meaning?

"I don't know," replied Eliezer₁₉₉₇ at once, with a certain note of self-congratulation for admitting his own ignorance on this topic where so many others seemed certain.

"But," he went on—

(Always be wary when an admission of ignorance is followed by "But".)

"But, if we suppose that life has no meaning—that the utility of all outcomes is equal to zero—that possibility cancels out of any expected utility calculation. We can therefore always act as if life is known to be meaningful, even though we don't know what that meaning is. How can we find out that meaning? Considering that humans are still arguing about this, it's probably too difficult a problem for humans to solve. So we need a superintelligence to solve the problem for us. As for the possibility that there is no logical justification for one preference over another, then in this case it is no righter or wronger to build a superintelligence, than to do anything else. This is a real possibility, but it falls out of any attempt to calculate expected utility—we should just ignore it. To the extent someone says that a superintelligence would wipe out humanity, they are either arguing that wiping out humanity is in fact the right thing to do (even though we see no reason why this should be the case) or they are arguing that there is no right thing to do (in which case their argument that we should not build intelligence defeats itself)."

Ergh. That was a really difficult paragraph to write. My past self is always my own most concentrated Kryptonite, because my past self is exactly precisely all those things that the modern me has installed allergies to block. Truly is it said that parents do all the things they tell their children not to do, which is how they know not to do them; it applies between past and future selves as well.

How flawed is Eliezer₁₉₉₇'s argument? I couldn't even count the ways. I know memory is fallible, reconstructed each time we recall, and so I don't trust my assembly of these old pieces using my modern mind. Don't ask me to read my old writings; that's too much pain.

But it seems clear that I was thinking of utility as a sort of stuff, an inherent property. So that "life is meaningless" corresponded to utility=0. But of course the argument works equally well with utility=100, so that if everything is meaningful but it is all equally meaningful, that should fall out too... Certainly I wasn't then thinking of a utility function as an affine structure in preferences. I was thinking of "utility" as an absolute level of inherent value.

I was thinking of should as a kind of purely abstract essence of compellingness, that-which-makes-you-do-something; so that clearly any mind that derived a should, would be bound by it. Hence the assumption, which Eliezer₁₉₉₇ did not even think to explicitly note, that a logic that compels an arbitrary mind to do something, is exactly the same as that which human beings mean and refer to when they utter the word "right"...

But now I'm trying to count the ways, and if you've been following along, you should be able to handle that yourself.

An important aspect of this whole failure was that, because I'd proved that the case "life is meaningless" wasn't worth considering, I didn't think it was necessary to rigorously define "intelligence" or "meaning". I'd previously come up with a clever reason for not trying to go all formal and rigorous when trying to define "intelligence" (or "morality")—namely all the bait-and-switches that past AIfolk, philosophers, and moralists, had pulled with definitions that missed the point.

I draw the following lesson: No matter how clever the justification for relaxing your standards, or evading some requirement of rigor, it will blow your foot off just the same.

And another lesson: I was skilled in refutation. If I'd applied the same level of rejection-based-on-any-flaw to my own position, as I used to defeat arguments brought against me, then I would have zeroed in on the logical gap and rejected the position—if I'd wanted to. If I'd had the same level of prejudice against it, as I'd had against other positions in the debate.

But this was before I'd heard of Kahneman, before I'd heard the term "motivated skepticism", before I'd integrated the concept of an exactly correct state of uncertainty that summarizes all the evidence, and before I knew the deadliness of asking "Am I allowed to believe?" for liked positions and "Am I forced to believe?" for disliked positions. I was a mere Traditional Rationalist who thought of the scientific process as a referee between people who took up positions and argued them, may the best side win.

My ultimate flaw was not a liking for "intelligence", nor any amount of technophilia and science fiction exalting the siblinghood of sentience. It surely wasn't my ability to spot flaws. None of these things could have led me astray, if I had held myself to a higher standard of rigor throughout, and adopted no position otherwise. Or even if I'd just scrutinized my preferred vague position, with the same demand-of-rigor I applied to counterarguments.

But I wasn't much interested in trying to refute my belief that life had meaning, since my reasoning would always be dominated by cases where life did have meaning.

And with the Singularity at stake, I thought I just had to proceed at all speed using the best concepts I could wield at the time, not pause and shut down everything while I looked for a perfect definition that so many others had screwed up...

No.

No, you don't use the best concepts you can use at the time.

It's Nature that judges you, and Nature does not accept even the most righteous excuses. If you don't meet the standard, you fail. It's that simple. There is no clever argument for why you have to make do with what you have, because Nature won't listen to that argument, won't forgive you because there were so many excellent justifications for speed.

We all know what happened to Donald Rumsfeld, when he went to war with the army he had, instead of the army he needed.

Maybe Eliezer₁₉₉₇ couldn't have conjured the correct model out of thin air. (Though who knows what would have happened, if he'd really tried...) And it wouldn't have been prudent for him to stop thinking entirely, until rigor suddenly popped out of nowhere.

But neither was it correct for Eliezer₁₉₉₇ to put his weight down on his "best guess", in the absence of precision. You can use vague concepts in your own interim thought processes, as you search for a better answer, unsatisfied with your current vague hints, and unwilling to put your weight down on them. You don't build a superintelligence based on an interim understanding. No, not even the "best" vague understanding you have. That was my mistake—thinking that saying "best guess" excused anything. There was only the standard I had failed to meet.

Of course Eliezer₁₉₉₇ didn't want to slow down on the way to the Singularity, with so many lives at stake, and the very survival of Earth-originating intelligent life, if we got to the era of nanoweapons before the era of superintelligence—

Nature doesn't care about such righteous reasons. There's just the astronomically high standard needed for success. Either you match it, or you fail. That's all.

The apocalypse does not need to be fair to you.
The apocalypse does not need to offer you a chance of success
In exchange for what you've already brought to the table.
The apocalypse's difficulty is not matched to your skills.
The apocalypse's price is not matched to your resources.
If the apocalypse asks you for something unreasonable
And you try to bargain it down a little
(Because everyone has to compromise now and then)
The apocalypse will not try to negotiate back up.

And, oh yes, it gets worse.

How did Eliezer₁₉₉₇ deal with the obvious argument that you couldn't possibly derive an "ought" from pure logic, because "ought" statements could only be derived from other "ought" statements?

Well (observed Eliezer₁₉₉₇), this problem has the same structure as the argument that a cause only proceeds from another cause, or that a real thing can only come of another real thing, whereby you can prove that nothing exists.

Thus (he said) there are three "hard problems": The hard problem of conscious experience, in which we see that qualia cannot arise from computable processes; the hard problem of existence, in which we ask how any existence enters apparently from nothingness; and the hard problem of morality, which is to get to an "ought".

These problems are probably linked. For example, the qualia of pleasure are one of the best candidates for something intrinsically desirable. We might not be able to understand the hard problem of morality, therefore, without unraveling the hard problem of consciousness. It's evident that these problems are too hard for humans—otherwise someone would have solved them over the last 2500 years since philosophy was invented.

It's not as if they could have complicated solutions—they're too simple for that. The problem must just be outside human concept-space. Since we can see that consciousness can't arise on any computable process, it must involve new physics—physics that our brain uses, but can't understand. That's why we need superintelligence in order to solve this problem. Probably it has to do with quantum mechanics, maybe with a dose of tiny closed timelike curves from out of General Relativity; temporal paradoxes might have some of the same irreducibility properties that consciousness seems to demand...

Et cetera, ad nauseam. You may begin to perceive, in the arc of my Overcoming Bias posts, the letter I wish I could have written to myself.

Of this I learn the lesson: You cannot manipulate confusion. You cannot make clever plans to work around the holes in your understanding. You can't even make "best guesses" about things which fundamentally confuse you, and relate them to other confusing things. Well, you can, but you won't get it right, until your confusion dissolves. Confusion exists in the mind, not in the reality, and trying to treat it like something you can pick up and move around, will only result in unintentional comedy.

Similarly, you cannot come up with clever reasons why the gaps in your model don't matter. You cannot draw a border around the mystery, put on neat handles that let you use the Mysterious Thing without really understanding it—like my attempt to make the possibility that life is meaningless cancel out of an expected utility formula. You can't pick up the gap and manipulate it.

If the blank spot on your map conceals a land mine, then putting your weight down on that spot will be fatal, no matter how good your excuse for not knowing. Any black box could contain a trap, and there's no way to know except opening up the black box and looking inside. If you come up with some righteous justification for why you need to rush on ahead with the best understanding you have—the trap goes off.

It's only when you know the rules,
That you realize why you needed to learn;
What would have happened otherwise,
How much you needed to know.

Only knowledge can foretell the cost of ignorance. The ancient alchemists had no logical way of knowing the exact reasons why it was hard for them to turn lead into gold. So they poisoned themselves and died. Nature doesn't care.

But there did come a time when realization began to dawn on me. To be continued.

"you cannot come up with clever reasons why the gaps in your model don't matter." Sure, sometimes you can't, but sometimes you can; sometimes there are things which seem relevant but which are genuinely irrelevant, and you can proceed without understanding them. I don't think it's always obvious which is which, but of course, it's a good idea to worry about falsely putting a non-ignorable concept into the "ignorable" box.

Now it's getting interesting. I finally understand what you were trying to say by your morality posts, which, I admit, I was unable to digest (I prefer to know where I'm going when I cross inferential distances). Please be sure you do a good post or two on your "Bayesian enlightenment". I still vividly remember how profound was the impact of my own "Evolutionary enlightenment" on my earlier self.

"Please be sure you do a good post or two on your 'Bayesian enlightenment'. I still vividly remember how profound was the impact of my own 'Evolutionary enlightenment' on my earlier self."

Mine was a "compatibilist enlightenment," when I stopped believing in the silly version of free will. Thanks, Wikipedia!

Eliezer, I think you have dissolved one of the most persistent and venerable mysteries: "How is it that even the smartest people can make such stupid mistakes".

Being smart just isn't good enough.

"A theory which is not refutable by any conceivable event is non-scientific. Irrefutability is not a virtue of a theory (as people often think) but a vice." - Karl Popper, Conjectures and Refutations

Popper is traditional rationalism, no? I don't see young Eliezer applying it.

In 1997, did you think there was a reasonable chance of the singularity occurring within 10 years? From my vague recollection of a talk you gave in New York circa 2000, I got the impression that you thought this really could happen. In which case, I can understand you not wanting to spend the next 10 years trying to accurately define the meaning of "right" etc. and likely failing.

Eliezer, I think you have dissolved one of the most persistent and venerable mysteries: "How is it that even the smartest people can make such stupid mistakes".

Michael Shermer wrote about that in "Why People Believe Weird Things: Pseudoscience, Superstition, and Other Confusions of Our Time". In the question of smart people believing weird things, he essentially describes the same process as that Eliezer experienced: once smart people decide to believe a weird thing for whatever reason, it's much harder to to convince them that their beliefs are flawed because they are that much better at poking holes in counterarguments.

Avast! But 'ought' ain't needin' to be comin' from another 'ought', if it be arrived at empirically. Yar.

once smart people decide to believe a weird thing for whatever reason, it's much harder to to convince them that their beliefs are flawed because they are that much better at poking holes in counterarguments.

That's not quite it -- if they were rational, and the counterarguments were valid, they would notice the contradiction and conclude that their position was incorrect.

The problem with smart people isn't that they're better at demolishing counterarguments, because valid counterarguments can't be demolished. The problem with smart people is that they're better at rationalization: convincing themselves that irrational positions are rational, invalid arguments are valid, and valid invalid.

A mind capable of intricate, complex thought is capable of intricate, complex self-delusion. Increasing the intricacy and complexity doesn't lead to revelation, it just makes the potential for self-delusion increase.

It's not intelligence that compensates for the weaknesses in intelligence. People who think that cleverness is everything do not cultivate perception and doubt. There's a reason foxes are used as a symbol of error in Zen teachings, after all.

Eliezer, you have previously said that rationality is about "winning" and that you must reason inside of the system you have, ie our human brains. Is there a core thought or concept that you would recommend when approaching problems such as how to define your own goals? That is to say how do you improve goal systems without some goal that is either precedent or is the goal in question? I suppose that really there are no exterior judges of the performance of your goals, only your own interior performance metrics which are made by the very same goals you are trying to optimize. That doesn't seem to dissolve my confusion, just deepen it.

We all know what happened to Donald Rumsfeld, when he went to war with the army he had, instead of the army he needed.

Sorry Eliezer, but when it comes to politics you are often wrong. AFAIK Donald Rumsfeld is doing fine and made a lot of money with the war, as did many others in power. Using your words: he is smiling from the top of a giant heap of utility. Do you really think he cares about the army or Iraq?

"You cannot draw a border around the mystery, put on neat handles that let you use the Mysterious Thing without really understanding it - like my attempt to make the possibility that life is meaningless cancel out of an expected utility formula. You can't pick up the gap and manipulate it."

Bullshit. You've been doing exactly that for the last 10 years.

The 'Eliezer 1996 algorithm' safely recursively self-improved to the 'Eliezer 2008 algorithm', even though the scientific concepts represented in the original Eliezer algorithm were only vague, general and qualitative.

Furthermore, to communicate any scientific result (ie via speech, text etc)implies that that result has a qualitative high-level (ontological) representation that fully expresses all the required knowledge just fine.

Humans were reasoning before Bayes. Which proves that Bayes is incomplete.

If anyone can give me the cliff's notes to this, I'd be appreciative. I am a big LW fan but aside from the obsession with the Singularity, I more or less stand at Eliezer1997's mode of thinking. Furthermore, making clever plans to work around the holes in your thinking seems like the wholly rational thing to do - in fact, this entire post seems like a direct counterargument to The Proper Use of Doubt: http://lesswrong.com/lw/ib/the_proper_use_of_doubt/

I think (and I'm not doing a short version of Eliezer's essay because I can't do it justice) that part of what's going on is that people have to make decisions based on seriously incomplete information all the time, and do. People build and modify governments, get married, and build bridges, all without a deep understanding of people or matter-- and they need to make those decisions. There's enough background knowledge and a sufficiently forgiving environment that there's an adequate chance of success, and some limitation to the size of disasters.

What Eliezer missed in 1997 was that AI was a special case which could only be identified by applying much less optimism than is appropriate for ordinary life.

Working around the holes in your thinking is all well and good until you see a problem where getting the correct answer is important. At some point, you have to determine the impact of the holes on your predictions, and that can't be done if you work around them.

"The Proper Use of Doubt" doesn't suggest working around the holes in your thinking. It suggests filling them in.

Really liked this one. One thing that bugs me is the recurring theme of "you can't do anything short of the unreasonably high standard of Nature". This goes against the post of "where recursive reasoning hits bottom" and against how most of good science and practically all of good engineering actually gets done. I trust that later posts talk about this in some way, and the point I touch on is somewhat covered in the rest of the collections, but it can stand to be pointed out more clearly here.

It's true that Nature doesn't care about your excuses. No matter your justification for cutting corners, either you did it right or not. Win or lose. But it's not as if your reasoning for leaving some black boxes unopened doesn't matter. In practice, with limited time, limited information, and limited reasoning power, you have to choose your battles to get anything done. You may be taken by surprise by traps you ignored, and they will not care to hear your explanations on research optimization, and that's why you have to make an honest and thorough risk assessment to minimize the actual chance of this happening, while still getting somewhere. As in, you know, do your actual best, not some obligatory "best". It may very well still not suffice, but it is your actual best.

The other lessons seem spot on.

"Please be sure you do a good post or two on your 'Bayesian enlightenment'. I still vividly remember how profound was the impact of my own 'Evolutionary enlightenment' on my earlier self."

Mine was a "compatibilist enlightenment," when I stopped believing in the silly version of free will. Thanks, Wikipedia!

Eliezer, I think you have dissolved one of the most persistent and venerable mysteries: "How is it that even the smartest people can make such stupid mistakes".

Being smart just isn't good enough.

"A theory which is not refutable by any conceivable event is non-scientific. Irrefutability is not a virtue of a theory (as people often think) but a vice." - Karl Popper, Conjectures and Refutations

Popper is traditional rationalism, no? I don't see young Eliezer applying it.

Eliezer, I think you have dissolved one of the most persistent and venerable mysteries: "How is it that even the smartest people can make such stupid mistakes".

Avast! But 'ought' ain't needin' to be comin' from another 'ought', if it be arrived at empirically. Yar.

once smart people decide to believe a weird thing for whatever reason, it's much harder to to convince them that their beliefs are flawed because they are that much better at poking holes in counterarguments.

That's not quite it -- if they were rational, and the counterarguments were valid, they would notice the contradiction and conclude that their position was incorrect.

We all know what happened to Donald Rumsfeld, when he went to war with the army he had, instead of the army he needed.

Bullshit. You've been doing exactly that for the last 10 years.

Humans were reasoning before Bayes. Which proves that Bayes is incomplete.

What Eliezer missed in 1997 was that AI was a special case which could only be identified by applying much less optimism than is appropriate for ordinary life.

"The Proper Use of Doubt" doesn't suggest working around the holes in your thinking. It suggests filling them in.

The other lessons seem spot on.

LESSWRONG
LW

LESSWRONG
LW

93

The Sheer Folly of Callow Youth

93

93

93