In yesterday's episode, Eliezer2001 is fighting a rearguard action against the truth.  Only gradually shifting his beliefs, admitting an increasing probability in a different scenario, but never saying outright, "I was wrong before."  He repairs his strategies as they are challenged, finding new justifications for just the same plan he pursued before.

(Of which it is therefore said:  "Beware lest you fight a rearguard retreat against the evidence, grudgingly conceding each foot of ground only when forced, feeling cheated.  Surrender to the truth as quickly as you can.  Do this the instant you realize what you are resisting; the instant you can see from which quarter the winds of evidence are blowing against you.")

Memory fades, and I can hardly bear to look back upon those times—no, seriously, I can't stand reading my old writing.  I've already been corrected once in my recollections, by those who were present.  And so, though I remember the important events, I'm not really sure what order they happened in, let alone what year.

But if I had to pick a moment when my folly broke, I would pick the moment when I first comprehended, in full generality, the notion of an optimization process.  That was the point at which I first looked back and said, "I've been a fool."

Previously, in 2002, I'd been writing a bit about the evolutionary psychology of human general intelligence—though at the time, I thought I was writing about AI; at this point I thought I was against anthropomorphic intelligence, but I was still looking to the human brain for inspiration.  (The paper in question is "Levels of Organization in General Intelligence", a requested chapter for the volume "Artificial General Intelligence", which finally came out in print in 2007.)

So I'd been thinking (and writing) about how natural selection managed to cough up human intelligence; I saw a dichotomy between them, the blindness of natural selection and the lookahead of intelligent foresight, reasoning by simulation versus playing everything out in reality, abstract versus concrete thinking.  And yet it was natural selection that created human intelligence, so that our brains, though not our thoughts, are entirely made according to the signature of natural selection.

To this day, this still seems to me like a reasonably shattering insight, and so it drives me up the wall when people lump together natural selection and intelligence-driven processes as "evolutionary". They really are almost absolutely different in a number of important ways—though there are concepts in common that can be used to describe them, like consequentialism and cross-domain generality.

But that Eliezer2002 is thinking in terms of a dichotomy between evolution and intelligence tells you something about the limits of his vision—like someone who thinks of politics as a dichotomy between conservative and liberal stances, or someone who thinks of fruit as a dichotomy between apples and strawberries.

After the "Levels of Organization" draft was published online, Emil Gilliam pointed out that my view of AI seemed pretty similar to my view of intelligence.  Now, of course Eliezer2002 doesn't espouse building an AI in the image of a human mind; Eliezer2002 knows very well that a human mind is just a hack coughed up by natural selection.  But Eliezer2002 has described these levels of organization in human thinking, and he hasn't proposed using different levels of organization in the AI.  Emil Gilliam asks whether I think I might be hewing too close to the human line.  I dub the alternative the "Completely Alien Mind Design" and reply that a CAMD is probably too difficult for human engineers to create, even if it's possible in theory, because we wouldn't be able to understand something so alien while we were putting it together.

I don't know if Eliezer2002 invented this reply on his own, or if he read it somewhere else. Needless to say, I've heard this excuse plenty of times since then.  In reality, what you genuinely understand, you can usually reconfigure in almost any sort of shape, leaving some structural essence inside; but when you don't understand flight, you suppose that a flying machine needs feathers, because you can't imagine departing from the analogy of a bird.

So Eliezer2002 is still, in a sense, attached to humanish mind designs—he imagines improving on them, but the human architecture is still in some sense his point of departure.

What is it that finally breaks this attachment?

It's an embarrassing confession:  It came from a science-fiction story I was trying to write.  (No, you can't see it; it's not done.) The story involved a non-cognitive non-evolutionary optimization process; something like an Outcome Pump. Not intelligence, but a cross-temporal physical effect—that is, I was imagining it as a physical effect—that narrowly constrained the space of possible outcomes.  (I can't tell you any more than that; it would be a spoiler, if I ever finished the story.  Just see the post on Outcome Pumps.) It was "just a story", and so I was free to play with the idea and elaborate it out logically:  C was constrained to happen, therefore B (in the past) was constrained to happen, therefore A (which led to B) was constrained to happen.

Drawing a line through one point is generally held to be dangerous. Two points make a dichotomy; you imagine them opposed to one another. But when you've got three different points—that's when you're forced to wake up and generalize.

Now I had three points:  Human intelligence, natural selection, and my fictional plot device.

And so that was the point at which I generalized the notion of an optimization process, of a process that squeezes the future into a narrow region of the possible.

This may seem like an obvious point, if you've been following Overcoming Bias this whole time; but if you look at Shane Legg's collection of 71 definitions of intelligence, you'll see that "squeezing the future into a constrained region" is a less obvious reply than it seems.

Many of the definitions of "intelligence" by AI researchers do talk about "solving problems" or "achieving goals".  But from the viewpoint of past Eliezers, at least, it is only hindsight that makes this the same thing as "squeezing the future".

A goal is a mentalistic object; electrons have no goals, and solve no problems either.  When a human imagines a goal, they imagine an agent imbued with wanting-ness—it's still empathic language.

You can espouse the notion that intelligence is about "achieving goals"—and then turn right around and argue about whether some "goals" are better than others—or talk about the wisdom required to judge between goals themselves—or talk about a system deliberately modifying its goals—or talk about the free will needed to choose plans that achieve goals—or talk about an AI realizing that its goals aren't what the programmers really meant to ask for.  If you imagine something that squeezes the future into a narrow region of the possible, like an Outcome Pump, those seemingly sensible statements somehow don't translate.

So for me at least, seeing through the word "mind", to a physical process that would, just by naturally running, just by obeying the laws of physics, end up squeezing its future into a narrow region, was a naturalistic enlightenment over and above the notion of an agent trying to achieve its goals.

It was like falling out of a deep pit, falling into the ordinary world, strained cognitive tensions relaxing into unforced simplicity, confusion turning to smoke and drifting away.  I saw the work performed by intelligence; smart was no longer a property, but an engine.  Like a knot in time, echoing the outer part of the universe in the inner part, and thereby steering it.  I even saw, in a flash of the same enlightenment, that a mind had to output waste heat in order to obey the laws of thermodynamics.

Previously, Eliezer2001 had talked about Friendly AI as something you should do just to be sure—if you didn't know whether AI design X was going to be Friendly, then you really ought to go with AI design Y that you did know would be Friendly.  But Eliezer2001 didn't think he knew whether you could actually have a superintelligence that turned its future light cone into paperclips.

Now, though, I could see it—the pulse of the optimization process, sensory information surging in, motor instructions surging out, steering the future.  In the middle, the model that linked up possible actions to possible outcomes, and the utility function over the outcomes.  Put in the corresponding utility function, and the result would be an optimizer that would steer the future anywhere.
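As a rough illustration of that loop (a toy sketch only, not a reconstruction of any actual design; the names `choose_action`, `model`, and `utility` are hypothetical), here is the "model plus utility function" core in Python. The model is assumed to map each action to a probability distribution over outcomes, and the optimizer simply picks the action with the highest expected utility. Holding the model fixed and swapping the utility function changes where the future gets steered.

```python
from typing import Callable, Dict, Hashable

Outcome = Hashable
Action = Hashable


def choose_action(
    model: Dict[Action, Dict[Outcome, float]],
    utility: Callable[[Outcome], float],
) -> Action:
    """Return the action whose predicted outcomes have the highest expected utility."""

    def expected_utility(action: Action) -> float:
        # Weight the utility of each predicted outcome by its probability.
        return sum(p * utility(outcome) for outcome, p in model[action].items())

    return max(model, key=expected_utility)


# Toy model, invented for illustration: two actions, each with uncertain outcomes.
model = {
    "build_paperclip_factory": {"paperclips": 0.9, "nothing": 0.1},
    "plant_garden": {"flowers": 0.8, "nothing": 0.2},
}


def paperclip_utility(outcome: Outcome) -> float:
    return 1.0 if outcome == "paperclips" else 0.0


def flower_utility(outcome: Outcome) -> float:
    return 1.0 if outcome == "flowers" else 0.0


# Same model, different utility function, different choice.
print(choose_action(model, paperclip_utility))
print(choose_action(model, flower_utility))
```

Nothing in `choose_action` knows or cares what the utility function rewards; that indifference is the point of the sketch.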

Up until that point, I'd never quite admitted to myself that Eliezer1997's AI goal system design would definitely, no two ways about it, pointlessly wipe out the human species.  Now, however, I looked back, and I could finally see what my old design really did, to the extent it was coherent enough to be talked about.  Roughly, it would have converted its future light cone into generic tools—computers without programs to run, stored energy without a use...

...how on Earth had I, the fine and practiced rationalist, how on Earth had I managed to miss something that obvious, for six damned years?

That was the point at which I awoke clear-headed, and remembered; and thought, with a certain amount of embarrassment:  I've been stupid.

To be continued.

[-]J3150

I guess these "how stupid I have been" posts are a welcome change to the "how smart I am" posts.

rolls eyes

I've been stupid.

More generally, I'd like to see Overcoming Bias bloggers writing more about their current biases, either ones they struggle against, though not always successfully; or ones they have decided to surrender to.

Roko:

So allow me to object: not all configurations of matter worthy of the name "mind" are optimization processes. For example, my mind doesn't implement an optimization process as you have described it here.

I would actually say the opposite: Not all optimisation processes are worthy of the name "mind". Furthermore, your mind (I hope!) does indeed try to direct the future into certain limited supersets which you prefer. Unfortunately, you haven't actually said why you object to these things.

My problem with this post is simply that, well... I don't see what the big deal is. Maybe this is because I've always thought about AI problems in terms of equations and algorithms.

So do you now think that engineers can create a "Completely Alien Mind Design"? Do you have a feasible CAMD yourself?

I don't know if Eliezer2002 invented this reply on his own, or if he read it somewhere else. What about the concept of "optimization process"? Did you come to that idea yourself, or read about it elsewhere?

Writing fiction is a really useful tool for biting philosophical bullets. You can consider taboo things in a way your brain considers "safe", because it's just fiction, after all.

Eliezer, if you have time writing your book, one thing I'd really like to see is some sort of "Poor Richard's Almanack" style terse list of rationalist aphorisms. You've generated many, but have you collected them?

People could memorize them like SF geeks memorize the "litany against fear" from Dune ;-)

I guess these "how stupid I have been" posts are a welcome change to the "how smart I am" posts. I personally find the "how stupid I have been" posts useful because they demonstrate one path from stupid to smart, which is useful when knowing that I will probably run into similar realizations in the future. But I learn a lot more from the "how smart I am" posts because... well, I'm not going to learn much by seeing that someone else made mistakes similar to the ones I used to (or still do) make, without seeing what they do about it instead. This post wouldn't mean much to me without having actually learned about optimization processes, or knowing what the Outcome Pump was, etc. Like Eliezer said - "This may seem like an obvious point, if you've been following Overcoming Bias this whole time..." In other words - if you haven't read all the "how smart I am" posts, the "how stupid I have been" posts won't be nearly as useful.

That said... I do find myself in more suspense waiting for the next post in this series than the average post, though I suspect that's due more to the story-like nature of it than the actual material. And really, I don't know that I can say I look forward to the next installment in this series all that much more than posts in other long series like the quantum series or the series on words.

Bertrand Russell felt that such thought processes are native to humans:

What a man believes upon grossly insufficient evidence is an index into his desires -- desires of which he himself is often unconscious. If a man is offered a fact which goes against his instincts, he will scrutinize it closely, and unless the evidence is overwhelming, he will refuse to believe it. If, on the other hand, he is offered something which affords a reason for acting in accordance to his instincts, he will accept it even on the slightest evidence. The origin of myths is explained in this way.

Perhaps any reasoning one readily accepts is evidence of bias, and bears deeper examination. Could this be the value of educated criticism, the willingness of others to "give it to me straight", the impetus to fight against the unconscious tendencies of intelligence?

Roko, will you please exhibit a mind that you believe is not a utility maximizer? I am having trouble imagining one. For example, I consider a mind that maximizes the probability of some condition X coming to pass. Well, that is a utility maximizer in which possible futures satisfying condition X have utility 1 whereas the other possible futures have utility 0. I consider a mind produced by natural selection, e.g., a mammalian mind or a human mind. Well, I see no reason to believe that that mind is not a utility maximizer with a complicated utility function that no one can describe completely, which to me is a different statement than saying the function does not exist.

Roko

Shane: Furthermore, your mind (I hope!) does indeed try to direct the future into certain limited supersets which you prefer.

Yes, it does. But I think we have to distinguish between "an agent who sometimes acts so as to produce a future possible world which is in a certain subset of possible states" and "an agent who has a utility function and who acts as an expected utility maximizer with respect to that utility function". The former is applicable to any intelligent agent, the latter is not. Yes, I am aware of the expected utility theorem of von Neumann and Morgenstern, but I think that decision theory over a fixed set of possible world states and a fixed language for describing properties of those states is not applicable to situations where, due to increasing intelligence, that fixed set of states quickly becomes outmoded. This really deserves a good, thorough post of its own, but you can get some idea of what I am trying to say by reading "ontologies, approximations and fundamentalists".

Unfortunately, you haven't actually said why you object to these things.

So, my first objection, stated more clearly, says that we can usefully consider agents who are not expected utility maximizers. Clearly there are agents who aren't expected utility maximizers. It strikes me as dangerous to commit to building a superintelligent utility maximizer right now. I have my reasons for not liking utility maximizing agents; other people have their reasons for liking them, but at least let us keep the options open.

My second objection requires no further justification, and my third is really the same as the above: let us keep our options a bit more open.

Roko

Richard: Roko, will you please exhibit a mind that you believe is not a utility maximizer?

Consider the following toy universe U, which has 2 possible states - A and B, and where time is indexed by the natural numbers. The following pseudo-code does not embody a utility maximizing agent:

10: motor-output{A}
20: motor-output{B}
30: GOTO{10}

The following agent

10: motor-output{A}
20: END

does embody a utility maximizer with utility function U(A) = 1, U(B) = 0

Roko, why not:

U( alternating A and B states ) = 1
U( everything else ) = 0
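To make the exchange concrete, here is a small Python sketch (an illustration, not anything either commenter wrote). It simulates the looping agent from the pseudo-code above for a few steps and scores the resulting history with an indicator utility of the kind proposed in the reply. The function names are hypothetical, and the reading of "alternating" as "no two consecutive states are equal, in either phase" is an assumption.

```python
from itertools import cycle, islice


def looping_agent():
    """The first toy agent: output A, then B, then jump back and repeat forever."""
    return cycle("AB")


def alternating_utility(history):
    """Indicator utility over whole histories: 1 if every step differs from the last."""
    alternates = all(a != b for a, b in zip(history, history[1:]))
    return 1 if alternates else 0


# Run the agent for six time steps and score the history it produces.
history = tuple(islice(looping_agent(), 6))
print(history, alternating_utility(history))

# A history that repeats a state anywhere scores zero.
print(alternating_utility(tuple("AABABA")))
```

Note that this utility is defined over histories rather than over single states of the toy universe, which is the distinction the thread goes on to argue about.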

"I've been stupid".

Come now. It's fine to realise you've made a mistake. But in itself this does not make you as smart as a Protector.

I guess these "how stupid I have been" posts are a welcome change to the "how smart I am" posts.
They're just another variation on that theme. The underlying assumption is that we can learn more from Eliezer's old mistakes, than from anyone else's current thinking.

Shane: "Roko, why not"

Let me make Shane's reply more formal, so that Roko has something concrete to attack.

I did not have time to learn how to indent things on this blog, so I use braces to indicate indentation and semicolon to indicate the start of a new line.

Let state be a vector of length n such that for every integer time, (state[time] == A) or (state[time] == B).

U(state) == (sum as i goes from 0 to n in steps of 2) {2 if (state[i] == B) and (state[i+1] == A); 1 if (state[i] == B) or (state[i+1] == A); 0 otherwise}
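For readers who want to poke at this definition, here is a literal Python transcription (a sketch under assumptions: the vector length is taken to be even, the sum is taken over non-overlapping pairs starting at index 0, and the helper names are hypothetical). It brute-forces every history of length 4 to see which one the function prefers.

```python
from itertools import product


def pair_score(first, second):
    """Score one (state[i], state[i+1]) pair exactly as the definition is written."""
    if first == "B" and second == "A":
        return 2
    if first == "B" or second == "A":
        return 1
    return 0


def U(state):
    """Sum pair scores over i = 0, 2, 4, ...; assumes len(state) is even."""
    return sum(pair_score(state[i], state[i + 1]) for i in range(0, len(state), 2))


# Enumerate every possible history of length 4 over the two states.
histories = list(product("AB", repeat=4))
best = max(histories, key=U)
print(best, U(best))

# The history the looping agent produces when it starts with A.
print(U(("A", "B", "A", "B")))
```

Taken literally, the pair that scores 2 is (B, A), so under this transcription the phase of the alternation matters: a history starting with B is the maximum, while one starting with A scores zero.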

If I remember correctly this mostly happened about a year earlier. I remember my intense relief when CAFAI came out in late 2000 or early 2001 in any event.

Eliezer, why do you call this awakening "naturalistic"? I don't see where your previous view was not "naturalistic".

I saw a dichotomy between them, the blindness of natural selection and the lookahead of intelligent foresight, [...] yet it was natural selection that created human intelligence, so that our brains, though not our thoughts, are entirely made according to the signature of natural selection.

Humans are the product of choices by intelligent agents. It would indeed be a shattering insight to discover that "blind" forces forged humanity - but that's not how it happened; the agents responsible possessed both vision and foresight - and were not "blind" in any reasonable sense of the word. See: http://alife.co.uk/essays/evolution_sees/

it drives me up the wall when people lump together natural selection and intelligence-driven processes as "evolutionary"

That's perfectly correct, according to the definition of evolution. Evolution is about variation and selection in populations of entities. There is no specification that variation should be random - or that selection should be unthinking. Evolution thus includes intelligent design among its fundamental mechanisms, by its very definition. For example, genetic engineering is a type of evolution. Check any evolution textbook for the definition of evolution.

@ shane: I was specifically talking about utility functions from the set of states of the universe to the reals, not from spacetime histories. Using the latter notion, trivially every agent is a utility maximizer, because there is a canonical embedding of any set X (in this case the set of action-perception pair sequences) into the set of functions from X to R. I'm attacking the former notion - where the domain of the utility function is the set of states of the universe.

Roko,

Who advocates that? Standard frameworks talk about world-histories, e.g. Omohundro's paper, which you use a lot. A hedonistic utilitarian wouldn't value a single state (a snapshot in time) of the universe, since experience of pleasure and the like are processes that take time to occur.

Nontransitive preferences don't translate to a utility function, and it would seem that a mind can have nontransitive preferences. Therefore, not all minds are utility maximizers.

(Does that make sense?)
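One way to check the first half of that claim on a small case is the following Python sketch (an illustration under assumptions: preferences are strict, given as hypothetical `(better, worse)` pairs over three options). It searches every possible ranking for a utility assignment that agrees with all the stated preferences. A cyclic preference set has none.

```python
from itertools import permutations


def representable(options, prefers):
    """True if some strict ranking gives u[better] > u[worse] for every stated pair."""
    for ranking in permutations(options):
        # Earlier in the ranking means higher utility.
        u = {option: -rank for rank, option in enumerate(ranking)}
        if all(u[better] > u[worse] for better, worse in prefers):
            return True
    return False


transitive = [("A", "B"), ("B", "C"), ("A", "C")]
cyclic = [("A", "B"), ("B", "C"), ("C", "A")]  # A over B, B over C, C over A

print(representable("ABC", transitive))
print(representable("ABC", cyclic))
```

Trying only strict rankings is enough here, because ties cannot help satisfy a strict preference. Whether real minds have such cycles is a separate, empirical question that the sketch does not address.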

@ carl: perhaps I should have checked through the literature more carefully. Can you point me to any other references on ethics using world-history utility functions with domain {world histories} ?

Roko, not all minds are good optimizers, not everything with a mind has everything within it proceeding according to optimization, and humans in particular are hacks. I think it should be obvious that I do not regard humans as expected utility maximizers, all bloggings considered, and I've written before about structurally difficult Friendly AI problems that are hard to interpret in terms of EU maximizing (e.g. "preserve individual self-determination"). Still, the insight is the insight.

Come now. It's fine to realise you've made a mistake. But in itself this does not make you as smart as a Protector.

If I'm as smart as Larry Niven, I'm as smart as a Protector. Vinge's Law: No character can be realistically depicted as being qualitatively smarter than the author. Next you'll be telling me that I'm not as smart as that over-enthusiastic child, Kimball Kinnison.

Eliezer,

What did you think of Blindsight (Peter Watts)? Pretty much the entire book is a depiction of humans or aliens much smarter than the author. (Myself, I enjoyed the story quite a bit, but wasn't impressed by the philosophizing about consciousness, which was rather trite and rang true not at all.)

Expected utility maximization is a powerful framework for modelling all intelligent agents - including humans.

Expected utility maximization as a framework is about as powerful as folk psychology. They both break down when you actually need to understand the inner workings of an intelligent agent, AI or human.

It seems a bit like complaining that microeconomics breaks down at the cellular level. Uh huh, but that's not the level at which microeconomics is intended to act as an explanatory framework.

Julian: "[O]ne thing I'd really like to see is some sort of 'Poor Richard's Almanack' style terse list of rationalist aphorisms. You've generated many, but have you collected them? [new graf] People could memorize them [...]"

Aren't the "Twelve Virtues" good enough?

[anonymous]

EY kid,

I know you're deleting my posts, but you will see it first.

The actual true definition of 'intelligence' is the ability to form effective representations of the environment. My definition encompasses yours, but my definition is more general.

Don't say I didn't tell you so...

"It seems a bit like complaining that microeconomics breaks down at the cellular level. Uh huh, but that's not the level at which microeconomics is intended to act as an explanatory framework."

All the issues of how future AIs will actually perform in the real world depend on how far they diverge from utility maximizers. If they don't, you'll get paper clippers; if they do, they'll be more error-prone and human-like, and less likely to hard take off (due to error-proneness).

Your comment struck me, as someone interested in the nuts and bolts of AI and also the future of the world, as someone saying to a bunch of quantum physicists, "Newtonian Dynamics is a really powerful framework". Which it is, but not a useful statement to make to a few quantum physicists. As most people, at the moment, are interested in prediction of divergence from utility maximizing and creation of AI, your statement was also not so helpful to the general discussion of intelligent agents, IMO.

All the issues of how future AIs will actually perform in the real world depend on how far they diverge from utility maximizers.

That seems highly inaccurate to me. AIs will more closely approximate rational utilitarian agents than current organisms - so the expected utility maximisation framework will become a better predictor of behaviour as time passes.

Obviously, the utility function of AIs will not be to produce paper clips.

Roko: Well, my thesis would be a start :-) Indeed, pick up any text book or research paper on reinforcement learning to see examples of utility being defined over histories.

If I'm as smart as Larry Niven, I'm as smart as a Protector. Vinge's Law: No character can be realistically depicted as being qualitatively smarter than the author.

Ridiculous. Is it not written "Anyone can find the right answer in thirty years"? Well then, anyone can find the right answer in the time it takes to write a book, and then portray a smarter character finding it in two seconds, sufficiently fast to kill their enemies.

Rolf, that was Niven's claim, but that seems to me as weak a form of faked genius as having the character invent neat gadgets. I'm not going to get an aura of scary formidability off that character.

And yet you were apparently impressed by a man getting the right answer to difficult problems in a book, where he had plenty of time to think about it. Had you read about someone performing such feats of mathematical rigour in mere minutes, would that not impress?

That seems highly inaccurate to me. AIs will more closely approximate rational utilitarian agents than current organisms - so the expected utility maximisation framework will become a better predictor of behaviour as time passes.

The AI that I think humanity is likely to produce first will be a mass of hacks that work, that also hacks itself in a manner that works. There will be masses of legacy code, that it finds hard to get rid of, much as humans find ideas we have relied upon for reasoning for a long time hard to get rid of, if we can at all.

This isn't because I think that we should build human-like machines, but because only those can win in the real world. There is no neat clean way of specifying a utility maximizer that eventually always wins, without infinite computing resources and supposing the computation done has no effect on the outside world. So we and other intelligent agents have to take mental short cuts, guess, make mistakes, get stuck in psychological cul-de-sacs. While AIs might up the number of ideas they play with to avoid those traps, it would be a trade off with looking at the links between ideas more thoroughly. For example you could devote more memory and processing time to finding cross correlations between inputs 1 to 1 million and the acquisition of utility, or look at inputs 2 million to 4 million as well. Either could be the right thing to do, so another hack is needed to decide which is done.

Unless you decide to rigorously prove which is the right thing to do, but then you are using up precious processing time and resources doing that. In short I see hacks everywhere in the future, especially towards the beginning, unless you can untangle the recursive knot caused by asking the question, "How much resources should I use, deciding how much resources I should use".

Obviously, the utility function of AIs will not be to produce paper clips.

And obviously, I was referring to the single minded, focussed utility maximizer that Eliezer often uses in his discussions about AI.

The idea that superintelligences will more closely approximate rational utilitarian agents than current organisms is based on the idea that they will be more rational, suffer from fewer resource constraints, and be less prone to problems that cause them to pointlessly burn through their own resources. They will improve in these respects as time passes. Of course they will still use heuristics - nobody claimed otherwise.

I was referring to the single minded, focussed utility maximizer that Eliezer often uses in his discussions about AI.

This still sounds needlessly derogatory. Paper-clip maximisers have a dumb utility function, that's all. An expected utility maximiser is not necessarily "single minded": e.g. it may be able to focus on many things at once.

Optimisation is key to understanding intelligence. Criticising optimisers is criticising all intelligent agents. I don't see much point to doing that.

The anthropic principle is also an optimization process, different from evolution and the human mind.

The collective unconscious mind of the human population, which has created languages, for example, is also some kind of optimization process.

We should also mention science.

So in fact we live inside many mind-like processes, and often do not even mention it.

"The idea that superintelligences will more closely approximate rational utilitarian agents than current organisms is based on the idea that they will be more rational, suffer from fewer resource constraints, and be less prone to problems that cause them to pointlessly burn through their own resources."

But this in my book is not a firm basis. A firm basis requires a theory of AI. Only then can you talk about whether they will pointlessly burn through resources or not.

There are lots of things that are not obviously pointlessly burning through resources, but might still be doing so. Things like trying to prove P == NP if this happens to be unprovable either way, or modelling the future of the earth's climate without being able to take into account new anthropogenic influences (such as the change in albedo of solar cells, arcologies, or even unfolding nanoblooms).

Even spending time on this blog might be burning brain and computer cycles, perhaps we should be laying down our thoughts we want to survive in stone, in case we bomb ourselves back into the stone age.

A firm basis requires a theory of AI. Only then can you talk about whether they will pointlessly burn through resources or not.

We have the theory of evolution, we have hundreds of years of man-machine symbiosis to work from - and AI is probably now no longer terribly far off. IMHO, we have enough information to address this issue. Irrational AIs that run about in circles will sell poorly - so we probably won't build many like that.

It'd be interesting to encounter a derelict region of a galaxy where an AI had run its course on the available matter shortly before, finally, harvesting itself into the ingredients for the last handful of tools. Kind of like the Heechee stories, only with so little evidence of what had made it come to exist or why these artifacts had been produced.

Well, after it harvested itself, then the place is safe, so whichever alien race finds it - Bonanza!

I believe you have a problem with transparency here. You did not adequately link your revelation with the refutation of your previous thoughts. It may seem obvious to you that "squeezing the future into a narrow region" means that your old ideas "would have converted its future light cone into generic tools."

And, no, I do not think it is reasonable to ask me to read everything else you've ever said on this blog just to figure out the answer. Perhaps the explanation is too long for this post, but I would at least like some links.

When you say “a science-fiction story”, I am curious if it ever was finished. Is it HPMOR?