The "Outside the Box" Box

Eliezer Yudkowsky

12th Oct 2007

Cached Thoughts, Machine Learning (ML), Social & Cultural Dynamics

Whenever someone exhorts you to "think outside the box", they usually, for your convenience, point out exactly where "outside the box" is located. Isn't it funny how nonconformists all dress the same...

In Artificial Intelligence, everyone outside the field has a cached result for brilliant new revolutionary AI idea—neural networks, which work just like the human brain! New AI Idea: complete the pattern: "Logical AIs, despite all the big promises, have failed to provide real intelligence for decades—what we need are neural networks!"

This cached thought has been around for three decades. Still no general intelligence. But, somehow, everyone outside the field knows that neural networks are the Dominant-Paradigm-Overthrowing New Idea, ever since backpropagation was invented in the 1970s. Talk about your aging hippies.

Nonconformist images, by their nature, permit no departure from the norm. If you don't wear black, how will people know you're a tortured artist? How will people recognize uniqueness if you don't fit the standard pattern for what uniqueness is supposed to look like? How will anyone recognize you've got a revolutionary AI concept, if it's not about neural networks?

Another example of the same trope is "subversive" literature, all of which sounds the same, backed up by a tiny defiant league of rebels who control the entire English Department. As Anonymous asks on Scott Aaronson's blog:

"Has any of the subversive literature you've read caused you to modify any of your political views?"

Or as Lizard observes:

"Revolution has already been televised. Revolution has been *merchandised*. Revolution is a commodity, a packaged lifestyle, available at your local mall. $19.95 gets you the black mask, the spray can, the "Crush the Fascists" protest sign, and access to your blog where you can write about the police brutality you suffered when you chained yourself to a fire hydrant. Capitalism has learned how to sell anti-capitalism."

Many in Silicon Valley have observed that the vast majority of venture capitalists at any given time are all chasing the same Revolutionary Innovation, and it's the Revolutionary Innovation that IPO'd six months ago. This is an especially crushing observation in venture capital, because there's a direct economic motive to not follow the herd—either someone else is also developing the product, or someone else is bidding too much for the startup. Steve Jurvetson once told me that at Draper Fisher Jurvetson, only two partners need to agree in order to fund any startup up to $1.5 million. And if all the partners agree that something sounds like a good idea, they won't do it. If only grant committees were this sane.

The problem with originality is that you actually have to think in order to attain it, instead of letting your brain complete the pattern. There is no conveniently labeled "Outside the Box" to which you can immediately run off. There's an almost Zen-like quality to it—like the way you can't teach satori in words because satori is the experience of words failing you. The more you try to follow the Zen Master's instructions in words, the further you are from attaining an empty mind.

There is a reason, I think, why people do not attain novelty by striving for it. Properties like truth or good design are independent of novelty: 2 + 2 = 4, yes, really, even though this is what everyone else thinks too. People who strive to discover truth or to invent good designs, may in the course of time attain creativity. Not every change is an improvement, but every improvement is a change.

Every improvement is a change, but not every change is an improvement. The one who says, "I want to build an original mousetrap!", and not, "I want to build an optimal mousetrap!", nearly always wishes to be perceived as original. "Originality" in this sense is inherently social, because it can only be determined by comparison to other people. So their brain simply completes the standard pattern for what is perceived as "original", and their friends nod in agreement and say it is subversive.

Business books always tell you, for your convenience, where your cheese has been moved to. Otherwise the readers would be left around saying, "Where is this 'Outside the Box' I'm supposed to go?"

Actually thinking, like satori, is a wordless act of mind.

The eminent philosophers of Monty Python said it best of all.

[-]Robin_Hanson217y160

Eliezer is right, as usual. But it raises the question: when should you be flattered and when should you be insulted to be called "creative" or "revolutionary"?

[-]Eliezer Yudkowsky17y370

One bad algorithm I can think of is to be flattered when you're called such by other people you think are currently "creative" or "revolutionary", as opposed to people who were previously revolutionary and now mainstream. The former is how cliques form.

This as a second thought to my first reaction, which was, "Well, if Robin Hanson calls you "revolutionary" you must practically be insane."

[-]TheOtherDave14y180

Beats being _im_practically insane, I suppose.

[-]Nick_Tarleton17y60

Trying to be original may be justifiable if people will buy a NEW!! product even if it's inferior.

I appreciated your choice of examples. Conformist-nonconformism is about the most annoying thing in the world to me, in addition to making a lot of smart people useless (or worse).

[-]Aaron317y160

Eliezer is certainly correct that our real goal is to make optimal decisions and perform optimal actions, regardless of how different they are from those of the herd. But that doesn't mean we should ignore information about our conformity or non-conformity. It's often important.

Consider the hawk-dove game. If you're in a group of animals who randomly bump into each other and compete for territory, the minority strategy is the optimal strategy. If all your peers are cowards, you can completely dominate them by showing some fang. Or if your peers follow the "never back down, always fight to the death" strategy, you should be a coward until they've killed each other off. Non-conformity is a valid goal (or subgoal, at least).

On the other hand, in situations with network effects, you want to be a conformist. If you're selling your widget on Bob's Auction Site, which has 20 users, instead of eBay, your originality is simply stupid.
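The hawk-dove point above can be made concrete with a small sketch. The payoff numbers (a resource worth 2, a fight costing 10) and the function names are illustrative assumptions, not anything from the comment; the payoff matrix is the standard textbook one.

```python
def expected_payoffs(hawk_fraction, value=2.0, cost=10.0):
    """Expected payoff of playing hawk or dove against a random opponent.

    Standard hawk-dove payoffs (the numbers are illustrative assumptions):
      hawk vs hawk: (value - cost) / 2    hawk vs dove: value
      dove vs hawk: 0                     dove vs dove: value / 2
    """
    p = hawk_fraction
    hawk = p * (value - cost) / 2 + (1 - p) * value
    dove = (1 - p) * value / 2
    return hawk, dove


def best_strategy(hawk_fraction, value=2.0, cost=10.0):
    """Return the strategy with the higher expected payoff."""
    hawk, dove = expected_payoffs(hawk_fraction, value, cost)
    if hawk > dove:
        return "hawk"
    if dove > hawk:
        return "dove"
    return "either"


# Among all doves, the lone hawk wins; among all hawks, the lone dove wins.
print(best_strategy(0.0))  # hawk
print(best_strategy(1.0))  # dove
```

With these assumed numbers the two strategies break even when the hawk fraction is value / cost = 0.2, so whichever strategy is rarer than its equilibrium share is the better one to play.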

[-]Robin_Hanson217y40

Eliezer, your first and second thoughts illustrate my question; they are not clearly positive or negative descriptors. :)

[-]MichaelAnissimov17y20

Confession: I watched the Monty Python clip before reading the whole post.

[-]TGGP417y10

Much of what Eliezer talked about in the beginning is discussed in The Rebel Sell. I am actually not as disturbed by those of the "radical counterculture" as the authors, who discuss how to accomplish change as opposed to receiving recognition, because they know enough to be dangerous.

[-]Katja17y120

Boxes are always patterns completed by brains, along with ready-made outsides of them. Thinking is necessary because to find the outside of a box you have to notice the box is there, which you don't if your brain fills it in automatically. Things are less noticeable if you can't conceive of the possibility of an alternative to them.

I probably think this because my brain fills in this pattern. And I only think that (and this) because the idea of recursion is another pattern my brain enjoys filling in. An effective way to simulate originality though: actively fill in the wrong patterns. Choose an automatic response from another set of ideas. Babies are being sold on the black market? Don't automatically intone 'the police should stop that', say 'how inefficient - it should be a legal market'. If someone says we will all be dead one day, instead of reflecting on the meaning this gives to your life, politely point out that they have their statistics wrong; about 5% of people have never died, and it correlates well with those born recently. Depending on your comparative preferences for perceived originality and truth, this can be done to convince most people you are insane and possibly completely immoral: nice socially recognisable signals that you are being original without having to conform to current originality.

[-]Senthil17y20

Actually thinking, like satori, is a wordless act of mind.

Is such an act possible?

Wittgenstein said that 'Philosophy is a battle against the bewitchment of our intelligence by means of language.' I guess 'thinking' can take the place of 'philosophy' in what he said. If seen this way, the act involves a lot of struggle. Even if we do away with words it seems like something else should take its place against which we would have to battle. Or maybe, I'm thinking a lot inside the box :)

[-]ArthurHung17y10

The video at the end makes such nice closure to such a great post. Great taste. I am reminded of Jiddu Krishnamurti.

Eliezer's post here, if I am correct, is meant to make her readers question themselves if they are truly being original or if they are simply following the "other" masses. But a question, here: how many people think, actually think in their daily lives? And by "think", I mean produce truly original thought--impossible without some sort of muse. That's my current hypothesis. Going by that line of reasoning, therefore perhaps truly original thought can only be realized/ created through a true expression of the self through one's ideal (or at least some very synergetic) medium. Perhaps it is only people who reach their true potential/ at the last stage of Maslow's hierarchy who achieve original thought.

We can see from history this amounts to approximately 1% of the population: Da Vinci etc. As individuals then, perhaps the only way we can truly see more original thought from the people around us is to become an original thinker ourselves, to bring ourselves up to that level of so called "genius", which is simply produced by a persistent focus of purpose and passion to get up to that level. Then by changing ourselves, we naturally inspire others--simply by being ourselves. A wonderful thing.

Of course, this is simply speculation as I'm not at a Da Vinci/Freud/Nietzsche/Krishnamurti (prolific original output) like level--though that is my major life purpose.

[-]TGGP417y00

I would guess that it is not a state a person has to be in to come up with an original thought but a situation in which unoriginal thoughts seem obviously inapplicable to them. You can't assume because someone produced some great thought they are a separate class of person and will continue to do so. A lot of the things Lord Kelvin said about science near the end of his life seem downright silly today.

Also, Eliezer is not a "her". His wikipedia page has a picture of him, beard and all.

[-]Ed316y20

"Whenever someone exhorts you to "think outside the box", they usually, for your convenience, point out exactly where "outside the box" is located. Isn't it funny how nonconformists all dress the same..."

They do? Can you give an example? I can't recall anybody ever pointing out a location.

And NNs are independent of "general intelligence". NNs are being used to great success in many fields today. The fact that we don't have hard AI is no condemnation of NNs, nor a problem with the phrase "think outside the box". That's quite a leap you made, and I've only read 2 paragraphs so far!

[-]benny16y20

Nassim Nicholas Taleb said at one point that his next book will be about tinkering - how many discoveries were made while the researcher was seeking something else. So directed research is good because it provides an excuse to "tinker", to spot the unexpected and go off on a tangent.

Have you spoken to Taleb? Seems there's lots of common ground. He likes to learn directly from people what's happening.

[-]RobinZ15y20

P.S. The YouTube video embedded in the post has been removed. One place where the same excerpt appears is here.

Edit: Possibly better, from the Monty Python channel.

[-]MrHen14y260

The way I would word this: The box exists in the map, not the territory. Looking "outside of the box" is still looking at the map.

[-]RobinZ14y10

That is a clever way of putting it!

[-]Minds_Eye9y50

You could always tell them to think inside the chimney. If you're lucky they'll be so confused they'll look at the territory to figure out what you mean, and if you’re really lucky they'll end up thinking downstairs in the attic and never bother you again.

[-]Repenexus4y10

I would say that the box does exist as territory, as the realm of cached human thoughts. However, what most of us perceive as 'outside the box' in our maps is, in reality, 'inside the box' in territory.

[-]thomblake14y00

This cached thought has been around for three decades. Still no general intelligence. But, somehow, everyone outside the field knows that neural networks are the Dominant-Paradigm-Overthrowing New Idea, ever since backpropagation was invented in the 1970s.

It's been going strong in one form or another since the late nineteenth century. William James was a notable supporter of the notion that the human brain had emergent behavior based on the interaction of many simple units, and from this culture came the term "connectionism" that was popular amongst AI speculators Before the War.

[-]lutorm14y20

"And if all the partners agree that something sounds like a good idea, they won't do it. If only grant committees were this sane."

but then you say:

"Properties like truth or good design are independent of novelty: 2 + 2 = 4, yes, really, even though this is what everyone else thinks too."

In venture capital it may pay off to avoid doing what everyone else does. But in funding grants, it seems there's no advantage to that. It's not like the science gets devalued if it's discovered twice. If everyone thinks it's a good grant, then maybe it just is?

[-]Kingreaper14y20

It's not like the science gets devalued if it's discovered twice

If the knowledge discovered has a value X, then discovering it twice gives the discovery an average value X/2, and discovering it thrice gives the discovery an average value X/3.

This is of course a simplification, because the confirmation received from having multiple copies of the discovery is itself of some value, which flattens the value curve; however the value of a confirmation decreases with each confirmation already extant.
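The arithmetic in this comment can be sketched as follows. The geometric decay of confirmation value is an assumption chosen only to illustrate "each confirmation is worth less than the last"; the comment does not specify a curve, and the function and parameter names are hypothetical.

```python
def average_value_per_discovery(x, n, confirm_value=0.0, decay=0.5):
    """Average value credited to each of n independent discoveries.

    x             -- value of the knowledge itself (counted once)
    confirm_value -- value of the first confirmation (assumed parameter)
    decay         -- factor by which each further confirmation is worth
                     less than the previous one (assumed geometric decay)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # n discoveries yield n - 1 confirmations of the first result.
    confirmations = sum(confirm_value * decay ** k for k in range(n - 1))
    return (x + confirmations) / n


# Pure simplification from the comment: X/2, then X/3.
print(average_value_per_discovery(6.0, 2))  # 3.0
print(average_value_per_discovery(6.0, 3))  # 2.0

# With confirmations worth something, the curve flattens.
print(average_value_per_discovery(6.0, 2, confirm_value=2.0))  # 4.0
print(average_value_per_discovery(6.0, 3, confirm_value=2.0))  # 3.0
```

Under these assumptions the average still falls with every duplicate discovery, only more slowly than X/n.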

[-]FeepingCreature13y130

The eminent philosophers of Monty Python said it best of all:

This video is no longer available because the uploader has closed their YouTube account.

Deep, man.

[-]aausch13y40

The Monty Python link is stale

[-]Vladimir_Nesov13y30

Fixed.

[-]Hughdo12y10

I like how these serious logical and moral discussions are juxtaposed with Monty Python.

[This comment is no longer endorsed by its author]

[-]fractalman11y20

And now I have the urge to build a mousetrap out of as many lasers and rocket launchers as I can get my hands on...which is not, of course, the least bit optimal for the purpose of catching mice.

[-]roland6y00

Lucifer's version

[-]Luke Allen4y10

I remember the late 90's, when I first gained access to the Internet. Here were my people, people who enjoy thinking, minds communicating at a bare-metal level about interesting and smart things.

It was around that time I ran across the concept of a "free-thinker" and started mulling over that label in my mind. It sounded like a compliment, something I'd like if people started calling me that. After all, I don't think the way other people do (thanks, autism!), and I had always felt like a mind trapped in a body. But the first time I brought up being a free-thinker was in a discussion about religion with an Internet Atheist. I was promptly and patronizingly informed that I couldn't possibly be a free-thinker because I believe in God.

Oh.

Free-thinker = atheist, apparently. A one-to-one correspondence, a synonym, and a hope for esteem from my peers crushed.

Never mind that I treat the Bible and young-Earth creationism as seriously and geekily as I treat the canons of the various Star Trek series. Never mind that I try to get past the rah-rah-our-team side of religion to follow Jesus' commands to love each other with radical, boundary-breaking see-from-their-eyes empathy. Never mind that I'd been hurt by church hypocrisy as much as any former-Catholic or raised-Baptist Internet Atheist among my circle of friends.

No, this badge of uniqueness was not for me. I was too unique for it.

[-]Teerth Aloke4y10

And now? Do you still believe in an all-powerful creator? (Not that I have any problem with that)

[-]Luke Allen4y10

Yes, and still a young-Earth creationist too. On here I'd probably clarify my concept of omnipotency as "axiomatic ultra-ability", more similar to a programmer of a simulation than a lightning-tosser in a cloud-chariot in the sky.

As a geek-for-life and dedicated devourer of SF, I compare and contrast the details of what I believe with all the god-fictions out there, from Aslan and Eru Ilúvatar to Star Trek's Q and The Prophets, to the God and Satan of Heinlein's Job, to the Anu/Padomay duality at the core of Elder Scrolls lore and the consequent universe literally built out of politics and necromancy. Recently, reading the SSC classic blog post "Meditations on Moloch" helped me coalesce an idea that had been bouncing around my head for twenty years about the "weakling, uncaring opposite of God, waiting with an open mouth at the bottom of the slide."

I just wanted to find a community of experimental theologists who were as willing as I am to ask these questions and posit potentially heretical theories during the process of trying to better model God in our words and minds. Apparently I'm missing an absurdity heuristic that keeps more people from being like me.

[-]Repenexus4y10

So, essentially, the way forward is to attempt to make something 'good' rather than something 'original'. Because of cached thoughts leading all forced 'original' thoughts into truly unoriginal thoughts, the only way to make something truly 'original' is to make something new, not through the attempt of making something new (which would lead you in circles), but to make something better than the rest. By trying to make something better than the rest, it has to be markedly different from everything else.

[-]the gears to ascension2y280

This intro aged very, very poorly. I suspect that the core point of this article may be much weaker than originally claimed because of it. Simply constraining your thinking to think outside the box, but then reaching immediately outside the box and not reaching further, is likely a reasoning error. But constraining your thinking to outside the box, then reaching for what is immediately outside it, made Eliezer pick up neural networks, which he then immediately dismissed as not likely to work because so many people had done this. He managed to not see AlphaGo coming, and I have always suspected it was as a result of this article's point in particular that the AI safety crowd were blindsided by neural networks. I think this is a pretty severe prediction error and that this post is likely an incorrect point because of it. Interesting disagreement about how to interpret this historical information would be quite welcome.

[-]Vaniver2y60

I think it's not the case that "neural networks" as discussed in this post made AlphaGo. That is, almost all of the difficulty in making AlphaGo happen was picking which neural network architecture would solve the problem / buying fast enough computers to train it in a reasonable amount of time. A more recent example might be something like "model-based reinforcement learning"; for many years 'everyone knew' that this was the next place to go, while no one could write down an algorithm that actually performed well.

I think the underlying point--if you want to think of new things, you need to think original thoughts instead of signalling "I am not a traditionalist"--is broadly correct even if the example fails.

That said, I agree with you that the example seems unfortunately timed. In 2007, some CNNs had performed well on a handful of tasks; the big wins were still ~4-5 years in the future. If the cached wisdom had been "we need faster computers," I think the cached wisdom would have looked pretty good.

[-]TurnTrout1y140

I worry that this comment dances around the basic update to be made.

Part of this post makes fun of people who were excited about neural networks. Neural network-based approaches have done extremely well. Eliezer's example wasn't just "unfortunately timed." Eliezer was wrong.

(Edited "This post" -> "Part of this post")

[-]Vaniver1y50

I think that's a pretty simplistic view of the post, but given that view, I agree that's the right update to make.

Why does it seem simplistic? Like, one of the central points of the post you link is that we should think about the specific technical features of proposals, instead of focusing on marketing questions of which camp a proposal falls into. And Eliezer saying he's "no fan of neurons" is in the context of him responding to a comment by someone with the username Marvin Minsky defending the book Perceptrons (the post is from the Overcoming Bias era, when comments did not have threading or explicit parents).

I basically read this as Eliezer making fun of low-nuance people, not people excited about NNs; in that very post he excitedly describes a NN-based robotics project!

[-]DirectedEvolution1y130

But that robotics project was viewed by Eliezer as an example of carefully-designed biological imitation in which the mechanism of action was known by the researchers into the deep details. Across multiple posts, Eliezer's views from this time period emphasize that he believes that AGI can only come from a well-understood AI architecture - either a detailed imitation of the brain, or a crafted logic-based approach. This robotics project was an example of the latter, despite the fact that it used neurons.

This robot ran on a "neural network" built by detailed study of biology. The network had twenty neurons or so. Each neuron had a separate name and its own equation. And believe me, the robot's builders knew how that network worked.
Where does that fit into the grand dichotomy? Is it top-down? Is it bottom-up? Calling it "parallel" or "distributed" seems like kind of a silly waste when you've only got 20 neurons - who's going to bother multithreading that?

So this would be, in my view, another clear example of Eliezer being excited about an AI paradigm that ultimately did not lead to the black-box neural network-based LLMs that actually seem to have put us on the path to AGI.

[-]TurnTrout1y20

I think that's a pretty simplistic view of the post

To clarify, I wasn't claiming that the point of this post is to mock neural network proponents. It's not. It's just a few paragraphs of the post. Updated original comment to clarify.

And Eliezer saying he's "no fan of neurons" is in the context of him responding to a comment by someone with the username Marvin Minsky defending the book Perceptrons (the post is from the Overcoming Bias era, when comments did not have threading or explicit parents).

Can you say more why you think that context is relevant? He says "this may be clearer from other posts", which implies to me that his "not being a fan of neurons" is not specific to that specific discussion (since I imagine he wrote those other posts independently of Marvin_Minsky's comment).

(I have more things to say in response to your comment here, but I'd like to hear your answer to the above first!)

[-]Vaniver1y20

Can you say more why you think that context is relevant?

Yeah; from my perspective the main question here is something like "how much nuance does a statement have, and what does that imply about how far you can draw inferences from it?". I think people are often rounding Eliezer off to a simplified model and then judging the simplified model's predictions and then attributing that judgment to Eliezer, in a way that I think is probably inaccurate.

For this particular point, there's also the question of what a "fan of neurons" even is; the sorts you see today are pretty different from the sorts you would see back in 2010, and different from the sort that Marvin Minsky would have seen.

Not as relevant to the narrow point, but worth pointing out somewhere, is that I'm pretty sure that even if Eliezer had been aware of the potential of modern ANNs ahead of time, I think he probably would have filtered that out of his public speech because of concerns about the alignability of those architectures, in a way that makes it not obvious how to count predictions. [Of course he can't get any points for secretly predicting it without hashed comments, but it seems less obvious that he should lose points for not predicting it.]

[-]TurnTrout1y110

Thanks for the additional response. I've thought through the details here as well. I think that the written artifacts he left are not the kinds of writings left by someone who actually thinks neural networks will probably work, capabilities-wise.

As you read through these collected quotes, consider how strongly "he doesn't expect ANNs to work" and "he expects ANNs to work" predict each quote:

In Artificial Intelligence, everyone outside the field has a cached result for brilliant new revolutionary AI idea—neural networks, which work just like the human brain! New AI Idea: complete the pattern: "Logical AIs, despite all the big promises, have failed to provide real intelligence for decades—what we need are neural networks!"
This cached thought has been around for three decades. Still no general intelligence. But, somehow, everyone outside the field knows that neural networks are the Dominant-Paradigm-Overthrowing New Idea, ever since backpropagation was invented in the 1970s. Talk about your aging hippies.
...

I'm no fan of neurons; this may be clearer from other posts
...

But there is just no law which says that if X has property A and Y has property A then X and Y must share any other property. "I built my network, and it's massively parallel and interconnected and complicated, just like the human brain from which intelligence emerges! Behold, now intelligence shall emerge from this neural network as well!" And nothing happens. Why should it?
...

Wasn't it in some sense reasonable to have high hopes of neural networks? After all, they're just like the human brain, which is also massively parallel, distributed, asynchronous, and -
Hold on. Why not analogize to an earthworm's brain, instead of a human's?
A backprop network with sigmoid units... actually doesn't much resemble biology at all. Around as much as a voodoo doll resembles its victim. The surface shape may look vaguely similar in extremely superficial aspects at a first glance. But the interiors and behaviors, and basically the whole thing apart from the surface, are nothing at all alike. All that biological neurons have in common with gradient-optimization ANNs is... the spiderwebby look.
And who says that the spiderwebby look is the important fact about biology? Maybe the performance of biological brains has nothing to do with being made out of neurons, and everything to do with the cumulative selection pressure put into the design.

Do these strike you as things which could plausibly be written by someone who actually anticipated the modern revolution?

there's also the question of what a "fan of neurons" even is; the sorts you see today are pretty different from the sorts you would see back in 2010, and different from the sort that Marvin Minsky would have seen.

If Eliezer wasn't a fan of those particular ANNs, in 2010, because those literal empirically tried setups hadn't yet led to AGI... That's an uninteresting complaint. It's trivial. ANN proponents also wouldn't anticipate AGI from already-tried experiments which had already failed to produce AGI.

The interesting version of the claim is the one which talks about research directions, no? About being excited about neural network research in terms of its future prospects?

I'm pretty sure that even if Eliezer had been aware of the potential of modern ANNs ahead of time, I think he probably would have filtered that out of his public speech because of concerns about the alignability of those architectures

In the world where he was secretly aware, he could have pretended to not expect much of ANNs. In that case, that's dishonest. Also risky, it's possibly safer to just not bring it up and not direct even more attention to the matter. If you think that X is a capabilities hazard, then I think a good rule of thumb is don't talk about X.

So, even privileging this "he secretly knew" hypothesis by considering it explicitly, it isn't predicting observed reality particularly strongly, since "don't talk about it at all" is another reasonable prediction of that hypothesis, and that didn't happen.

in a way that makes it not obvious how to count predictions.

Let's consider what incentives we want to set up. We want people who can predict the future to be recognized and appreciated, and we want people who can't to be taken less seriously in such domains. We do not want predictions to communicate sociohazardous content.

For sociohazards like this, hashed comments should suffice quite well, for this kind of problematic prediction. You can't fake it if you can't predict it in advance. If you can predict it in advance, you can still get credit, without leaking much information.

I am therefore (hopefully predictably) unimpressed by hypotheses around secret correct predictions which clash with his actual public writing, unless he had verifiably contemporary predictions which were secret but correct.
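For readers unfamiliar with "hashed comments", the idea is a commit-and-reveal scheme: publish a hash of the prediction now, reveal the text later. A minimal sketch, assuming SHA-256 and a random nonce; the function names and the example prediction string are hypothetical, and nothing here describes a mechanism anyone in the thread actually used.

```python
import hashlib
import hmac
import secrets


def _digest(nonce: str, prediction: str) -> str:
    """SHA-256 over the nonce and the prediction text."""
    return hashlib.sha256(f"{nonce}:{prediction}".encode("utf-8")).hexdigest()


def commit(prediction: str) -> tuple[str, str]:
    """Return (digest, nonce). Publish the digest; keep the nonce private.

    The random nonce stops readers from guessing short predictions by
    hashing candidate sentences themselves.
    """
    nonce = secrets.token_hex(16)
    return _digest(nonce, prediction), nonce


def verify(digest: str, nonce: str, prediction: str) -> bool:
    """Check a revealed prediction against the digest published earlier."""
    return hmac.compare_digest(_digest(nonce, prediction), digest)


# Usage: post `published` today, reveal `nonce` and the text years later.
published, nonce = commit("hypothetical prediction text")
print(verify(published, nonce, "hypothetical prediction text"))  # True
print(verify(published, nonce, "a different claim"))             # False
```

The published digest reveals essentially nothing about the content, yet it cannot be matched after the fact by a prediction that was not written in advance.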

[Of course he can't get any points for secretly predicting it without hashed comments, but it seems less obvious that he should lose points for not predicting it.]

Conservation of expected evidence. If you would have updated upwards on his predictive abilities if he had made hashed comments and then revealed them, then observing not-that makes you update downwards (eta - on average, with a few finicky details here that I think work out to the same overall conclusion; happy to discuss if you want).
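A small numeric sketch of conservation of expected evidence, with made-up probabilities. H stands for "has good predictive ability" and E for "published a hashed prediction that later checked out"; both labels and all the numbers are assumptions for illustration only.

```python
def posteriors(prior, p_e_given_h, p_e_given_not_h):
    """Return (P(E), P(H|E), P(H|not E)) by Bayes' rule."""
    p_e = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    post_if_e = prior * p_e_given_h / p_e
    post_if_not_e = prior * (1 - p_e_given_h) / (1 - p_e)
    return p_e, post_if_e, post_if_not_e


prior = 0.5
p_e, up, down = posteriors(prior, p_e_given_h=0.4, p_e_given_not_h=0.1)

# Seeing E raises the estimate, so not seeing E must lower it.
print(up > prior > down)  # True

# The probability-weighted average of the two posteriors is the prior.
expected_posterior = p_e * up + (1 - p_e) * down
print(abs(expected_posterior - prior) < 1e-12)  # True
```

How far the estimate drops on not seeing E depends on how likely E was to begin with: a rare E produces a large upward update when observed and only a small downward one when absent.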

[EDITed out a final part for now]

[-]Vaniver1y80

Do these strike you as things which could plausibly be written by someone who actually anticipated the modern revolution?

I do not think I claimed that Eliezer anticipated the modern revolution, and I would not claim that based on those quotes.

The point that I have been attempting to make since here is that 'neural networks_2007', and the 'neural networks_1970s' Eliezer describes in the post, did not point to the modern revolution; in fact other things were necessary. I see your point that this is maybe a research taste question--even if it doesn't point to the right idea directly, does it at least point there indirectly?--to which I think it is evidence against Eliezer's research taste (on what will work, not necessarily on what will be alignable).

[I also have long thought Eliezer's allergy to the word "emergence" is misplaced (and that it's a useful word while thinking about dynamical systems modeling in a reductionistic way, which is a behavior that I think he approves of) while agreeing with him that I'm not optimistic about people whose plan for building intelligence doesn't route thru them understanding what intelligence is and how it works in a pretty deep way.]

Conservation of expected evidence. If you would have updated upwards on his predictive abilities if he had made hashed comments and then revealed them, then observing not-that makes you update downwards (eta - on average, with a few finicky details here that I think work out to the same overall conclusion; happy to discuss if you want).

I agree with regards to Bayesian superintelligences but not bounded agents, mostly because I think this depends on how you do the accounting. Consider the difference between scheme A, where you transfer prediction points from everyone who didn't make a correct prediction to people who did make correct predictions, and scheme B, where you transfer prediction points from people who make incorrect predictions to people who make correct predictions, leaving untouched people who didn't make predictions. On my understanding, things like logical induction and infrabayesianism look more like scheme B.
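A toy sketch of the two accounting schemes, to make the difference concrete. The `settle` function, the fixed one-point stake, and the even split among correct predictors are all hypothetical simplifications; this is not how logical induction or infrabayesianism actually score anything.

```python
def settle(outcomes, charge_abstainers, stake=1.0):
    """Return each forecaster's change in prediction points.

    outcomes maps a name to "correct", "incorrect", or "abstain".
    charge_abstainers=True is scheme A; False is scheme B.
    """
    winners = [name for name, o in outcomes.items() if o == "correct"]
    payers = [
        name
        for name, o in outcomes.items()
        if o == "incorrect" or (charge_abstainers and o == "abstain")
    ]
    deltas = {name: 0.0 for name in outcomes}
    if not winners or not payers:
        return deltas  # nobody to pay, or nobody to collect from
    pot = stake * len(payers)
    for name in payers:
        deltas[name] -= stake
    for name in winners:
        deltas[name] += pot / len(winners)
    return deltas


# Hypothetical forecasters, for illustration only.
outcomes = {"alice": "correct", "bob": "incorrect", "carol": "abstain"}
scheme_a = settle(outcomes, charge_abstainers=True)
scheme_b = settle(outcomes, charge_abstainers=False)
```

Under scheme A the abstainer loses a point, while under scheme B she is left untouched. That is the whole disagreement about whether someone should lose points for not predicting.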

[-]TurnTrout1y40

I do not think I claimed that Eliezer anticipated the modern revolution, and I would not claim that based on those quotes.
The point that I have been attempting to make since here is that 'neural networks_2007', and the 'neural networks_1970s' Eliezer describes in the post, did not point to the modern revolution; in fact other things were necessary.

I apologize if I have misunderstood your intended point. Thanks for the clarification. I agree with this claim (insofar as I understand what the 2007 landscape looked like, which may be "not much"). I think that the claim is not that interesting, though, but this might be coming down to semantics.

The following is what I perceived us to disagree on, so I'd consider us to be in agreement on the point I originally wanted to discuss:

I see your point that this is maybe a research taste question--even if it doesn't point to the right idea directly, does it at least point there indirectly?--to which I think it is evidence against Eliezer's research taste (on what will work, not necessarily on what will be alignable).

I'm not optimistic about people whose plan for building intelligence doesn't route thru them understanding what intelligence is and how it works in a pretty deep way

Yeah. I think that in a grown-up world, we would do this, and really take our time.

On my understanding, things like logical induction and infrabayesianism look more like scheme B.

Nice, I like this connection. Will think more about this, don't want to hastily unpack my thoughts into a response which isn't true to my intuitions here.

[-]Unnamed1y40

I was recently looking at Yudkowsky's (2008) "Artificial Intelligence as a Positive and Negative Factor in Global Risk" and came across this passage which seems relevant here:

Friendly AI is not a module you can instantly invent at the exact moment when it is first needed, and then bolt on to an existing, polished design which is otherwise completely unchanged.

The field of AI has techniques, such as neural networks and evolutionary programming, which have grown in power with the slow tweaking of decades. But neural networks are opaque—the user has no idea how the neural net is making its decisions—and cannot easily be rendered unopaque; the people who invented and polished neural networks were not thinking about the long-term problems of Friendly AI. Evolutionary programming (EP) is stochastic, and does not precisely preserve the optimization target in the generated code; EP gives you code that does what you ask, most of the time, under the tested circumstances, but the code may also do something else on the side. EP is a powerful, still maturing technique that is intrinsically unsuited to the demands of Friendly AI. Friendly AI, as I have proposed it, requires repeated cycles of recursive self-improvement that precisely preserve a stable optimization target.

The most powerful current AI techniques, as they were developed and then polished and improved over time, have basic incompatibilities with the requirements of Friendly AI as I currently see them. The Y2K problem—which proved very expensive to fix, though not global-catastrophic—analogously arose from failing to foresee tomorrow’s design requirements. The nightmare scenario is that we find ourselves stuck with a catalog of mature, powerful, publicly available AI techniques which combine to yield non-Friendly AI, but which cannot be used to build Friendly AI without redoing the last three decades of AI work from scratch.

[-]paulfchristiano1y80

If the cached wisdom had been "we need faster computers," I think the cached wisdom would have looked pretty good.

If you think neural networks are like brains, you might think that you would get human-like cognitive abilities at human-like sizes. I think this was a very common view (and it has aged quite well IMO).

[-]Vaniver1y20

Agreed, tho I think Eliezer disagrees?

[-]jacob_cannell1y90

I can't believe that post is sitting at 185 karma considering how it opens with a complete blatant misquote/lie about Moravec's central prediction, and only gets worse from there.

Moravec predicted - in Mind Children in 1988! - AGI in 2028, based on Moore's Law and the brain reverse engineering assumption. He was prescient - a true prophet/futurist. EY was wrong and his attempt to smear Moravec here is simply embarrassing.

[-]anonymousaisafety1y20

I'm reminded of this thread from 2022: https://www.lesswrong.com/posts/27EznPncmCtnpSojH/link-post-on-deference-and-yudkowsky-s-ai-risk-estimates?commentId=SLjkYtCfddvH9j38T#SLjkYtCfddvH9j38T

[-]Noosphere891y90

Even with some disagreements wrt how powerful AI can be, I definitely agree that Eliezer is pretty bad epistemically speaking on anything related to AI or alignment topics, and we should stop treating him as any kind of authority.

[-]paulfchristiano1y210

I think Eliezer does disagree. I find his disagreement fairly annoying. He calls biological anchors the "trick that never works" and gives an initial example of Moravec predicting AGI in 2010 in the book Mind Children.

But as far as I can tell so far that's just Eliezer putting words in Moravec's mouth. Moravec doesn't make very precise predictions in the book, but the heading of the relevant section is "human equivalence in 40 years" (i.e. 2028, the book was written in 1988). Eliezer thinks that Moravec ought to think that human-level AI and shortly thereafter a singularity will occur at the time when a giant cluster is as big as a brain, which Moravec puts in 2010. But I don't see any evidence that Moravec agreed with that implication, and the book seems to generally talk about a timeframe like 2030-2040. Eliezer repeated this claim in our conversation but still didn't really provide any indication Moravec held this view.

To the extent that people were imagining neural networks, I don't think they would expect trained neural networks to be the size of a computing cluster. It's not the straightforward extrapolation from the kinds of neural networks people were actually computing, so someone going on vibes wouldn't make that forecast. And if you try to actually pencil out the training cost it's clear it won't work, since you have to run a neural network a huge number of times during training, so someone trying to think things through on paper wouldn't think that either. At least since the 1990s I've seen a lot of people making predictions along these lines, but as far as I can tell they seem to give actual predictions in the 2020s or 2030s which currently look quite good to me relative to every other forecasting methodology.

[-]jacob_cannell1y110

This graph nicely summarizes his timeline from Mind Children in 1988. The book itself presents his view that AI progress is primarily constrained by compute power available to most researchers, which is usually around that of a PC.

Moravec et al were correct in multiple key disagreements with EY et al:

That progress was smooth and predictable from Moore's Law (similar to how the arrival of flight is postdictable from ICE progress)
That AGI would be based on brain-reverse engineering, and thus will be inherently anthropomorphic
That "recursive self-improvement" was mostly relevant only in the larger systemic sense (civilization level)

LLMs are far more anthropomorphic (brain-like) than the fast clean consequential reasoners EY expected:

close correspondence to linguistic cortex (internal computations and training objective)
complete with human-like cognitive biases!
unexpected human-like limitations: struggle with simple tasks like arithmetic, longer term planning, etc
AGI misalignment insights from Jungian psychology more effective/useful/popular than MIRI's core research

All of this was predicted from the systems/cybernetic framework/rubric that human minds are software constructs, brains are efficient and tractable, and thus AGI is mostly about reverse engineering the brain and then downloading/distilling human mindware into the new digital substrate.

[-]paulfchristiano1y40

I don't know if the graph settles the question---is Moravec predicting AGI at "Human equivalence in a supercomputer" or "Human equivalence in a personal computer"? Hard to say from the graph.

The fact that he specifically talks about "compute power available to most researchers" makes it more clear what his predictions are. Taken literally that view would suggest something like: a trillion dollar computing budget spread across 10k researchers in 2010 would result in AGI in not-too-long, which looks a bit less plausible as a prediction but not out of the question.
