Today's post, Beware the Unsurprised, was originally published on May 3, 2007. A summary (from the LW wiki):

If reality consistently surprises you, then your model needs revision. But beware those who act unsurprised by surprising data. Maybe their model was too vague to be contradicted. Maybe they haven't emotionally grasped the implications of the data. Or maybe they are trying to appear poised in front of others. Respond to surprise by revising your model, not by suppressing your surprise.


Discuss the post here (rather than in the comments of the original post).

This post is part of a series rerunning Eliezer Yudkowsky's old posts so those interested can (re-)read and discuss them. The previous post was Think like Reality, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it, posting the next day's sequence reruns post, summarizing forthcoming articles on the wiki, or creating exercises. Go here for more details, or to discuss the Sequence Reruns.

6 comments

Is there a LW post that explains the basic terminology, like what constitutes a 'model', what models are subject to revision and when to abandon a model completely?

It would help me a lot if someone could write a post explaining the difference between a hypothesis that can be falsified and a prediction. For example, what is the difference between MWI, AI going FOOM, and general relativity? As far as I can tell, the first is a consistent explanation of what we know about quantum mechanics, the second a prediction, and the third a theory (a hypothesis supported by various kinds of confirming evidence).

If someone predicted that the world is going to end, I understand that this prediction is not falsifiable, because it is too vague. At what point is it rational to demand that someone be more specific? At what point is it rational to ask that a prediction be made falsifiable? When is it acceptable to ask someone under what circumstances they would notice that their belief was false?

Can AI going FOOM be "surprised" and revised, or is the singularity always near? And if AI going FOOM is not subject to empirical criticism, what prevents one from reformulating it as a hypothesis? Would that be a bad idea?

Can AI going FOOM be "surprised" and revised, or is the singularity always near?

ETA: I'm taking "Singularity" to mean "AI hard takeoff followed by the end of the world as we know it," not just "AI" or "AI hard takeoff."

The hypothesis that a Singularity is possible/going to happen predicts the observation of a Singularity under certain conditions, usually either "within a few years or less of the first smarter-than-human AI" or "before some year" (I've usually heard 2100). If one or both of those conditions are met and there's still no Singularity, that hypothesis will need to be revised/thrown out.

Also, said hypothesis stems from a model of the world that has certain properties, including: technology is exponentially accelerating, AIs are possible, smarter-than-human self-modifying AIs can or will have a hard takeoff, and there is a major advantage for more intelligent agents over less intelligent ones.

Also, said hypothesis stems from a model of the world that has certain properties, including: ...

  • ...technology is exponentially accelerating...

Eliezer Yudkowsky says, "Exponentials are Kurzweil's thing. They aren't dangerous."

  • ...AIs are possible...

But does it follow that:

  • ...smarter-than-human self-modifying AIs can or will have a hard takeoff...

Your hypothesis seems to include itself as a premise? Is this correct? I am sorry that I have to ask this; I lack a lot of education :-(

The hypothesis that a Singularity is possible/going to happen predicts the observation of a Singularity under certain conditions...

Yes, I asked whether it would be rational to demand that the proponents of a Singularity be more specific by naming some concrete conditions.

..."within a few years or less of the first smarter-than-human AI"...

I am sorry, but this sounds a bit like "the world will end a few years or less after the first antimatter asteroid has been detected to be on a collision course with Earth". Maybe it is just my complete lack of training in matters of rationality that makes me think so. I am really sorry in that case :-(

...before some year (I've usually heard 2100)...

Eliezer Yudkowsky says:

John did ask about timescales and my answer was that I had no logical way of knowing the answer to that question and was reluctant to just make one up.

Does this mean that a hypothesis, or prediction, does not need to be specific about its possible timeframe? We just have to wait? At what point do we then decide to turn to other problems? Maybe I am completely confused here, but how do you update your risk estimations if you can't tell when a risk ceases to be imminent?

If one or both of those conditions are met and there's still no Singularity, that hypothesis will need to be revised/thrown out.

Since, as far as I can tell, smarter-than-human AI is strongly correlated with the occurrence of a Singularity in your hypothesis, would it be reasonable to name some concrete conditions required to enable such a technology?

To be clear, I am just trying to figure out how the proponents of explosive recursive self-improvement can be surprised by data. Maybe this is perfectly clear to everyone else; I am sorry, I don't know where else to ask about this.

Eliezer Yudkowsky says, "Exponentials are Kurzweil's thing. They aren't dangerous."

Different people who believe in some form of Singularity disagree on the specifics. By trying to capture every view, I fear I have mangled them all.

Your hypothesis seems to include itself as a premise? Is this correct? I am sorry that I have to ask this; I lack a lot of education :-(

If you define "Singularity" as "an AI going to superintelligence quickly," then yeah, it does, and that shouldn't be a premise. I was defining "Singularity" as "a massive change to the world as we know it, probably resulting in something either very awesome or very horrible."

I am sorry, but this sounds a bit like "the world will end a few years or less after the first antimatter asteroid has been detected to be on a collision course with Earth". Maybe it is just my complete lack of training in matters of rationality that makes me think so. I am really sorry in that case :-(

To people who believe that there will be a Singularity, it does sound like that. Some people believe that smarter-than-human AI is impossible or that it will not cause massive change to the world as we know it. Again, I appear to be using a different definition from you: if one defines a Singularity as a smarter-than-human AI, I was being tautological.

Does this mean that a hypothesis, or prediction, does not need to be specific about its possible timeframe? We just have to wait? At what point do we then decide to turn to other problems? Maybe I am completely confused here, but how do you update your risk estimations if you can't tell when a risk ceases to be imminent?

I don't know enough AI science to answer this question completely. I don't know what would be strong evidence that human-level AI or higher is impossible, other than the brain turning out to be non-Turing-computable. If a human-level or slightly smarter AI is developed and it does not self-improve further (or enough to drastically change the world), this would be evidence against a hard takeoff or a Singularity.
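
A minimal illustrative sketch of that kind of update, using Bayes' theorem with made-up placeholder probabilities (none of these numbers come from the thread):

```python
# Illustrative Bayesian update: how observing a human-level AI that does
# NOT rapidly self-improve would count as evidence against a hard takeoff.
# All probabilities below are placeholder assumptions for illustration only.

prior_hard_takeoff = 0.5        # prior credence that a hard takeoff will happen
p_obs_given_takeoff = 0.1       # P(human-level AI exists but no FOOM yet | hard takeoff is real)
p_obs_given_no_takeoff = 0.8    # P(same observation | no hard takeoff)

# Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)
p_obs = (p_obs_given_takeoff * prior_hard_takeoff
         + p_obs_given_no_takeoff * (1 - prior_hard_takeoff))
posterior_hard_takeoff = p_obs_given_takeoff * prior_hard_takeoff / p_obs

print(f"Posterior P(hard takeoff) after the observation: {posterior_hard_takeoff:.2f}")
# With these placeholder numbers the credence drops from 0.50 to about 0.11:
# the observation is evidence against the hypothesis without refuting it outright.
```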

Since, as far as I can tell, smarter-than-human AI is strongly correlated with the occurrence of a Singularity in your hypothesis, would it be reasonable to name some concrete conditions required to enable such a technology?

Other than "a good enough understanding of mind to formulate an AI and enough computing power to run it," I really don't know enough to say. Will someone with more knowledge of AI please fill in the gaps in my explanation?

To be clear, I am just trying to figure out how the proponents of explosive recursive self-improvement can be surprised by data.

A proponent of explosive recursive self-improvement can be surprised by an AI of human intelligence or slightly greater that does not go FOOM. Or by finding out that AI is in principle impossible (though proving something in principle impossible is very hard).

Maybe this is perfectly clear to everyone else; I am sorry, I don't know where else to ask about this.

This is the right place, but I'm not the best person. Again, I'd love for somebody who knows some AI to help with the questions I couldn't answer.

Is there a LW post that explains the basic terminology, like what constitutes a 'model', what models are subject to revision and when to abandon a model completely?

This seems fairly neat:

In the most general sense, a model is anything used in any way to represent anything else.

It would help me a lot if someone could write a post explaining the difference between a hypothesis that can be falsified and a prediction.

Predictions are specifically concerned with the future. Hypotheses should be consistent with all observations, no matter when they are made.