Followup to: The Outside View's Domain, Conversation Halters
Reply to: Reference class of the unclassreferenceable
In "conversation halters", I pointed out a number of arguments which are particularly pernicious, not just because of their inherent flaws, but because they attempt to chop off further debate - an "argument stops here!" traffic sign, with some implicit penalty (at least in the mind of the speaker) for trying to continue further.
This is not the right traffic signal to send, unless the state of knowledge is such as to make an actual halt a good idea. Maybe if you've got a replicable, replicated series of experiments that squarely target the issue and settle it with strong significance and large effect sizes (or great power and null effects), you could say, "Now we know." Or if the other is blatantly privileging the hypothesis - starting with something improbable, and offering no positive evidence to believe it - then it may be time to throw up hands and walk away. (Privileging the hypothesis is the state people tend to be driven to, when they start with a bad idea and then witness the defeat of all the positive arguments they thought they had.) Or you could simply run out of time, but then you just say, "I'm out of time", not "here the gathering of arguments should end."
But there's also another justification for ending argument-gathering that has recently seen some advocacy on Less Wrong.
One experimental group of subjects was asked to describe highly specific plans for their Christmas shopping: where, when, and how. On average, this group expected to finish shopping more than a week before Christmas. Another group was simply asked when they expected to finish their Christmas shopping, with an average response of 4 days before Christmas. Both groups finished an average of 3 days before Christmas. Similarly, Japanese students who expected to finish their essays 10 days before deadline actually finished 1 day before deadline; and when asked when they had previously completed similar tasks, replied, "1 day before deadline." (See this post.)
Those and similar experiments seem to show us a class of cases where you can do better by asking a certain specific question and then halting: Namely, the students could have produced better estimates by asking themselves "When did I finish last time?" and then ceasing to consider further arguments, without trying to take into account the specifics of where, when, and how they expected to do better than last time.
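Spelled out with the essay numbers above - a trivial sketch in code, just to make the comparison concrete (the variable names are mine):

```python
# The pattern in the essay study above, written out: the "outside view"
# answer is just the recalled past outcome, and it matched what happened.
inside_view_prediction = 10  # "I'll finish 10 days before deadline"
last_time_outcome      = 1   # "Last time I finished 1 day before deadline"
actual_outcome         = 1   # what actually happened

print("inside-view error: ", abs(inside_view_prediction - actual_outcome), "days")  # 9 days
print("outside-view error:", abs(last_time_outcome - actual_outcome), "days")       # 0 days
```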
From this we learn, allegedly, that "the 'outside view' is better than the 'inside view'"; from which it follows that when you're faced with a difficult problem, you should find a reference class of similar cases, use that as your estimate, and deliberately not take into account any arguments about specifics. But this generalization, I fear, is somewhat more questionable...
For example, taw alleged upon this very blog that belief in the 'Singularity' (a term I usually take to refer to the intelligence explosion) ought to be dismissed out of hand, because it is part of the reference class "beliefs in coming of a new world, be it good or evil", with a historical success rate of (allegedly) 0%.
Of course Robin Hanson has a different idea of what constitutes the reference class and so makes a rather different prediction - a problem I refer to as "reference class tennis":
Taking a long historical long view, we see steady total growth rates punctuated by rare transitions when new faster growth modes appeared with little warning. We know of perhaps four such "singularities": animal brains (~600MYA), humans (~2MYA), farming (~10KYA), and industry (~0.2KYA)...
Excess inside viewing usually continues even after folks are warned that outside viewing works better; after all, inside viewing better shows off inside knowledge and abilities. People usually justify this via reasons why the current case is exceptional. (Remember how all the old rules didn’t apply to the new dotcom economy?) So expect to hear excuses why the next singularity is also an exception where outside view estimates are misleading. Let’s keep an open mind, but a wary open mind.
If I were to play the game of reference class tennis, I'd put recursively self-improving AI in the reference class "huge mother#$%@ing changes in the nature of the optimization game" whose other two instances are the divide between life and nonlife and the divide between human design and evolutionary design; and I'd draw the lesson "If you try to predict that things will just go on sorta the way they did before, you are going to end up looking pathetically overconservative".
And if we do have a local hard takeoff, as I predict, then there will be nothing to say afterward except "This was similar to the origin of life and dissimilar to the invention of agriculture". And if there is a nonlocal economic acceleration, as Robin Hanson predicts, we just say "This was similar to the invention of agriculture and dissimilar to the origin of life". And if nothing happens, as taw seems to predict, then we must say "The whole foofaraw was similar to the apocalypse of Daniel, and dissimilar to the origin of life or the invention of agriculture". This is why I don't like reference class tennis.
But mostly I would simply decline to reason by analogy, preferring to drop back into causal reasoning in order to make weak, vague predictions. In the end, the dawn of recursive self-improvement is not the dawn of life and it is not the dawn of human intelligence, it is the dawn of recursive self-improvement. And it's not the invention of agriculture either, and I am not the prophet Daniel. Point out a "similarity" with this many differences, and reality is liable to respond "So what?"
I sometimes say that the fundamental question of rationality is "Why do you believe what you believe?" or "What do you think you know and how do you think you know it?"
And when you're asking a question like that, one of the most useful tools is zooming in on the map by replacing summary-phrases with the concepts and chains of inferences that they stand for.
Consider what inference we're actually carrying out, when we cry "Outside view!" on a case of a student turning in homework. How do we think we know what we believe?
Our information looks something like this:
- In January 2009, student X1 predicted they would finish their homework 10 days before deadline, and actually finished 1 day before deadline.
- In February 2009, student X1 predicted they would finish their homework 9 days before deadline, and actually finished 2 days before deadline.
- In March 2009, student X1 predicted they would finish their homework 9 days before deadline, and actually finished 1 day before deadline.
- In January 2009, student X2 predicted they would finish their homework 8 days before deadline, and actually finished 2 days before deadline.
- And so on through 157 other cases.
- Furthermore, in another 121 cases, asking students to visualize specifics actually made them more optimistic.
Therefore, when new student X279 comes along, even though we've never actually tested them before, we ask:
"How long before deadline did you plan to complete your last three assignments?"
They say: "10 days, 9 days, and 10 days."
We ask: "How long before did you actually complete them?"
They reply: "1 day, 1 day, and 2 days".
We ask: "How long before deadline do you plan to complete this assignment?"
They say: "8 days."
Having gathered this information, we now think we know enough to make this prediction:
"You'll probably finish 1 day before deadline."
They say: "No, this time will be different because -"
We say: "Would you care to make a side bet on that?"
We now believe that previous cases have given us strong, veridical information about how this student functions - how long before deadline they tend to complete assignments - and about the unreliability of the student's planning attempts, as well. The chain of "What do you think you know and how do you think you know it?" is clear and strong, both with respect to the prediction, and with respect to ceasing to gather information. We have historical cases aplenty, and they are all as similar to each other as they are similar to this new case. We might not know all the details of how the inner forces work, but we suspect that it's pretty much the same inner forces inside the black box each time, or the same rough group of inner forces, varying no more in this new case than has been observed on the previous cases that are as similar to each other as they are to this new case, selected by no different a criterion than we used to select this new case. And so we think it'll be the same outcome all over again.
You're just drawing another ball, at random, from the same barrel that produced a lot of similar balls in previous random draws, and those previous balls told you a lot about the barrel. Even if your estimate is a probability distribution rather than a point mass, it's a solid, stable probability distribution based on plenty of samples from a process that is, if not independent and identically distributed, still pretty much blind draws from the same big barrel.
You've got strong information, and it's not that strange to think of stopping and making a prediction.
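If you want to see how little machinery that inference actually needs, here's a minimal sketch in code - the numbers are just the hypothetical ones from the dialogue above, and the helper names are mine:

```python
# Treating the new assignment as another draw from the same barrel:
# predict from this student's own past actuals, not from their new plan.
import statistics

past_planned = [10, 9, 10]  # days before deadline the student planned to finish
past_actual  = [1, 1, 2]    # days before deadline they actually finished

# The bias is large and consistent...
bias = statistics.mean(p - a for p, a in zip(past_planned, past_actual))

# ...so the outside-view prediction ignores the new plan of "8 days"
# and just takes the center of the past actuals.
prediction = statistics.median(past_actual)

print(f"Average overestimate so far: {bias:.1f} days")           # ~8.3 days
print(f"Prediction: about {prediction} day(s) before deadline")  # 1 day
```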
But now consider the analogous chain of inferences, the what do you think you know and how do you think you know it, of trying to take an outside view on self-improving AI.
What is our data? Well, according to Robin Hanson:
- Animal brains showed up in 550M BC and doubled in size every 34M years
- Human hunters showed up in 2M BC, doubled in population every 230Ky
- Farmers, showing up in 4700 BC, doubled every 860 years
- Starting in 1730 or so, the economy began doubling faster, from a doubling time of 58 years at the start to approximately 15 years now.
From this, Robin extrapolates, the next big growth mode will have a doubling time of 1-2 weeks.
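To make the arithmetic being gestured at explicit - this is my own rough reconstruction using the figures quoted above, not Hanson's actual model - it looks something like this:

```python
# A crude reconstruction of the growth-mode extrapolation, using the
# doubling times quoted above. Hanson's own fit uses somewhat different
# figures and arrives at 1-2 weeks; this naive geometric-mean version
# lands in the same ballpark (weeks rather than years).
from math import prod

doubling_times_years = [34e6, 230e3, 860, 15]  # brains, hunters, farmers, industry

# How much faster each growth mode was than the one before it.
speedups = [a / b for a, b in zip(doubling_times_years, doubling_times_years[1:])]
# speedups ≈ [148, 267, 57]

typical_speedup = prod(speedups) ** (1 / len(speedups))  # geometric mean, ~131x

next_doubling_weeks = doubling_times_years[-1] * 52 / typical_speedup
print(f"Next doubling time: roughly {next_doubling_weeks:.0f} weeks")  # ~6 weeks
```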
So far we have an interesting argument, though I wouldn't really buy it myself, because the distances of difference are too large... but in any case, Robin then goes on to say: We should accept this estimate flat, we have probably just gathered all the evidence we should use. Taking into account other arguments... well, there's something to be said for considering them, keeping an open mind and all that; but if, foolishly, we actually accept those arguments, our estimates will probably get worse. We might be tempted to try and adjust the estimate Robin has given us, but we should resist that temptation, since it comes from a desire to show off insider knowledge and abilities.
And how do we know that? How do we know this much more interesting proposition that it is now time to stop and make an estimate - that Robin's facts were the relevant arguments, and that other arguments, especially attempts to think about the interior of an AI undergoing recursive self-improvement, are not relevant?
Well... because...
- In January 2009, student X1 predicted they would finish their homework 10 days before deadline, and actually finished 1 day before deadline.
- In February 2009, student X1 predicted they would finish their homework 9 days before deadline, and actually finished 2 days before deadline.
- In March 2009, student X1 predicted they would finish their homework 9 days before deadline, and actually finished 1 day before deadline.
- In January 2009, student X2 predicted they would finish their homework 8 days before deadline, and actually finished 2 days before deadline...
It seems to me that once you subtract out the scary labels "inside view" and "outside view" and look at what is actually being inferred from what - ask "What do you think you know and how do you think you know it?" - it doesn't really follow very well. The Outside View that experiment has shown us works better than the Inside View is pretty far removed from the "Outside View!" that taw cites in support of predicting against any epoch. My own similarity metric puts the latter closer to the analogies of Greek philosophers, actually. And I'd also say that trying to use causal reasoning to produce weak, vague, qualitative predictions like "Eventually, some AI will go FOOM, locally self-improvingly rather than global-economically" is a bit different from "I will complete this homework assignment 10 days before deadline". (The Weak Inside View.)
I don't think that "Outside View! Stop here!" is a good cognitive traffic signal to use so far beyond the realm of homework - or other cases of many draws from the same barrel, no more dissimilar to the next case than to each other, and with similarly structured forces at work in each case.
After all, the wider reference class of cases of telling people to stop gathering arguments, is one of which we should all be wary...
I think there probably is a good reference class for predictions surrounding the Singularity. When you posted on "What is wrong with our thoughts?" you identified it: the class of instances of the human mind attempting to think and act outside of its epistemologically nurturing environment of clear feedback from everyday activities.
See, e.g., how smart humans like Stephen Hawking, Ray Kurzweil, Kevin Warwick, Kevin Kelly, Eric Horowitz, etc. have all managed to say patently absurd things about the issue, and hold mutually contradictory positions, with massive overconfidence in some cases. I do not exclude myself from the group of people who have said absurd things about the Singularity, and I think we shouldn't exclude Eliezer either. At least Eliezer has put in massive amounts of work for what may well be the greater good of humanity, which is morally commendable.
To escape from this reference class, and therefore from the default prediction of insanity, I think that bringing in better feedback and a large diverse community of researchers might work. Of course, more feedback and more researchers = more risk according to our understanding of AI motivations. But ultimately, that's an unavoidable trade-off; the lone madman versus the global tragedy of the commons.
Logged in to vote this up...
However, I wouldn't go the "lots of people" route either. At least not until decent research norms have been created.
The research methodology that has been mouldering away in my brain for the past few years is the following:
We can agree that computational systems might be dangerous (in the FOOM sense).
So let us start from the basics and prove that bits of computer space aren't dangerous, either by experiments we have already done (or that have been done by nature) or by formal proof.
Humanity has played around with basic computers and net...