
Assessing Kurzweil: the results

42 Post author: Stuart_Armstrong 16 January 2013 04:51PM

Predictions of the future rely, to a much greater extent than in most fields, on the personal judgement of the expert making them. Just one problem - personal expert judgement generally sucks, especially when the experts don't receive immediate feedback on their hits and misses. Formal models perform better than experts, but when talking about unprecedented future events such as nanotechnology or AI, the choice of the model is also dependent on expert judgement.

Ray Kurzweil has a model of technological intelligence development where, broadly speaking, evolution, pre-computer technological development, post-computer technological development and future AIs all fit into the same exponential increase. When assessing the validity of that model, we could look at Kurzweil's credentials, and maybe compare them with those of his critics - but Kurzweil has given us something even better than credentials, and that's a track record. In various books, he's made predictions about what would happen in 2009, and we're now in a position to judge their accuracy. I haven't been satisfied by the various accuracy ratings I've found online, so I decided to do my own assessments.

I first selected ten of Kurzweil's predictions at random, and gave my own estimation of their accuracy. I found that five were to some extent true, four were to some extent false, and one was unclassifiable.

But of course, relying on a single assessor is unreliable, especially when some of the judgements are subjective. So I started a call for volunteers to get assessors. Meanwhile Malo Bourgon set up a separate assessment on Youtopia, harnessing the awesome power of altruists chasing after points.

The results are now in, and they are fascinating. They are...

Ooops, you thought you'd get the results right away? No, before that, as in an Oscar night, I first want to thank assessors William Naaktgeboren, Eric Herboso, Michael Dickens, Ben Sterrett, Mao Shan, quinox, Olivia Schaefer, David Sønstebø and one who wishes to remain anonymous. I also want to thank Malo, and Ethan Dickinson and all the other volunteers from Youtopia (if you're one of these, and want to be thanked by name, let me know and I'll add you).

It was difficult deciding on the MVP - no actually it wasn't, that title and many thanks go to Olivia Schaefer, who decided to assess every single one of Kurzweil's predictions, because that's just the sort of gal that she is.

The exact details of the methodology, and the raw data, can be accessed through here. But in summary, volunteers were asked to assess the 172 predictions (from the "Age of Spiritual Machines") on a five-point scale: 1=True, 2=Weakly True, 3=Cannot decide, 4=Weakly False, 5=False. If we total up all the assessments made by my direct volunteers, we have:

As can be seen, most assessments were rather emphatic: fully 59% were either clearly true or false. Overall, 46% of the assessments were false or weakly false, and 42% were true or weakly true.

But what happens if, instead of averaging across all assessments (which allows assessors who have worked on a lot of predictions to dominate), we average across the nine assessors? Reassuringly, this makes very little difference:
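To make the two averaging schemes concrete, here is a minimal Python sketch. The ratings and assessor names below are invented purely for illustration; they are not the volunteers' data and do not reproduce the results of this assessment.

```python
# Hypothetical ratings on the post's scale:
# 1=True, 2=Weakly True, 3=Cannot decide, 4=Weakly False, 5=False.
# These numbers are made up; they are NOT the real assessment data.
ratings_by_assessor = {
    "assessor_a": [1, 1, 2, 5, 5, 4, 3, 1, 5, 5],  # rated many predictions
    "assessor_b": [1, 2],                          # rated only two
    "assessor_c": [5, 3],                          # rated only two
}


def true_share(ratings):
    """Fraction of ratings that are True or Weakly True (1 or 2)."""
    return sum(1 for r in ratings if r <= 2) / len(ratings)


def pooled_true_share(by_assessor):
    """Average across all assessments: prolific assessors weigh more."""
    pooled = [r for ratings in by_assessor.values() for r in ratings]
    return true_share(pooled)


def per_assessor_true_share(by_assessor):
    """Average across assessors: each assessor counts equally."""
    shares = [true_share(ratings) for ratings in by_assessor.values()]
    return sum(shares) / len(shares)


pooled = pooled_true_share(ratings_by_assessor)
per_assessor = per_assessor_true_share(ratings_by_assessor)
print(f"pooled: {pooled:.2f}, per assessor: {per_assessor:.2f}")
```

With these invented numbers the two schemes differ by a few percentage points; how far they diverge in practice depends on how unevenly the work was spread among assessors.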

What about the Youtopia volunteers? Well, they have a decidedly different picture of Kurzweil's accuracy:

This gives a combined true score of 30%, and combined false score of 57%! If my own personal assessment was the most positive towards Kurzweil's predictions, then Youtopia's was the most negative.

Putting this all together, Kurzweil certainly can't claim an accuracy above 50% - a far cry from his own self-assessment of either 102 out of 108 or 127 out of 147 correct (with caveats that "even the predictions that were considered 'wrong' in this report were not all wrong"). And consistently, slightly more than 10% of his predictions are judged "impossible to decide".

As I've said before, these were not binary yes/no predictions - even a true rate of 30% is much higher than chance. So Kurzweil remains an acceptable prognosticator, with very poor self-assessment.

Comments (59)

Comment author: satt 15 January 2013 06:31:55PM *  1 point [-]

I've been looking forward to this. Looking at the raw data now to get an idea of the inter-rater agreement. The two columns of Youtopia ratings agree fairly well on the 33 predictions where they overlap, and the 9 LW raters seem to disagree more, but that's only my first impression. (Maybe it's just that inter-rater variation is more obvious for predictions with ≥3 ratings.) Thanks again to all the assessors for putting in the legwork.
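One simple way to put a number on inter-rater agreement is the mean absolute difference between two raters on the predictions they both scored. The sketch below assumes each rater's scores are stored as a mapping from a prediction number to a rating on the 1-5 scale; the data and the function name are hypothetical, and this is only one of several possible agreement measures.

```python
def mean_abs_disagreement(rater_x, rater_y):
    """Mean absolute rating difference over predictions both raters scored."""
    shared = sorted(set(rater_x) & set(rater_y))
    if not shared:
        raise ValueError("raters share no predictions")
    return sum(abs(rater_x[p] - rater_y[p]) for p in shared) / len(shared)


# Hypothetical ratings keyed by prediction number (not the real data).
rater_x = {1: 1, 2: 5, 3: 4, 4: 2}
rater_y = {2: 5, 3: 2, 4: 1, 5: 3}

# Only predictions 2, 3 and 4 overlap; 0 would mean perfect agreement.
print(mean_abs_disagreement(rater_x, rater_y))
```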

Comment author: Stuart_Armstrong 15 January 2013 06:38:45PM *  2 points [-]

There are more than two Youtopia raters. Different people, at different times, completed the assessments individually (and sometimes did "second opinions" if someone had already done that one before them). I think there were 5 assessors in total.

Comment author: satt 15 January 2013 07:13:36PM 1 point [-]

Oops. Fixed.

Comment author: RobbBB 16 January 2013 10:58:35PM *  10 points [-]

There are a variety of reasons interpreters might think that a prediction didn't come true, while Kurzweil boldly claims that it did:

  1. Kurzweil didn't express himself clearly, so interpreters misunderstood what the prediction really was. Miscommunication adds random noise, and most randomly generated predictions will turn out false, so this will skew the results against Kurzweil.

  2. Kurzweil's prediction was vague. So charitable interpreters will think they're basically true, while less charitable interpreters will think they're basically false. And we can expect random LessWrongers to be less charitable toward Kurzweil than Kurzweil is toward Kurzweil.

  3. Interpreters tend to be factually mistaken about current events, in a specific direction: They are ignorant of the nature, existence, or prevalence of the latest innovations in technology and culture.

  4. Kurzweil tends to be factually mistaken about current events, in a specific direction: He thinks a variety of technologies are more advanced, and more widespread, than they really are.

  5. There are systemic differences in the evaluation scales used by Kurzweil and by others. For instance, Kurzweil and Armstrong individuate 'predictions' differently, lumping and splitting at different points in the source text. There may also be systemic disagreements about how (temporally and technologically) precise an interpretation must be to count as 'correct,' and about whether grammatical forms like 'X is Y' most closely means 'X is always Y', 'X is usually Y', 'X is commonly Y', 'X is sometimes (occasionally) Y', or 'X is Y at least once'. This ties into vagueness, but may bias the results due to linguistic variation rather than just as a result of generic degree of interpretive charity.

I'm particularly curious about testing 3, since the strongest criticism Kurzweil could make of our methodology for assessing his accuracy is that our reviewers simply got the facts wrong. We can calibrate our assumptions about the accuracy and up-to-dateness of LessWrongers regarding technology generally. Or more specifically we can expose them to Kurzweil's arguments and see how much their assessment of his predictive success changes after hearing why he thinks he got a certain prediction 'correct'.

Comment author: Decius 30 January 2013 12:54:24AM *  2 points [-]

With the advent of multi-core architectures, these devices are starting to have 2, 4, 8… computers each in them, so we’ll exceed a dozen computers “on and around their bodies” very soon. One could argue that it is “typical” already, but it will become very common within a couple of years.

There's clearly a disconnect between his 'computer' and the general meaning of 'computer'; a multicore processor isn't more than one computer, and it wasn't in 1990.

Also, he seems to regard things as 'typical' that I would call 'common'; I say 'common' when it isn't surprising to see something, and 'typical' when it is surprising to note its absence, while he seems to use 'typical' for things which are not surprising, and 'common' for things which are commercially available (regardless of cost or prevalence).

Comment author: mfb 22 January 2013 07:11:25PM *  0 points [-]

I think (5.) can give a significant difference (together with 1 and 2 - I would not expect so much trouble from 3 and 4). Imagine a series of 4 statements, where the last three basically require the first one. If all 4 are correct, it is easy to check every single statement, giving 4 correct predictions. But if the first one is wrong - and the others have to be wrong as consequence - Kurzweil might count the whole block as one wrong prediction.
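The effect of lumping versus splitting can be shown with a small worked example. The sketch below uses two invented blocks of four dependent statements each and a hypothetical scoring rule; it is not a reconstruction of how Kurzweil or the assessors actually counted.

```python
def score_block(statements_true, lump_failures):
    """Score one block of dependent statements.

    statements_true[0] is the prerequisite; the rest depend on it.
    If lump_failures is set and the prerequisite is false, the whole
    block is counted as a single wrong prediction.
    """
    if lump_failures and not statements_true[0]:
        return [False]
    return list(statements_true)


def accuracy(outcomes):
    """Fraction of scored outcomes that are correct."""
    return sum(outcomes) / len(outcomes)


# Hypothetical blocks (not real predictions).
blocks = [
    [True, True, True, True],      # prerequisite holds, dependents follow
    [False, False, False, False],  # prerequisite fails, so dependents fail
]

split = [o for b in blocks for o in score_block(b, lump_failures=False)]
lumped = [o for b in blocks for o in score_block(b, lump_failures=True)]

print(accuracy(split))   # every statement counted separately
print(accuracy(lumped))  # failed block counted once
```

Here the same underlying facts give 4 correct out of 8 when every statement is counted, but 4 out of 5 when the failed block is lumped into one miss.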

For predictions judged by multiple volunteers, it might be interesting to check how much they deviate from each other. This gives some insight into how important (1.) to (3.) are. satt looked at that, but I don't know which conclusion we can draw from that.

Comment author: Friendly-HI 21 January 2013 08:01:45PM *  2 points [-]

@ Everyone:

What are the most interesting and useful conclusions we can reasonably draw from this?

I'm not being facetious, it's just that after I've read this and most of the top rated comments I'm not sure what to draw from all of this. We have a rough estimate of how K. is doing in absolute terms, but not in relative terms because we're left without a baseline to compare him to. Chance or the "average predictions of the average human" can't be a meaningful baseline (for me) because I'm not going to use them as potential sources for my personal predictions/beliefs anyway. What actually interests me is how seriously I can take Kurzweil's predictions for the upcoming future (!) in comparison to other competent predictors. But how K. is doing in comparison to other predictors is very hard to judge because we simply can't standardise relatively vague predictions by different people in order to reasonably compare them.

So what I am left with is only that a bunch of random better-than-average informed people (regarding current technology) estimated that slightly less than one third of K's predictions came true, one third was hard to judge and consists of shades of grey somewhere between definitely true and definitely false, and one third was judged as plain false. So the only thing I really take away from this is that K. seems like a reasonably competent predictor in absolute terms, since any given prediction of his had roughly the same chance of leaning towards "true" as towards "false". Assuming he keeps this rate up for the upcoming decade(s) my ultimate takeaway for now is that he's at least worth reading for inspiration.

Also, I take away that his self-assessment of accuracy is probably either iffy at best or plain dishonest at worst. But to judge this point further I'd have to read his personal accounts on each of his predictions and the specific reasons why he apparently counted many of them as "essentially true", while most technophiles didn't.

Comment author: EricHerboso 22 January 2013 06:53:44PM *  1 point [-]

As one of the people who contributed to this project by assessing his predictions, I do want to point out that several of the predictions marked as "True" seemed very obvious to me. Of course, this might be the result of hindsight bias, and in fact it is actually very impressive for him to have predicted something like the following examples:

  • "[Among portable computers,] Memory is completely electronic, and most portable computers do not have keyboards."
  • "However, nanoengineering is not yet considered a practical technology."
  • "China has also emerged as a powerful economic player."

Note also that some of the statements marked "True" are only vacuously true. For example, one of his wrong predictions was that "intelligent roads are in use...for long-distance travel". But he follows this up with the following prediction which got marked as "True":

"Local roads, though, are still predominantly conventional."

As you can see, I do not think that looking just at the percentage of true predictive statements he made is enough. Some of those predictions seem almost trivial. And yet we can't just dismiss them out of hand, because the reason I think they are trivial might just be because I'm looking at it from after the fact. Counterfactually, if intelligent roads had come about, but local roads were still conventional, would I still call the prediction trivial? What if local roads weren't conventional? Would I then still call it a trivial prediction?

We had no choice but to just mark such statements as true and count them in the percentage he got correct, because there's just no way I know of to disregard such "trivial" predictions. And this means we shouldn't really be looking at the percentage marked as true except to compare it with Kurzweil's own self-assessment of accuracy. Using the percentage marked as true for other reasons, like "should I trust Kurzweil's predictive power more than others'", seems like a misuse of this data.

Comment author: V_V 30 January 2013 01:24:30PM *  1 point [-]
  • "[Among portable computers,] Memory is completely electronic, and most portable computers do not have keyboards."

Is that actually true? Notebooks have keyboards and hard disks, many also have optical drives. Tablets still sell less than notebooks (I found a prediction of tablet sales topping notebooks by 2016). I suppose that you can consider Kurzweil's prediction true if you count smartphones as portable computers, but I don't think that's appropriate since they are typically not used as notebook replacements.

  • "However, nanoengineering is not yet considered a practical technology."
  • "China has also emerged as a powerful economic player."

These two seem quite obvious. Why do you think they were impressive predictions?

Comment author: Stuart_Armstrong 31 January 2013 11:01:45AM 1 point [-]

"However, nanoengineering is not yet considered a practical technology."

Maybe it's because I share an office with Eric Drexler, but I get the definite impression that nanotech was expected to be something huge, back in 1999 - and maybe could have done, had the funding not been diverted to classical material science.

Comment author: V_V 31 January 2013 06:22:37PM *  0 points [-]

Enthusiasts certainly expected it, but I'm under the impression that professional chemists didn't share that view. Drexler was sharply criticized by Richard Smalley, one of the Nobel prize recipients for the discovery of buckminsterfullerene.

While Kurzweil sided with Drexler, he wasn't so far-fetched as to believe that nanotech was imminent.

Comment author: Stuart_Armstrong 01 February 2013 12:17:16PM 0 points [-]

Drexler has his own view on that criticism (claiming that it myopically criticised a particular type of nanotech manipulation that nobody was actually proposing to do).

But I don't have the technical ability to sort out the truth of these matters.

Comment author: V_V 01 February 2013 01:39:01PM 0 points [-]

I suppose that for a sufficiently broad definition nanotechnology includes biochemistry.

Comment author: gwern 30 January 2013 05:04:08PM *  2 points [-]

"China has also emerged as a powerful economic player."

China could, at any point, have collapsed into a Japan-style lost decade(s), and there are commentators like Pettis right now who are predicting such a collapse soon; Pettis in particular has an active bet with The Economist that it will happen.

Of course in hindsight Chinese growth seems obvious. Why would anyone think that corruption would not strangle growth or that the Communist Party would not collapse or urban rioting and warfare not break out? After all just look at how China managed 7%+ for the last decade and more! Isn't it obvious that China would just keep growing and not stall out or collapse?

But then again, people used to praise the wisdom of MITI (Communist Party) in guiding Japanese (Chinese) growth and speculate about when the Japanese economy would surpass the American economy to become the largest in the world and explain how sky-high property prices in Tokyo (Beijing) were perfectly justified.

If you think it's really that "quite obvious", perhaps you should go wager a thousand bucks with Pettis or on Long Bets or something on whether Chinese growth will exceed, say, 5% for the next decade...

Comment author: V_V 30 January 2013 05:54:29PM *  -1 points [-]

China could, at any point, have collapsed into a Japan-style lost decade(s)

Japan is a powerful economic player, and China has more than ten times its population. If China "collapsed" to Japan's per capita GDP it would be by far the largest world economy.

If you think it's really that "quite obvious", perhaps you should go wager a thousand bucks with Pettis or on Long Bets or something on whether Chinese growth will exceed, say, 5% for the next decade...

Next decade is another story.

EDIT:

According to Wikipedia, in 1999 China was already the seventh world economy by nominal GDP.

Comment author: gwern 30 January 2013 06:11:39PM 3 points [-]

Japan is a powerful economic player, and China has more than ten times its population. If China "collapsed" to Japan's per capita GDP it would be by far the largest world economy.

You can stagnate/collapse without having reached Japan's per capita GDP: http://en.wikipedia.org/wiki/Middle_income_trap

Next decade is another story.

Of course it is. I'm sure you would in 1998/1999 have said 'it's obvious that China would grow like gangbusters up to 2013, but 2013-2023 - well, I just don't know!', right? You'll pardon me if this looks more like hindsight bias + excuse-making for why you won't extend the 'obvious' prediction out another decade.

Comment author: V_V 30 January 2013 11:37:25PM *  0 points [-]

You can stagnate/collapse without having reached Japan's per capita GDP: http://en.wikipedia.org/wiki/Middle_income_trap

And where are the manufacturing jobs going to go? Africa is still too far behind in terms of infrastructure and political stability.

Of course it is. I'm sure you would in 1998/1999 have said 'it's obvious that China would grow like gangbusters up to 2013, but 2013-2023 - well, I just don't know!', right? You'll pardon me if this looks more like hindsight bias + excuse-making for why you won't extend the 'obvious' prediction out another decade.

Well, think what you want. In 1999 China was the 7th world economy by GDP, and had the highest GDP growth of the seven. Pretty much every consumer product was already "made in China". Was it so difficult to predict that China would have kept growing through the decade?

For the next decade, I expect China to become the world's largest economy by total GDP, though I'm not betting on the exact growth rate.

Comment author: gwern 31 January 2013 02:10:23AM 1 point [-]

And where are the manufacturing jobs going to go?

Any poorer or more reliable country; so maybe Africa - but maybe the USA or Germany.

(And of course, we've all heard about Foxconn investing heavily in robotics, which is the sort of trend that might preserve some manufacturing-based GDP growth - at the price of increased economic inequality and through that trend, increase various low but catastrophic risks like war or revolution.)

In 1999 China was the 7th world economy by GDP, and had the highest GDP growth of the seven.

Making it the most likely to regress to the mean or fail to turn in continued exceptional performance. When you're growing that fast, there's not much your growth rate can do but go down at some point.

Pretty much every consumer product was already "made in China". Was it so difficult to predict that China would have kept growing through the decade?

It sure isn't in hindsight.

Comment author: V_V 31 January 2013 07:12:51AM *  0 points [-]

Any poorer or more reliable country; so maybe Africa - but maybe the USA or Germany.

The Chinese government seems highly reliable, and before American or German workers have lower salaries than Chinese workers, China would be the world's largest economy.

(And of course, we've all heard about Foxconn investing heavily in robotics, which is the sort of trend that might preserve some manufacturing-based GDP growth - at the price of increased economic inequality and through that trend, increase various low but catastrophic risks like war or revolution.)

Just like industrial automation increased the risks of war and revolution in first world countries?

Making it the most likely to regress to the mean or fail to turn in continued exceptional performance. When you're growing that fast, there's not much your growth rate can do but go down at some point.

These trends usually don't change abruptly. You can't extrapolate them to 50 years, but 10 years seems reasonable. Moreover, as I answered to Vaniver, there were more fundamental reasons why China was expected to grow more than Japan or other first world countries.

It sure isn't in hindsight.

Whatever.
Pick your favorite Bayesian-Solomonoffian-Yudkowskyian-Muehlhauserian-<insert strange surname>ian method and make your best prediction about the size of the Chinese economy in 2009 using only data available up to 1999. Let's see if your techniques pay rent.

Comment author: gwern 31 January 2013 05:11:51PM *  2 points [-]

The Chinese government seems highly reliable,

Surely you're kidding? This is the same Chinese government that went through an internal power struggle over Bao during the just past decadal transfer of power, which is almost as opaque as North Korea, and which just yesterday was revealed to have hacked the NYT's entire internal network as retaliation for reporting on the premier's relatives accumulating billions in suspiciously-obtained wealth and to obtain the names of anyone who helped the NYT investigation? This is the image of a reliable government?

These trends usually don't change abruptly.

How abruptly did the Japanese trend change? Feel free to look it up.

Pick your favorite Bayesian-Solomonoffian-Yudkowskyian-Muehlhauserian-<insert strange surname>ian method and make your best prediction about the size of the Chinese economy in 2009 using only data available up to 1999.

You want someone who is pointing out hindsight bias to go and engage in a worthless exercise one knows in advance will be contaminated by hindsight bias? I'm not sure what point you're trying to make here.

Comment author: CCC 31 January 2013 08:15:48AM 5 points [-]

Pick your favorite Bayesian-Solomonoffian-Yudkowskyian-Muehlhauserian-<insert strange surname>ian method and make your best prediction about the size of the Chinese economy in 2009 using only data available up to 1999. Let's see if your techniques pay rent.

There is a certain temptation here, to pick and choose data that was available in 1999 that leads to a correct conclusion about 2009. This may be unintentional - the result of noting that the result is correct and then not bothering to double-check the sources, or of noting that the result is incorrect and then ruthlessly double-checking the sources and eliminating or updating some, or possibly altering some parts of the predictive model used, until such time as the result is correct.

I would therefore, personally, be more impressed about a correct prediction, made now, with regards to the size of the Chinese economy in 2023, than by a correct prediction, made now, with regards to the size of the Chinese economy in 2009, regardless of what information is used to make the prediction.

Comment author: Vaniver 30 January 2013 06:07:15PM *  1 point [-]

If China "collapsed" to Japan's per capita GDP it would be by far the largest world economy.

He is referring to the Lost Decade, a period of slow GDP growth in Japan following an asset price bubble; it's not clear to me why you are referring to absolute per capita GDP values.

Comment author: V_V 30 January 2013 11:06:49PM *  0 points [-]

Japan is a country with high per-capita GDP, very high HDI, high population density (home to the world's largest megalopolis), cutting edge technology, low natural resources.
Its economy can't grow by implementing already existing technologies, or increasing the population, or using more land, or exploiting more domestic natural resources. It can only grow by technological development, which, in many core areas, is quite slow: a modern car may have all kinds of digital gizmos, but it's only marginally more efficient than a car made 10 or 20 years ago, both in terms of fuel consumption and material costs.

China is far from that point, and was even further in 1999. Its population is more or less artificially capped, but it has lots of usable land, natural resources, a lot of room for technological improvement using already existing technologies, room for increase in internal consumption.
Maybe it will never reach Japan-level per capita wealth, maybe in the long run Japan and other first world countries' wealth will fall and they will equalize with China at some intermediate point, maybe they will keep oscillating, maybe world economy will collapse.

I can't make long-term predictions (other than world economy will probably not keep growing for more than 50-100 years), but I expect that China will keep growing through the next decade. Fourteen years ago, that prediction would have been even stronger.

Comment author: somervta 20 January 2013 08:48:39AM 1 point [-]

Would it be possible for you to send me the original data with the comments/justifications attached? I'm interested in doing a side-by-side comparison with Kurzweil's own analysis of his predictions.

Comment author: Stuart_Armstrong 20 January 2013 10:51:01AM *  1 point [-]

Most of the assessments didn't have comments or justifications (maybe about 1/4 did). I think the assessors would feel uncomfortable having those published (some are in very informal style). And, as I said, it wouldn't be a fair or systematic assessment - the comments weren't intended for that purpose.

Comment author: EricHerboso 22 January 2013 07:03:12PM 0 points [-]

I am only one of the contributors, but you're welcome to view my comments. I doubt it will be helpful for your purpose, though.

Comment author: somervta 23 January 2013 02:24:19AM 0 points [-]

I'll see what I can do with them. It may be useful, even if I can only do a partial comparison.

Comment author: BlueSun 16 January 2013 07:36:57PM 3 points [-]

I wonder if we could get Pundit Tracker to start tracking him? They've mentioned tracking technology pundits in the future.

Comment author: Stuart_Armstrong 17 January 2013 09:23:57AM 2 points [-]

Can you suggest it to them?

Comment author: MTGandP 15 January 2013 04:45:08PM 9 points [-]

You make a quick statement at the end about how Kurzweil does better than random chance. But I wonder how we'd assess that? I'd guess that, if he's getting 50% correct or weakly correct, he's doing better than random chance because many (most?) of his claims are far-fetched.

I've thought of a way to test this, although it will take another ten years:

Kurzweil makes a bunch of predictions about what will happen by 2023. Then you have a bunch of non-experts decide which of his predictions they agree with. After 10 years, we can measure how much better Kurzweil did than the non-experts.

Comment author: cjemmott 16 January 2013 07:39:10PM 12 points [-]

I think I can do this! I read "The Age of Spiritual Machines" when it came out, and remember marking in the margins about whether or not I agreed with each. I was in high school at the time, and think I left the book at home when I left for college. I will see if it is still there.

Though I also agree with the comment from handoflixue that making the predictions is much harder than judging them.

Comment author: MTGandP 17 January 2013 12:09:23AM 2 points [-]

Very cool! I'd love to see that. What year did you do this?

Comment author: handoflixue 15 January 2013 08:48:52PM 13 points [-]

In fairness, it would seem that simply coming up with the prediction is probably a lot of the work.

As a metaphor: it's relatively easy to walk non-experts through a proof of Gödel's Incompleteness Theorem. The hard part is often coming up with the idea in the first place, or proving its correctness; simply agreeing on a proof or theory is vastly easier :)

Comment author: TylerJay 13 April 2014 05:57:54AM 1 point [-]

For anyone who hasn't read it, see locating the hypothesis.

Some of the predictions are affected by this more than others, but it's hard to judge in any case. For example, the "nanotechnology is prevalent" hypothesis wouldn't be that hard to locate, given that a lot of people were talking about nanotechnology at the time. Then it's just a matter of deciding yes or no based on evidence and your model(s). On the other hand, something like his prediction that "Personal computers with high resolution interface embedded in clothing and jewelry, networked in Body LAN’s," while wrong, is much harder to locate in the hypothesis space.

Comment author: shminux 15 January 2013 04:44:58PM 12 points [-]

even a true rate of 30% is much higher than chance.

What is the chance rate, and how do you calculate it?

Comment author: Stuart_Armstrong 15 January 2013 05:51:14PM 11 points [-]

Subjective impression. The predictions are so varied and sometimes so ambiguous, there's no decent way of establishing a base rate. But picking some predictions at random, they appear to be quite specific (certainly better than a dart throwing chimp):

"For the most part, these truly personal computers have no moving parts." "Unused computes on the Internet are being harvested, creating virtual parallel supercomputers with human brain hardware capacity." "The security of computation and communication is the primary focus of the U.S. Department of Defense." "Military conflicts between nations are rare, and most conflicts are between nations and smaller bands of terrorists."

Comment author: alex_zag_al 16 January 2013 07:28:14PM 2 points [-]

A chance rate isn't the right thing to compare to, I think. It would have to be randomly generated predictions, wouldn't it? But any non-expert human will do much better than that, since basic knowledge such as that the Earth will stay in orbit around the sun rules out most of these.

I think the right thing to compare to is if he did significantly better than I would have. Which he probably did, which means I can improve my vision of the future by reading Kurzweil.

Comment author: shminux 16 January 2013 08:31:34PM 2 points [-]

I think the right thing to compare to is if he did significantly better than I would have.

How do you know how you would have done? Have you tried?

Comment author: JoshuaFox 16 January 2013 07:59:29AM 12 points [-]

I'd also like to compare Kurzweil against the success rate of other predictors.

Some predictions might be very obvious.

Comment author: JoshuaFox 17 January 2013 08:07:26AM *  2 points [-]

I don't know which predictions, if any, are obvious, but by comparing Kurzweil to other predictors at the same time when he wrote the book (if there were any), we could see how much better he does.

Comment author: CarlShulman 16 January 2013 08:04:20PM 6 points [-]

Yes, if one has a source of abundant likely, obvious predictions one can arbitrarily 'juice' one's overall accuracy rate even if most of the surprising predictions go wrong. On the other hand, judging 'obviousness' in hindsight is very tricky.

One also has to pay attention to the independence of predictions. E.g. one could predict the continuation of Moore's Law as one prediction or as many predictions with connected answers: a prediction about chips in laptops, a prediction about chips in supercomputers, a prediction about the performance of algorithms with well-understood hardware scaling, etc. In the extreme, one could make 1000 predictions about computer performance in consecutive minutes, which would almost certainly rise or fall together.

Kurzweil's separate predictions aren't perfectly correlated (e.g. serial speed broke off from supercomputer performance in FLOPS) but many of them are far from independent.
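
A rough numerical sketch of both points, with made-up numbers. The function names and every count, hit rate and correlation below are illustrative assumptions, not figures from the assessment. The first function shows how padding a list with near-certain predictions lifts the overall hit rate. The second uses the standard equal-correlation formula n / (1 + (n - 1) * rho) for how many independent predictions a correlated batch is roughly worth.

```python
def expected_accuracy(groups):
    """Expected overall hit rate for a list of (count, hit_probability) groups."""
    total = sum(count for count, _ in groups)
    hits = sum(count * p for count, p in groups)
    return hits / total


def effective_count(n, rho):
    """Effective number of independent predictions among n predictions
    that all share the same pairwise correlation rho (0 <= rho <= 1)."""
    return n / (1 + (n - 1) * rho)


# Hypothetical predictor: 20 surprising predictions, right a quarter of the time.
surprising_only = expected_accuracy([(20, 0.25)])

# The same predictor after adding 80 obvious predictions that almost always hold.
padded = expected_accuracy([(20, 0.25), (80, 0.95)])

# 1000 per-minute predictions about computer performance, highly correlated.
minute_by_minute = effective_count(1000, 0.9)

print(surprising_only, padded, minute_by_minute)
```

With these assumed inputs the padded list scores far higher even though the surprising predictions are no better, and the 1000 correlated predictions carry barely more evidence than a single one.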

Comment author: TechnoToad 29 January 2013 10:13:35PM -1 points [-]

Carl is basically pointing out that assessing predictions is tricky business, because it's hard to be objective.

Here are a few points that need to be taken into account:

1. People have a lot to gain from being defensively pessimistic. It prevents them from being disappointed at some point in the future, while the option of being pleasantly surprised remains open. Being defensively pessimistic also prevents you from looking crazy to your peers. After all... who wants to be the only one in a group of 10 to think that by 2030 we'll have nanobots in our brains?

2. The poster assessed Kurzweil's predictions because he felt the need to do so. Why did he feel the need to do so? Is this about defensive pessimism?

3. It is safe to assume that a random selection of assessors would be biased towards judging 'False' for two obvious reasons. The first is the fact that they are uninformed about technology and simply aren't able to properly judge the lion's share of all predictions. The second is defensive pessimism.

4. Why is it judged that a 30% 'Strong True' is a weak score? In comparison to the predictions of futurologists before Kurzweil, 30% seems like an excellent score to me. It strikes me as a score that a man with a blurred vision of the future would have. But blurred vision of the future is all you can ever have. If the future were here, you'd be able to see it sharply in focus. Having blurred vision of the future is a real skill. Most people (SL0) have no vision of the future whatsoever.

5. How many years does a prediction have to be off in order for it to be wrong? How would you determine this number of years objectively?

6. Why did the assessors have to go with the 5-step-true-to-false system? Is that really the best way of assessing a futurologist's predictions? I understand that we are a group of rational people here, but sometimes, you've gotta let go of the numbers, the measurements, the data and the whole binary thinking. Sometimes, you have to go back to just being a guy with common sense.

Take Kurzweil's long-standing predictions for solar power, for example. He's been predicting for years that the solar tipping point would be around 2010. Spain hit grid parity in 2012 and news outlets are saying that the USA and parts of Europe will hit grid parity in 2013.

Calling Kurzweil's prediction on solar power wrong just because it's happening 2 to 3 years after 2010 is wrong in my opinion.

Kurzweil deserves some slack here. In the 1980s he predicted a computer would beat a human chess player in 1998. And that ended up happening a year earlier in 1997.

Kurzweil has blurry vision of the future. He might be a genius, but he is also just a human being that doesn't have anything better to go on than big data. Simple as that.

Instead of bickering about his predictions, we would do better to just look at the big picture of things.

Nanotech, wireless, robotics, biotech, AI... all of it is happening.

And be honest about Google's self-driving car, which came out 2 years ago already: that was just an unexpected leap into the future right there!

I don't think Kurzweil himself saw self-driving cars coming in 2011 already.

And to really hammer the point home, the self-driving car had thousands of registered miles when it was introduced at the start of 2011. So it was probably already finished in 2010.

For all we know, the Singularity will occur in 2030. We just don't know.

Kurzweil has brought awareness to the world. Rather than sit around and count all the right and the wrong ones as the years pass by, the world would do better if it tried turning those predictions into self-fulfilling prophecies.

Comment author: MichaelAnissimov 16 January 2013 06:33:43PM 0 points [-]

Which predictions are very obvious?

Comment author: EricHerboso 22 January 2013 06:59:32PM 3 points [-]

As a (perhaps) trivial example, consider the pair of predictions:

  • "Intelligent roads are in use, primarily for long-distance travel."
  • "Local roads, though, are still predominantly conventional."

As one of the people who participated in this study, I marked the first as false and the second as true. Yet the second "true" prediction seems like it is only trivially true. (Or perhaps not; I might be suffering from hindsight bias here.)

Comment author: V_V 30 January 2013 01:43:52PM 0 points [-]

But why was this counted as two separate predictions? The two statements are even syntactically linked by the "though" conjunction.

Comment author: TheOtherDave 30 January 2013 02:56:08PM *  2 points [-]

Why oughtn't it be? The construction "A, though B" is an independent assertion of A and B. Syntactic linkage is not enough to establish contingency.

It is not like "A, because B" for example, where it's arguably unfair if B and A are both false to count it as two failures... in that case, the claim of A can be seen as contingent on the claim of B, and not independent.

To put this differently, "A, though B" makes the following claims:
A
B
You might (mistakenly) expect -B given A, which is why I mention B explicitly.

Whereas "A, because B" makes the following claims:
B
If B, then A

If A happens in the first case, the first claim is correct. If B happens, the second is correct. If both happen, both claims are correct.

If A happens in the second case but B doesn't, the first claim is incorrect and the second claim is unevaluatable.

(I suppose one could argue that the second case implicitly claims "if -B, then -A" as well... "because" is somewhat ambiguous in English.)

Comment author: Kindly 30 January 2013 03:06:32PM 2 points [-]

This is only a problem because we haven't been comparing the relative "difficulty" of predictions. Admittedly this is hard to do; but I think it's clear that:

  1. "Intelligent roads are in use, primarily for long-distance travel." is a much more ambitious prediction than "Local roads, though, are still predominantly conventional."

  2. Treating the two statements as a single prediction "A, though B" is more ambitious than either, and should be worth as many points as the two of them combined.

Moreover, any partial credit for "A, though B" would take into account that B happened though A didn't. Or rather, a prediction that intelligent roads are only somewhat in use should receive more credit than a prediction that intelligent roads are ubiquitous.
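
One possible way to make "difficulty" concrete, offered as a sketch and not as the scoring used in this assessment: award a correct prediction -log2(p) points, where p is the probability a contemporary would have given it. The two probabilities below are invented for illustration, and the joint figure assumes the two statements are independent, which is itself debatable. Under those assumptions the combined prediction is worth exactly the sum of its parts, matching point 2 above.

```python
import math


def points(prior):
    """Credit in bits for a correct prediction that had probability `prior`
    beforehand: the less expected the outcome, the more it earns."""
    return -math.log2(prior)


# Hypothetical priors a contemporary might have assigned (assumptions).
p_intelligent_roads = 0.05
p_local_roads_conventional = 0.95

ambitious = points(p_intelligent_roads)
trivial = points(p_local_roads_conventional)

# "A, though B" treated as one prediction, assuming A and B are independent.
combined = points(p_intelligent_roads * p_local_roads_conventional)

print(ambitious, trivial, combined)
```

The hard part remains what the comment already says it is: choosing the priors, especially in hindsight.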

Comment author: TheOtherDave 30 January 2013 05:33:19PM 0 points [-]

Agreed that understanding the "difficulty" of a prediction is key if we're going to evaluate the reliability of a predictor in a useful way.

Comment author: EricHerboso 31 January 2013 08:47:05PM 1 point [-]

In the future, we might distinguish "difficult" predictions from trivial ones by seeing if the predictions are unlike the predictions made by others at the same time. This is easy to do if we evaluate contemporary predictions.

But I have no idea how to accomplish this when looking back on past predictions. I can't help but feel that some of Kurzweil's predictions are trivial, yet how can we tell for sure?

Comment author: V_V 30 January 2013 05:13:22PM *  0 points [-]

Well, if you analyze the statements in terms of propositional logic, then all the English language conjunctions "and", "but", "though", etc. map to the only type of logical conjunction ∧.

But natural language is richer than (directly mapped) propositional logic. I interpret the statement "Local roads, though, are still predominantly conventional." as a clarification of "Intelligent roads are in use, primarily for long-distance travel.".

Formally, if you just claim:
"Intelligent roads are in use, primarily for long-distance travel."
it is equivalent to:
"Intelligent roads are in use, primarily for long-distance travel." ∧ ( "Local roads are still predominantly conventional" ∨ ¬"Local roads are still predominantly conventional" )
which is different from
"Intelligent roads are in use, primarily for long-distance travel." ∧ "Local roads are still predominantly conventional"

However, we can assume that if you claim:
"Intelligent roads are in use, primarily for long-distance travel."
you also wanted to communicate that
"Local roads are still predominantly conventional"
not that you are undecided between
"Local roads are still predominantly conventional", ¬"Local roads are still predominantly conventional"
otherwise you would have probably stated that explicitly.

Therefore, the information content of:
"Intelligent roads are in use, primarily for long-distance travel."
and:
"Intelligent roads are in use, primarily for long-distance travel. Local roads, though, are still predominantly conventional."
is roughly the same.
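
The purely formal part of the argument above can be checked with a small truth table. This is only a sketch: the function names are made up here, `a` stands for "intelligent roads are in use" and `b` for "local roads are still predominantly conventional". It covers the logical step only; the step about what a speaker intends to communicate is not something a truth table can show.

```python
from itertools import product


def bare_claim(a, b):
    """A on its own, written as A and (B or not B)."""
    return a and (b or not b)


def explicit_claim(a, b):
    """A and B stated together."""
    return a and b


# Enumerate every combination of truth values for A and B.
rows = [
    (a, b, bare_claim(a, b), explicit_claim(a, b))
    for a, b in product([True, False], repeat=2)
]

# Assignments where the two formalizations disagree.
differing = [(a, b) for a, b, bare, explicit in rows if bare != explicit]

for row in rows:
    print(row)
print(differing)
```

The two formulas disagree only when A is true and B is false, which is exactly the case the argument says a speaker asserting A alone would not have meant to leave open.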

Comment author: HoverHell 15 January 2013 04:42:29PM *  1 point [-]

these were not binary yes/no predictions

And how would it be most appropriate to correct for that? Normalizing against random choice over all alternative predictions (those that were made or that one could come up with)?

(with non-binary those graphs, as it seems to me, get relatively useless)

Comment author: CarlShulman 16 January 2013 08:14:22PM 3 points [-]

(with non-binary those graphs, as it seems to me, get relatively useless)

They are at least fairly comparable to the format in Kurzweil's self-assessment, and so useful for putting the high accuracy ratings reported there into perspective.

Comment author: AnthonyC 16 January 2013 01:39:16PM 3 points [-]

Estimate the complexity in bits of each prediction? ;)

Comment author: Stuart_Armstrong 24 January 2013 11:18:29AM 1 point [-]

How complex, in bits, is: "it will rain in Oxford at some point this year"? Very. And yet, I would hesitate to call that an impressive prediction...