I copied and pasted the "Time To AI" chart and did some simple graphic manipulations to make the vertical and horizontal axes equal, extend the X-axis, and draw diagonal lines "down and to the right" to show which points predicted which dates. It was an even more interesting graphic that way!
It sort of looked like four or five gaussians representing four or five distinct theories were on display. All the early predictions (I assume that first one is Turing himself) go with a sort of "robots by 2000" prediction scheme that seems consistent with The Jetsons and what might have happened without "the great stagnation". All of the espousers of this theory published before the AI winter and you can see a gap in predictions being made on the subject from about 1978 to about 1994. Predicting AGI arrival in 2006 was never trendy; it seems to have always been predicted earlier or later.
The region from 2015 through 2063 has either one or two groups betting on it because instead of "gaussian-ish" it is strongly weighted towards the front end, suggesting perhaps a bimodal group that isn't easy to break into two definite groups. One hump sometimes predicts dates out as late as the 2050's, but the main group really likes the 2020's and 2030's. The first person to express anything like this theory was an expert in about 1979 (before the AI winter really set in, which is interesting) and I'm not sure who it was off the top of my head. There's a massive horde expressing this general theory, but they seem to have come in a wave of non-experts during the dotcom bubble (predicting early-ish) and then there's a gap in the aftermath of the bubble, then a wave of experts predicting a bit later.
Like 2006, the year 2072 is not very trendy for AGI predictions. However, around 2080 to 2110 there seems to be a cluster that was led by three non-expert opinions expressed in 1999 to 2003 (i.e. the dotcom bubble aftermath). A few years later five experts chime in to affirm the theory. I don't recognize the theory by name or rhetoric but my rough label for their theory might be "the singularity is late" just based on the sparse data.
The final coherent theory seems to be four people predicting "2200"; my guess here is just that it's really far in the future and a nice round number. Two of the four are experts and two are non-experts. It looks like two pre-bubble and two post-bubble?
For what it's worth, eyeballing my re-worked "Time to AI" figure indicates a median of about 2035, and my last moderately thoughtful calculation gave a median arrival of AGI at about 2037, with later arrivals being more likely to be "better" and, in the meantime, prevention of major wars or arms races being potentially more important to work on than AGI issues. The proximity of these dates to the year 2038 is pure ironic gravy, though I have always sort of suspected that one chunk of probability mass should take the singularity seriously because if it happens then it will be enormously important, while another chunk of probability mass should be methodologically mindful of the memetic similarities between the Y2K Bug and the Singularity (i.e. both of them being non-supernatural computer-based eschatologies which, whatever their ultimate truth status, would naturally propagate in roughly similar ways before the fact was settled).
How many degrees of freedom does your "composition of N theories" theory have? I'm not inclined to guess, since I don't know how you went about this. I just want to point out that 260 is not many data points; clustering is very likely going to give highly non-reproducible results unless you're very careful.
EDIT: Thanks to Kaj's work, we now have more rigorous evidence on the "Maes-Garreau law" (the idea that people will predict AI coming before they die). This post has been updated with extra information. The original data used for this analysis can now be found through here.
Thanks to some sterling work by Kaj Sotala and others (such as Jonathan Wang and Brian Potter - all paid for by the gracious Singularity Institute, a fine organisation that I recommend everyone look into), we've managed to put together a database listing all AI predictions that we could find. The list is necessarily incomplete, but we found as much as we could, and collated the data so that we could have an overview of what people have been predicting in the field since Turing.
We retained 257 predictions total, of various quality (in our expanded definition, philosophical arguments such as "computers can't think because they don't have bodies" count as predictions). Of these, 95 could be construed as giving timelines for the creation of human-level AIs. And "construed" is the operative word - very few were in a convenient "By golly, I give a 50% chance that we will have human-level AIs by XXXX" format. Some gave ranges; some were surveys of various experts; some predicted other things (such as child-like AIs, or superintelligent AIs).
Where possible, I collapsed these down to a single median estimate, making some somewhat arbitrary choices and judgement calls. When a range was given, I took the mid-point of that range. If a year was given with a 50% likelihood estimate, I took that year. If it was the collection of a variety of expert opinions, I took the prediction of the median expert. If the author predicted some sort of AI by a given date (partial AI or superintelligent AI), I took that date as their estimate rather than trying to correct it in one direction or the other (there were roughly the same number of subhuman AIs as superhuman AIs in the list, and not that many of either). I read extracts of the papers to make judgement calls when interpreting problematic statements like "within thirty years" or "during this century" (is that a range or an end-date?).
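The collapsing rules above can be written down as a small function. This is only a sketch: the record format (a `kind` label plus `values`), the function name, and the example years are hypothetical, and the judgement calls on ambiguous statements cannot be captured in code.

```python
from statistics import median


def collapse_estimate(kind, values):
    """Collapse one prediction to a single year.

    `kind` and `values` are a hypothetical record format, not the
    layout of the actual database.
    """
    if kind == "range":
        # (start_year, end_year) -> mid-point of the range
        start, end = values
        return (start + end) / 2
    if kind == "fifty_percent":
        # a year given with a 50% likelihood estimate -> that year
        return values
    if kind == "survey":
        # years from a collection of experts -> the median expert
        return median(values)
    if kind == "other_ai":
        # date for partial or superintelligent AI, taken uncorrected
        return values
    raise ValueError(f"unknown prediction kind: {kind}")


# Illustrative inputs only; these years are made up.
print(collapse_estimate("range", (2030, 2050)))
print(collapse_estimate("survey", [2030, 2040, 2100]))
```

Statements like "within thirty years" would still need a human to decide whether they are a range or an end-date before they fit any of these cases.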
So some biases will certainly have crept in during the process. That said, it's still probably the best data we have. So keeping all that in mind, let's have a look at what these guys said (and it was mainly guys).
There are two stereotypes about predictions in AI and similar technologies. The first is the Maes-Garreau law: technologies are supposed to arrive... just within the lifetime of the predictor!
The other stereotype is the informal 20-30 year range for any new technology: the predictor knows the technology isn't immediately available, but puts it in a range where people would still be likely to worry about it. And so the predictor gets kudos for addressing the problem or the potential, and is safely retired by the time it (doesn't) come to pass. Is either of these stereotypes borne out by the data? Well, here is a histogram of the various "time to AI" predictions:
As can be seen, the 20-30 year stereotype is not exactly borne out - but a 15-25 one would be. Over a third of predictions are in this range. If we ignore predictions more than 75 years into the future, 40% are in the 15-25 range, and 50% are in the 15-30 range.
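For anyone wanting to recompute these shares from the data, the calculation is a simple filtered count. The sketch below assumes inclusive window bounds and a plain list of "time to AI" values in years; the function name and the toy list are hypothetical and are not the real data.

```python
def share_in_window(times_to_ai, low, high, max_horizon=None):
    """Fraction of predictions with low <= time to AI <= high.

    If max_horizon is given, predictions further out than that many
    years are dropped before the fraction is computed.
    """
    kept = [t for t in times_to_ai if max_horizon is None or t <= max_horizon]
    if not kept:
        raise ValueError("no predictions left after applying max_horizon")
    hits = sum(1 for t in kept if low <= t <= high)
    return hits / len(kept)


# Toy values for illustration only (years from prediction to predicted AI).
toy_times = [5, 10, 15, 18, 20, 25, 30, 50, 100, 200]
print(share_in_window(toy_times, 15, 25))
print(share_in_window(toy_times, 15, 25, max_horizon=75))
```

Whether a prediction of exactly 25 or 30 years counts as inside the window is a choice; the inclusive bounds here are an assumption, not necessarily what was used for the figures above.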
Apart from that, there is a gradual tapering off, a slight increase at 50 years, and twelve predictions beyond three quarters of a century. Eyeballing this, there doesn't seem to be much evidence for the Maes-Garreau law. Kaj looked into this specifically, plotting (life expectancy) minus (time to AI) versus the age of the predictor; the Maes-Garreau law would expect the data to be clustered around the zero line:
Most of the data seems to be decades out from the zero point (note the scale on the y axis). You could argue, possibly, that fifty-year-olds are more likely to predict AI just within their lifetime, but this is a very weak effect. I see no evidence for the Maes-Garreau law - of the 37 predictions Kaj retained, only 6 predictions (16%) were within five years (in either direction) of the expected death date.
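The check itself is easy to reproduce. In the sketch below, "life expectancy" is read as the predictor's remaining expected years at the time of the prediction, which is an assumption about how the plot was built; the function names and the toy gaps are hypothetical.

```python
def maes_garreau_gap(remaining_life_expectancy, time_to_ai):
    """Years between the predictor's expected death and predicted AI.

    Zero means AI is predicted exactly at the expected death date;
    positive means AI is predicted to arrive before it.
    """
    return remaining_life_expectancy - time_to_ai


def share_near_zero(gaps, tolerance=5):
    """Fraction of gaps within `tolerance` years of zero, either direction."""
    return sum(1 for g in gaps if abs(g) <= tolerance) / len(gaps)


# Toy gaps for illustration only.
print(share_near_zero([0, 4, -5, 10, -30]))

# The figure quoted above: 6 of the 37 retained predictions.
print(round(100 * 6 / 37))  # 16
```

If the Maes-Garreau law held strongly, the share near zero would be expected to be much larger than 16%.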
But not all predictions are created equal. 62 of the predictors were labelled "experts" in the analysis - these had some degree of expertise in fields that were relevant to AI. The other 33 were amateurs - journalists, writers and such. Decomposing into these two groups showed very little difference, though:
The only noticeable difference is that amateurs lacked the upswing at 50 years, and were relatively more likely to push their predictions beyond 75 years. This does not look like good news for the experts - if their performance can't be distinguished from amateurs, what contribution is their expertise making?
But I've been remiss so far - combining predictions that we know are false (because their deadline has come and gone) with those that could still be true. If we look at predictions that have failed, we get this interesting graph:
This looks very similar to the original graph. The main difference being the lack of very long range predictions. This is not, in fact, because there has not yet been enough time for these predictions to be proved false, but because prior to the 1990s, there were actually no predictions with a timeline greater than fifty years. This can best be seen on this scatter plot, which plots the time predicted to AI against the date the prediction was made:
As can be seen, as time elapses, people become more willing to predict very long ranges. But this is something of an artefact - in the early days of computing, people were very willing to predict that AI was impossible. Since this didn't give a timeline, their "predictions" didn't show up on the graph. In recent times, people seem a little less likely to claim AI is impossible, replaced by these "in a century or two" timelines.
Apart from that one difference, predictions look remarkably consistent over the span: modern predictors are claiming about the same time will elapse before AI arrives as their (incorrect) predecessors. This doesn't mean that the modern experts are wrong - maybe AI really is imminent this time round, maybe modern experts have more information and are making more finely calibrated guesses. But in a field like AI prediction, where experts lack feedback for their pronouncements, we should expect them to perform poorly, and for biases to dominate their thinking. This seems the likely hypothesis - it would be extraordinarily unlikely that modern experts, free of biases and full of good information, would reach exactly the same prediction distribution as their biased and incorrect predecessors.
In summary: