Dan Luu on Futurist Predictions

I bought a subscription and tracked down the offending LW comment:

Another fundamental issue with the analysis is that it relies on aggregating votes of a kind from Less Wrong readers and the associated community. As we discussed here, it's common to see the most upvoted comments in forums like HN, lobsters, LW, etc., be statements that can clearly be seen to be wrong with no specialized knowledge and a few seconds of though (and an example is given from LW in the link), so why should an aggregation of votes from the LW community be considered meaningful?

He doesn't actually give a link to a LW comment, but he describes it. He says Jeff Kaufman asked why there are so few 6-door cars, and the top comment said that doors are an expensive part of the car, but this is obviously false because (a) they can't be thousands of dollars each and that's what it would take to make them a noticeable fraction of the cost, and (b) if that were true we'd expect cheap cars to more often have two doors instead of four, but instead cheap cars usually have four doors and if anything it's the expensive sports cars that have two.

Tracking down the original post, it appears to be this one. Top comment is roughly as described. It has 5 upvotes.

What's my overall take? Well, I don't think the explanation is as obviously false as Dan Luu thinks. But I do agree that a and b are good objections to it.

[-]gjm3y155

It's not obvious to me that (a,b) are such good objections, and furthermore the comment doesn't just say "it's because doors are expensive".

The comment in question says: doors cost money and worsen crash safety, so cars with more doors would cost more, and most buyers don't value the extra convenience enough to justify what it would cost them, so the market would be small which would make it not worth the development cost.

I agree that it's unlikely that the cost of an extra pair of doors is multiple thousands of dollars per car. But the price is, let's say, 3x the cost, and I don't have any trouble believing that an extra pair of doors might increase the cost by say $700, meaning a price $2000 higher. Is this "clearly seen to be wrong with no specialized knowledge"? Doesn't seem so. So, then the question is how willing buyers with large families would be to pay an extra $2000 for the convenience of a third door. Again, it's not clear to me that this wouldn't hurt the sellability of the vehicle. The idea that making such a vehicle as safe as everyone expects these days could be difficult also seems plausible to me. I would expect that families with multiple smallish children are (1) the main potential market for 6-door cars, (2) very safety-conscious, and (3) often very price-conscious.

I guess point (b) is meant to undermine the idea that adding doors increases the cost at all, or something. I'm not convinced by this. Isn't it plausible that the gain in convenience in going from 2 to 4 doors is substantially bigger than the gain in going from 4 to 6? Or that the loss in safety in going from 2 to 4 is smaller than going from 4 to 6? ... And some cheap-and-nasty cars have had only two doors. For instance, if I try to think of a cheap and crappy car, the first thing that comes to mind is the old Reliant Robin: four seats, two doors. (Also, three wheels.)

i don't know whether that comment is right. But I don't see how danluu reckons it's obviously wrong with a few seconds of thought. I wonder whether danluu might change his mind about its obvious wrongness if he thought about it for more than a few seconds.

[-]Daniel Kokotajlo3y113

Oh yeah, I should have clarified that I agree with your take -- Dan seems totally wrong to take this comment as significant negative evidence about the epistemic standards of LW. Waaaay too big of a stretch. It's just one comment with 5 upvotes, plus it's not obviously wrong & may even be right.

[-]lc3y*30

Credit where it is due. It's a good article.

[-]Multicore3y10

On the first read I was annoyed at the post for criticizing futurists for being too certain in their predictions, while it also throws out and refuses to grade any prediction that expressed uncertainty, on the grounds that saying something "may" happen is unfalsifiable.

On reflection these two things seem mostly unrelated, and for the purpose of establishing a track record "may" predictions do seem strictly worse than either predicting confidently (which allows scoring % of predictions right), or predicting with a probability (which none of these futurists did, but allows creating a calibration curve).

[-]ersatz3y10

An interesting section in the appendices, a criticism of Ajeya Cotra’s “Forecasting Transformative AI with Biological Anchors”:

If you do a sensitivity analysis on the most important variable (how much Moore's law will improve FLOPS/$), the output behavior doesn't make any sense, e.g., Moore's law running out of steam after "conventional" improvements give us a 144x improvement would give us a 34% chance of transformative AI (TAI) by 2100, a 144*6x increase gives a 52% chance, and a 144*600x increase gives a 66% chance (and with the predicted 60000x improvement, there's a 78% chance), so the model is, at best, highly flawed unless you believe that going form a 144x improvement to a 144*6x improvement in computer cost gives almost as much increase in the probability of TAI as a 144*6x to 144*60000x improvement in computer cost.
The part about all of this that makes this fundamentally the same thing that the futurists here did is that the estimate of the FLOPS/$ which is instrumental for this prediction is pulled from thin air by someone who is not a deep expert in semiconductors, computer architecture, or a related field that might inform this estimate.
[...]
If you say that, based on your intuition, you think there's some significant probability of TAI by 2100; 10% or 50% or 80% or whatever number you want, I'd say that sounds plausible but wouldn't place any particular faith in the estimate. But if you take a model that produces nonsense results and then pick an arbitrary input to the model that you have no good intuition about to arrive at an 80% chance, you've basically picked a random number that happens to be 80%.

[-]paulfchristiano3y*288

The claim that the probability goes from 34% -> 52% from a 6x of compute does sound pretty weird! But I think it's just based on a game of telephone and a complete misunderstanding.

I was initially confused where the number came from, then I saw the reference to Nostalgebraist's post. They say that "Assume a 6x extra speedup, and you get a 52% chance. (Which is still pretty high, to be fair.) Assume no extra speedup, and also no speedup at all, just the same computers we have now, and you get a 34% chance … wait, what?!"

Nostalgebraist is saying that you move from 34% to 52% by moving from 1x and 144*6x---not by moving from 144x to 144*6x. That is, this if you increase compute by about 3 OOMs you increase the probability from 34% to 52%.

Similarly, if you increase probability by 14*6x to 144*60000x, or about 4OOMs, you increase probability from 52% to 78%.

So 3 OOMs is 18% and 4 OOMs is 26%, roughly proportional as you'd expect given the nature of the model. The report basically distributes TAI over 20 OOMs and so a 3 OOM increase covers about 3/20th of the range.

But if you take a model that produces nonsense results and then pick an arbitrary input to the model that you have no good intuition about to arrive at an 80% chance, you've basically picked a random number that happens to be 80%.

If you get a nonsensical number out of a model, I think it's worth reflecting more on whether there was a misunderstanding.

Aside from this, calling "how far does Moore's law go" the most important variable seems kind of overstated. The criticism is that 7 orders of magnitude in this parameter leads to a change from 34% to 78%. I agree that's a significant difference, but 7 orders of magnitude is a lot of uncertainty in this parameter, and I don't think that's grounds for saying that it's the number that drives the whole estimate. And even after 7 OOMs these estimates aren't even that different in an action-relevant way---in particular this change doesn't result in a similarly-dramatic change for your 10 year or 20 year timelines, and shifting your 100 year TAI probability from 55% to 78% is not a huge deal.

And aside from that, saying that the estimates for Moore's law are arbitrary isn't right. I think it's totally fair that Ajeya isn't an expert, but that doesn't mean that things are totally unknown within 7 orders of magnitude. At the upper end things are pretty constrained by basic physics, at the lower end things are pretty constrained by normal technological extrapolation. There's a ton of uncertainty left but it's just not a big deal relative to the uncertainty about AI training.

The overall estimate is basically driven by the fact that a broad distribution over horizon lengths in the existing NN extrapolation gives you a similar range of estimates to the entire space from human lifetime to human evolution. So it's very easy to squint and get a broad distribution with around 5% probability per OOM of compute (which is a couple percent per year right now). The criticism of this that seems most plausible to me is that maybe inside-view you can just eyeball how good AI systems are and how close they are to transformative effects and it's just not that far. That said, the second most plausible criticism (especially about the 20%+ short-term predictions) is that you can eyeball how good AI systems are and it's probably not that close.

(Disclaimer: this report is written by my wife and so I may be biased.)

[-]Daniel Kokotajlo3y80

FWIW I'm not married to Ajeya and I agree with you; I was pretty disappointed by Nostalgebraist's post & how much positive reception it seemed to get. I've been thinking about writing up a rebuttal. Most of what I'd say is what you've already said here though, so yay.

[-]ESRogs3y80

I pointed out the OOM error on Twitter (citing this comment), and Dan has updated the post with a correction.

^{^}

Someone in 1960 predicting an unlikely outcome in 2010, and that outcome actually occuring in 2011, is technically "wrong", but a very different kind of wrong from someone in 1960 predicting an unlikely outcome in 1965 but that outcome not having occurred yet at all.

^{^}

Other people's evaluations of predictions are not, in fact, especially solid pointers to the truth-value of those predictions. Given the subject of the article I think Dan probably appreciates this point.

^{^}

Three of the basic aggregations suggested by Dan as being even minimally informative.

^{^}

The fact that the modal grade for question 47 (about cochlear implants) was a 1, while the mean was ~2.5 (with many 4s and 5s), is mostly an indication that the prediction was underspecified, and that the graders in question had very different ideas in mind of what "very effective" and "widely used" meant (or had similar ideas but didn't bother looking up the actual numbers).

^{^}

In the sense that most people were not making similar predictions at the time, and priors on those predictions were probably low.

^{^}

Quote: "I find it a bit odd that, with all of the commentary of these LW posts, few people spent the one minute (and I mean one minute literally — it took me a minute to read the post, see the comment Armstrong made which is a red flag, and then look at the raw data) it would take to look at the data and understand what the post is actually saying, but as we've noted previously, almost no one actually reads what they're citing."

^{^}

For reasons that at least aren't obviously wrong, though I think foregoing an inside-view opinion while simultaneously delivering an outside-view refutation is not enormously productive.

MEAN < 2	8
MEAN < 1.5	3
MEDIAN == 1	6
MODE == 1	14

LESSWRONG
is fundraising!
LW

LESSWRONG
is fundraising!
LW

51

Dan Luu on Futurist Predictions

51

51