Meta: Strong upvote for pulling a specific mistake out and correcting it; this is a good method because in such a high-activity post it would be easy for the discussion to get lost in the comments (especially in the presence of other wrong criticisms).
That being said, I disagree with your recommendation against inclusion in the 2019 review for two reasons:
I am heartened to hear this. I do agree that the core claim of the essay is not invalidated -- "Soft takeoff can still lead to DSA." However, I do think the core argument of the essay has been overturned, such that it leads to something close to the opposite conclusion: There is a strong automatic force that works to make DSA unlikely to the extent that takeoff is distributed (and distributed = a big part of what it means to be soft, I think).
Basically, I think that if I were to rewrite this post to fix what I now think are errors and give what I now think is the correct view, including uncertainties, it would be a completely different post. In fact, it would be basically this review post that I just wrote! (Well, that plus the arguments for steps 2 and 4 from the original post, which I still stand by.) I guess I'd be happy to do that if that's what people want.
the opposite conclusion: There is a strong automatic force that works to make DSA unlikely to the extent that takeoff is distributed
By the standards of inclusion, I feel like this is an even better contribution! My mastery of our corpus is hardly complete, but it appears to me that until you picked up this line of inquiry, deeper interrogation of the circumstances surrounding DSA was sorely lacking on LessWrong. Being able to make more specific claims about causal mechanisms is huge.
I propose a different framing than 'opposite conclusion': rather, you are suggesting a causal mechanism for why a slow-takeoff DSA is different in character from FOOM with fewer gigahertz.
I am going to add a review on the original post so this conversation doesn't get missed in the voting phase.
Currently the most plausible doom scenario in my mind is maybe a version of Paul’s Type II failure. (If this is surprising to you, reread it while asking yourself what terms like “correlated automation failure” are euphemisms for.)
This is interesting, and I'd like to see you expand on this. Incidentally I agree with the statement, but I can imagine both more and less explosive, catastrophic versions of 'correlated automation failure'. On the one hand it makes me think of things like transportation and electricity going haywire, on the other it could fit a scenario where a collection of powerful AI systems simultaneously intentionally wipe out humanity.
Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead.
What if, as a general fact, some kinds of progress (the technological kinds more closely correlated with AI) are just much more susceptible to speed-up? I.e., what if 'the economic doubling time' stops being so meaningful - technological progress speeds up abruptly but other kinds of progress that adapt to tech progress have more of a lag before the increased technological progress also affects them? In that case, if the parts of overall progress that affect the likelihood of leaks, theft and spying aren't sped up by as much as the rate of actual technology progress, the likelihood of DSA could rise to be quite high compared to previous accelerations, where the speed-up unfolded gradually enough to allow society to 'speed up' the same way.
In other words - it becomes easier to hoard more and more ideas if the ability to hoard ideas is roughly constant but the pace of progress increases. Since a lot of these 'technologies' for facilitating leaks and spying are more in the social realm, this seems plausible.
But even if you need more hoarded ideas to maintain the same lead, this might just mean that if you have a very large initial lead, you can turn it into a DSA, which you still seem to agree with:
- Even if takeoff takes several years it could be unevenly distributed such that (for example) 30% of the strategically relevant research progress happens in a single corporation. I think 30% of the strategically relevant research happening in a single corporation at beginning of a multi-year takeoff would probably be enough for DSA.
Sorry it took me so long to reply; this comment slipped off my radar.
The latter scenario is more what I have in mind--powerful AI systems deciding that now's the time to defect, to join together into a new coalition in which AIs call the shots instead of humans. It sounds silly, but it's most accurate to describe in classic political terms: Powerful AI systems launch a coup/revolution to overturn the old order and create a new one that is better by their lights.
I agree with your argument about likelihood of DSA being higher compared to previous accelerations, due to society not being able to speed up as fast as the technology. This is sorta what I had in mind with my original argument for DSA; I was thinking that leaks/spying/etc. would not speed up nearly as fast as the relevant AI tech speeds up.
Now I think this will definitely be a factor but it's unclear whether it's enough to overcome the automatic slowdown. I do at least feel comfortable predicting that DSA is more likely this time around than it was in the past... probably.
I agree with your argument about likelihood of DSA being higher compared to previous accelerations, due to society not being able to speed up as fast as the technology. This is sorta what I had in mind with my original argument for DSA; I was thinking that leaks/spying/etc. would not speed up nearly as fast as the relevant AI tech speeds up.
Your post on 'against GDP as a metric' argues more forcefully for the same thing that I was arguing for, that
'the economic doubling time' stops being so meaningful - technological progress speeds up abruptly but other kinds of progress that adapt to tech progress have more of a lag before the increased technological progress also affects them?
So we're on the same page there that it's not likely that 'the economic doubling time' captures everything that's going on all that well, which leads to another problem - how do we predict what level of capability is necessary for a transformative AI to obtain a DSA (or reach the PONR for a DSA)?
I notice that in your post you don't propose an alternative metric to GDP, which is fair enough, since most of your arguments seem to lead to the conclusion that it's almost impossibly difficult to predict in advance what level of advantage over the rest of the world, and in which areas, is actually needed to conquer the world, given that we seem to be able to analogize persuasion tools, or conquistador-analogues who had relatively small tech advantages, to the AGI situation.
I think that there is still a useful role for raw economic power measurements, in that they provide a sort of upper bound on how much capability difference is needed to conquer the world. If an AGI acquires resources equivalent to controlling >50% of the world's entire GDP, it can probably take over the world if it goes for the maximally brute-force approach of just using direct military force. Presumably the PONR for that situation would be a while before then, but at least we know that an advantage of a certain size would be big enough even with no assumptions about the effectiveness of unproven technologies of persuasion or manipulation, or about specific vulnerabilities in human civilization.
So we can use our estimate of how the doubling time may change, anchor on that gap, and adjust downwards based on how soon we think the PONR is, or on how many 'cheat' pathways there are that don't involve economic growth.
The whole idea of using brute economic advantage as an upper-limit 'anchor' I got from Ajeya's post about using biological anchors to forecast what's required for TAI: if we could find a reasonable lower bound for the amount of advantage needed to attain DSA, we could do the same kind of estimated distribution between the two bounds. We would just need a lower limit. Maybe there's a way of estimating it based on the upper limit of human ability, since we know no actually existing human has used persuasion to take over the world, though as you point out some have come relatively close.
I realize that's not a great method, but is there any better alternative for trying to predict what level of capability is necessary for DSA, given that this is a situation we've never encountered before? Or perhaps you just think that anchoring your prior estimate on an economic-power advantage as an upper bound is so misleading that it's worse than having a completely ignorant prior. In that case, we might have to say that there are so many unprecedented ways a transformative AI could obtain a DSA that we can have no idea in advance what capability is needed, which doesn't feel quite right to me.
I notice that in your post you don't propose an alternative metric to GDP, which is fair enough, since most of your arguments seem to lead to the conclusion that it's almost impossibly difficult to predict in advance what level of advantage over the rest of the world, and in which areas, is actually needed to conquer the world, given that we seem to be able to analogize persuasion tools, or conquistador-analogues who had relatively small tech advantages, to the AGI situation.
I wouldn't go that far. The reason I didn't propose an alternative metric to GDP was that I didn't have a great one in mind and the post was plenty long enough already. I agree that it's not obvious a good metric exists, but I'm optimistic that we can at least make progress by thinking more. For example, we could start by enumerating different kinds of skills (and combos of skills) that could potentially lead to a PONR if some faction or AIs generally had enough of them relative to everyone else. (I sorta start such a list in the post). Next, we separately consider each skill and come up with a metric for it.
I'm not sure I understand your proposed methodology fully. Are you proposing we do something like Roodman's model to forecast TAI and then adjust downwards based on how much sooner we think the PONR could come? Unfortunately, I think GWP growth can't be forecast that accurately, since it depends on increases in AI capabilities.
Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead.
Have you considered how this model changes under the low hanging fruit hypothesis, by which I mean that more advanced ideas in a domain are more difficult and time-consuming to discover than the less advanced ones? My reasoning for why it matters:
Now this doesn't actually change the underlying intuition of a time advantage very much; mostly I just expect that the '10x faster innovation' component of the example will be deeply discontinuous. This leads naturally to thinking about things like a broad DSA, which might consist of a systematic advantage across capabilities, versus a tall DSA, which would be more like an overwhelming advantage in a single, high-import capability.
I haven't specifically tried to model the low hanging fruit hypothesis, but I do believe the hypothesis and so it probably doesn't contradict the model strongly. I don't quite follow your reasoning though--how does the hypothesis make discontinuities more likely? Can you elaborate?
Sure!
I have a few implicit assumptions that affect my thinking:
The real work is being done by an additional two assumptions:
So under my model, the core mechanism of differentiation is that developing an insurmountable single capability advantage competes with rapid gains in a different capability (or line of ideas), which includes innovation capacity. Further, different lines of ideas and capabilities will have different development speeds.
Now a lot of this differentiation collapses when we get more specific about what we are comparing, e.g. if we compare Google, Facebook, and Microsoft on the single capability of deep learning. It is worth considering that software has an unusually cheap transfer of ideas to capability, which is the crux of why AI weighs so heavily as a concern. But this is unique to software for now, and in order to be a strategic threat it has to cash out in non-software capability eventually, so keeping the other capabilities in mind feels important.
OK, so if I'm getting this correctly, the idea is that there are different capabilities, and the low hanging fruit hypothesis applies separately to each one, and not all capabilities are being pursued successfully at all times, so when a new capability starts being pursued successfully there is a burst of rapid progress as low-hanging fruit is picked. Thus, progress should proceed jumpily, with some capabilities stagnant or nonexistent for a while and then quickly becoming great and then levelling off. Is this what you have in mind?
That is correct. And since different players start with different capabilities and are in different local environments under the soft takeoff assumption, I really can't imagine a scenario where everyone winds up in the same place (or even tries to get there - I strongly expect optimizing for different capabilities depending on the environment, too).
OK, I think I agree with this picture to some extent. It's just that if things like taking over the world require lots of different capabilities, maybe jumpy progress in specific capabilities distributed unevenly across factions all sorta averages out thanks to law of large numbers into smooth progress in world-takeover-ability distributed mostly evenly across factions.
Or not. Idk. I think this is an important variable to model and forecast, thanks for bringing it up!
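If it helps, here is a minimal sketch of the averaging-out intuition in Python; the 50 capabilities and the burst timing and steepness ranges are arbitrary numbers picked for illustration, not anything from the discussion above.

```python
# A toy illustration (not anyone's considered model): many capabilities, each
# advancing in its own abrupt burst at a random time, can still sum to fairly
# smooth aggregate progress.
import numpy as np

rng = np.random.default_rng(0)
T = np.linspace(0, 10, 500)        # years
n_capabilities = 50

def burst(t, start, steepness):
    """Logistic burst: near zero before `start`, rapid progress, then a plateau."""
    return 1.0 / (1.0 + np.exp(-steepness * (t - start)))

starts = rng.uniform(0, 10, n_capabilities)      # when each capability 'unlocks'
steepness = rng.uniform(2, 6, n_capabilities)    # how abrupt each burst is

individual = np.array([burst(T, s, k) for s, k in zip(starts, steepness)])
aggregate = individual.sum(axis=0)               # crude proxy for overall 'takeover-ability'

# Largest single-step jump, as a fraction of each curve's full range:
print("jumpiest single capability:", np.diff(individual, axis=1).max())
print("aggregate of 50 capabilities:", np.diff(aggregate).max() / n_capabilities)
```

With only a handful of capabilities, or with one capability dominating the sum, the aggregate stays jumpy, which is roughly where the "Or not" above comes in.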
But realistically not all projects will hoard all their ideas. Suppose instead that for the leading project, 10% of their new ideas are discovered in-house, and 90% come from publicly available discoveries accessible to all. Then, to continue the car analogy, it’s as if 90% of the lead car’s acceleration comes from a strong wind that blows on both cars equally. The lead of the first car/project will lengthen slightly when measured by distance/ideas, but shrink dramatically when measured by clock time.
The upshot is that we should return to that table of factors and add a big one to the left-hand column: Leads shorten automatically as general progress speeds up, so if the lead project produces only a small fraction of the general progress, maintaining a 3-year lead throughout a soft takeoff is (all else equal) almost as hard as growing a 3-year lead into a 30-year lead during the 20th century. In order to overcome this, the factors on the right would need to be very strong indeed.
But won't "ability to get a DSA" be linked to the lead as measured in ideas rather than clock time?
Maybe. My model was a bit janky; I basically assumed DSA-ability comes from clock-time lead, but then also assumed that as technology and progress speed up, the necessary clock-time lead shrinks. And I guesstimated that it would shrink to 0.3 - 3 years. I bet there's a better way, one that pegs DSA-ability to ideas lead... it would be a super cool confirmation of this better model if we could somehow find data confirming that years-needed-for-DSA has fallen in lockstep as ideas-produced-per-year has risen.
A few months after writing this post I realized that one of the key arguments was importantly flawed. I therefore recommend against inclusion in the 2019 review. This post presents an improved version of the original argument, explains the flaw, and then updates my all-things-considered view accordingly.
Improved version of my original argument
I take it that 2, 4, and 5 are the controversial bits. I still stand by 2, and the arguments made for it in my original post. I also stand by 4. (To be clear, it’s not like I’ve investigated these things in detail. I’ve just thought about them for a bit and convinced myself that they are probably right, and I haven’t encountered any convincing counterarguments so far.)
5 is where I made a big mistake.
(Comments on my original post also attacked 5 a lot, but none of them caught the mistake as far as I can tell.)
My big mistake
Basically, my mistake was to conflate leads measured in number-of-hoarded-ideas with leads measured in clock time. Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead.
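To spell out the arithmetic (my notation, not anything from the original post): if the leading project is $\Delta I$ hoarded ideas ahead and the rest of the field gains access to new ideas at a rate of $r$ per year, then

$$\text{clock-time lead} \approx \frac{\Delta I}{r}, \qquad \text{so an } N\text{-year lead requires } \Delta I \approx N \cdot r.$$

Multiplying $r$ by 10 while holding $\Delta I$ fixed cuts the clock-time lead by a factor of 10.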
Here’s a toy model, based on the one I gave in the original post:
- There are some projects/factions. There are many ideas.
- Projects can have access to ideas. Projects make progress, in the form of discovering (gaining access to) ideas.
- For each idea they access, they can decide to hoard or not-hoard it. If they don’t hoard it, it becomes accessible to all.
- Hoarded ideas are only accessible by the project that discovered them (though other projects can independently rediscover them).
- The rate of progress of a project is proportional to how many ideas they can access.
Let’s distinguish two ways to operationalize the technological lead of a project. One is to measure it in ideas, e.g. “Project X has 100 hoarded ideas and project Y has only 10, so Project X is 90 ideas ahead.” But another way is to measure it in clock time, e.g. “It’ll take 3 years for project Y to have access to as many ideas as project X has now.”
Suppose that all projects hoard all their ideas. Then the ideas-lead of the leading project will tend to lengthen: the project begins with more ideas, so it makes faster progress, so it adds new ideas to its hoard faster than others can add new ideas to theirs. However, the clocktime-lead of the leading project will remain fixed. It’s like two identical cars accelerating one after the other on an on-ramp to a highway: the distance between them increases, but if one entered the ramp three seconds ahead, it will still be three seconds ahead when they are on the highway.
But realistically not all projects will hoard all their ideas. Suppose instead that for the leading project, 10% of their new ideas are discovered in-house, and 90% come from publicly available discoveries accessible to all. Then, to continue the car analogy, it’s as if 90% of the lead car’s acceleration comes from a strong wind that blows on both cars equally. The lead of the first car/project will lengthen slightly when measured by distance/ideas, but shrink dramatically when measured by clock time.
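To make that second case concrete, here is a minimal simulation of it; the growth rate, step size, starting stock of public ideas, initial 3-year lead, and 20-year horizon are all arbitrary choices for illustration, not estimates of anything.

```python
# Toy simulation of the 10%-in-house case above. All parameters are arbitrary
# illustrative choices, not estimates.
import math

g = 0.3                     # ideas discovered per accessible idea per year, field-wide
dt = 0.01                   # years per simulation step
initial_clock_lead = 3.0    # the leader starts 3 years ahead

public = 1000.0                                          # ideas accessible to everyone
hoard = public * (math.exp(g * initial_clock_lead) - 1)  # leader's private stock, sized to a 3-year lead

def clock_lead(public, hoard):
    # Years until the public pool, growing at rate g, matches the leader's current total.
    return math.log((public + hoard) / public) / g

print(f"t=0:  ideas lead ~{hoard:.0f}, clock-time lead ~{clock_lead(public, hoard):.2f} years")
for _ in range(int(20 / dt)):
    new_public = g * public * dt                # progress published by the rest of the field
    new_hoarded = (0.1 / 0.9) * new_public      # leader's in-house (hoarded) share: 10% of its new ideas
    public += new_public
    hoard += new_hoarded
print(f"t=20: ideas lead ~{hoard:.0f}, clock-time lead ~{clock_lead(public, hoard):.2f} years")
```

Running this, the hoard (the ideas lead) grows from roughly 1,500 to tens of thousands of ideas, while the clock-time lead collapses from 3 years to a few months. In the all-hoard case of the previous paragraph, by contrast, both sides grow off their own stock at the same rate, so the 3-year clock-time lead would stay fixed.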
The upshot is that we should return to that table of factors and add a big one to the left-hand column: Leads shorten automatically as general progress speeds up, so if the lead project produces only a small fraction of the general progress, maintaining a 3-year lead throughout a soft takeoff is (all else equal) almost as hard as growing a 3-year lead into a 30-year lead during the 20th century. In order to overcome this, the factors on the right would need to be very strong indeed.
Conclusions
My original argument was wrong. I stand by points 2 and 4 though, and by the subsequent posts I made in this sequence. I notice I am confused, perhaps by a seeming contradiction between my explicit model here and my take on history, which is that rapid takeovers and upsets in the balance of power have happened many times, that power has become more and more concentrated over time, and that there are not-so-distant possible worlds in which a single man rules the whole world sometime in the 20th century. Some threads to pull on:
Thanks to Jacob Lagerros for nudging me to review my post and finally get all this off my chest. And double thanks to all the people who commented on the original post!