Vote via reactions (agree, disagree, unsure, etc.) on the following proposition: This post should also go on my Substack.
EDIT: Note that this is currently at +5 agreement, but no one actually used a reaction (the icons available at the bottom right corner). Please use the reactions instead; they are much more useful than the +/-.
This seems like highly relevant (even if odd/disconcerting) information. I'm not sure if it should necessarily get its own post (is this as important as the UK AI Summit or the Executive Order?), but it should certainly get a top-level item in your next roundup at least.
"It is possible that there is, in practice, no middle path. That our only three available choices, as a planet, are ‘build AGI almost as fast as possible, assume alignment is easy on the first try and that the dynamics that arise after solving alignment can be solved before catastrophe as well,’ ‘build AGI as fast as possible knowing we will likely die because AGI replacing humans is good actually’ or ‘never build a machine in the image of a human mind.’"
I think we are largely in agreement, except that I think this scenario is by far the likeliest. Whether or not a middle path exists in theory, I see no way to it in practice. I don't know what level of risk justifies outright agitation for Butlerian jihad, smash all the microchips and execute anyone who tries to make them, but it's a good deal lower than 50%.
There is a good reminder at the beginning that existential risk is about what happens eventually, not about the scale of catastrophes. So for example a synthetic pandemic that kills 99% of the population doesn't by itself fall under existential risk, since all else equal recovery is possible, and it's unclear how the precedent of such a catastrophe moves existential risk going forward. But a permanent AI-tool-enforced anti-superintelligence regime without any catastrophe does fit the concern of existential risk.
The alternative to a high probability of AI killing everyone for which plausible arguments exist involves AI sparing everyone. This is still an example of existential risk, since humanity doesn't get the future; we only get the tiny corner the superintelligences allocate to our welfare. In this technical sense of existential risk, it's coherent to hold a position where the chance of existential doom is 90% while the chance of human extinction is only 30%.
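Spelled out, as an illustrative decomposition consistent with those example numbers (the 60% disempowerment term is my arithmetic, not the comment's):

$$P(\text{existential doom}) = P(\text{extinction}) + P(\text{disempowered, not extinct}) = 0.30 + 0.60 = 0.90$$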
We will hopefully be fine either way, but I think I would like the AI before some radical biotech revolution. If you think about it, if you first get some sort of super-advanced synthetic biology, that might kill us. But if we're lucky, we survive it. Then, maybe you invent some super-advanced molecular nanotechnology, that might kill us, but if we're lucky we survive that. And then you do the AI. Then, maybe that will kill us, or if we're lucky we survive that and then we get to utopia.
Well, then you have to get through sort of three separate existential risks--first the biotech risks, plus the nanotech risks, plus the AI risks, whereas if we get AI first, maybe that will kill us, but if not, we get through that, then I think that will handle the biotech and nanotech risks, and so the total amount of existential risk on that second trajectory would sort of be less than on the former.
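To make the arithmetic of that comparison explicit, here is a minimal sketch, assuming the three risks are independent and that surviving aligned AI then neutralizes the biotech and nanotech risks (my simplification, not Bostrom's exact framing):

$$P(\text{survive} \mid \text{bio, then nano, then AI}) = (1-p_{\text{bio}})(1-p_{\text{nano}})(1-p_{\text{AI}})$$

$$P(\text{survive} \mid \text{AI first, AI handles the rest}) \approx 1-p_{\text{AI}}$$

Since $(1-p_{\text{bio}})(1-p_{\text{nano}}) \le 1$, the AI-first trajectory carries no more total risk under these assumptions; the whole argument turns on whether aligned AI really would handle the other risks.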
I see the optimal trajectory as us going through pretty much ANY other "radical" revolution before AI, with maybe the exception of uploading or radical human enhancement. None of the 'radical revolutions' I can imagine is the phase shift of AGI. They seem more akin to "amped up" versions of revolutions we've already gone through, and so in some sense more "similar" and "safer" than what AGI would do. Thus I think these are better practice for us as a society...
On a different note, the choice between being overcautious and undercautious is super easy. We REALLY want to overshoot rather than undershoot. If we overshoot, we have a thousand years to correct that. If we undershoot and fail at alignment, we all die and there's no correcting that... We have seen so many social shifts over the last 100 years that there's little reason to believe we'd be 'stuck' without AGI forever. It's not a zero chance, but it certainly seems way lower than the chance of AGI being unaligned.
>The fourth thing Bostrom says is that we will eventually face other existential risks, and AGI could help prevent them. No argument here, I hope everyone agrees, and that we are fully talking price.
>It is not sufficient to choose the ‘right level of concern about AI’ by turning the dial of progress. If we turn it too far down, we probably get ourselves killed. If we turn it too far up, it might be a long time before we ever build AGI, and we could lose out on a lot of mundane utility, face a declining economy and be vulnerable over time to other existential and catastrophic risks.
I feel that it's worth pointing out that for almost all X-risks other than AI, while AI could solve them, there are also other ways to solve them that are not in and of themselves X-risks; thus, when talking price, only the marginal gain from using AI should be considered.
In particular, your classic "offworld colonies" solve most of the risks. There are two classes of thing where this is not foolproof:
Still, your grey-goo problem and your pandemic problem are fixed, which makes the X-risk "price" of not doing AI a lot less than it might look.
I don't think all AI regulation is harmful, but I think almost all "advocacy" is harmful. Increasing the salience of AI Doom is going to mean making it a partisan issue. For the moment, this means that the left is going to want to regulate AI bias and the right is going to want to build AI faster than China.
I think the correct approach is more akin to Secret Congress, the idea that bipartisan deals are possible by basically doing things everyone agrees on without publicly broadcasting them.
There is also the possibility of the parties competing over it to avoid looking "soft on AI", which is of course the ideal.
To the extent that AI X-risk has the potential to become partisan, my general impression is that the more likely split is Yuddite-right vs. technophile-left. Note that it was a Fox News reporter who put the question to the White House Press Secretary following Eliezer's TIME article, and a Republican (John Kennedy) who talked about X-risk in the Senate hearing in May, while the Blue-Tribe thinkpieces typically take pains to note that they think X-risk is science fiction.
As a perennial nuclear worrier, I should mention that while any partisan split is non-ideal, this one's probably preferable to the reverse insofar as a near-term nuclear war would mean the culture war ends in Red Tribe victory.
[Editor’s Note: This post is split off from AI #38 and only on LessWrong because I want to avoid overloading my general readers with this sort of thing at this time, and also I think it is potentially important we have a link available. I plan to link to it from there with a short summary.]
Nick Bostrom was interviewed on a wide variety of questions on UnHerd, primarily on existential risk and AI. I found it thoughtful throughout. In it, he spent the first 80% of the time talking about existential risk. Then in the last 20% he expressed the concern that it was unlikely but possible we would overshoot our concerns about AI and never build AGI at all, which would be a tragedy.
How did those who would dismiss AI risk and build AGI as fast as possible react?
About how you would expect. This is from a Marginal Revolution links post.
The next link in that post was to the GPT-infused version of Rohit Krishnan's book about AI, entitled Creating God (should I read it?).
What exactly changed? Tyler links to an extended tweet from Jordan Chase-Young, mostly a transcript from the video, with a short introduction.
In other words, Nick Bostrom previously focused on the fact that AI might kill everyone, thought that was bad actually, and attempted to prevent it. But now the claim is that Bostrom regrets this - he repented.
The context is that Peter Thiel, who warns that those warning about existential risk have gone crazy, has previously on multiple occasions referred seemingly without irony to Nick Bostrom as the Antichrist. So perhaps now Peter and others who agree will revise their views? And indeed, there was much ‘one of us’ talk.
Frequently those who warn of existential risk from AI are told they are saying something religious, are part of a cult, or are pattern matching to the Christian apocalypse, usually as justification for dismissing our concerns without argument.
The recent exception on the other side that proves the rule was Byrne Hobart, author of the excellent blog The Diff, who unlike most concerned about existential risk is explicitly religious and gave a talk about this at a religious conference. Then Dr. Jonathan Askonas, who gave a talk as well, notes he is an optimist skeptical of AI existential risk, and also draws the parallels, and talks about ‘the rationality of the Antichrist’s agenda.’
Note who actually uses such language, and both the symmetries and asymmetries.
Was Jordan’s statement a fair description of what was said by Bostrom?
Mu. Both yes and no would be misleading answers.
His statement is constructed so as to imply something stronger than is present. I would not go so far as to call it ‘lying’ but I understand why so many responses labeled it that. I would instead call the description highly misleading, especially in light of the rest of the podcast and sensible outside context. But yes, under the rules of Bounded Distrust, this is a legal move one can make, based on the text quoted. You are allowed to be this level of misleading. And I thank him for providing the extended transcript.
Similarly and reacting to Jordan, here is Louis Anslow saying Bostrom has ‘broken ranks,’ and otherwise doing his best to provide a maximally sensationalist reading (scare words in bold red!) while staying within the Bounded Distrust rules. Who are the fearmongers, again?
Jordan Chase-Young then quotes at length from the interview, bold is his everywhere.
To avoid any confusion, and because it was a thoughtful discussion worth reading, I will quote the whole section he quoted, and recommend those interested read (or listen to) the whole thing.
I will also note that the chosen title for this talk, ‘Nick Bostrom: How AI Will Lead to Tyranny,’ seems to go off the rails in the other direction and is very much not a central description, while again being something he does mention as a possibility. There is a section where Bostrom discusses that AI could enshrine permanent power structures, including a tyranny, and make surveillance more effective, but he is not saying it will lead to tyranny, nor is that discussion central to the interview.
Headlines are often atrocious even when the article or discussion is quite good.
What Bostrom Centrally Said Was Mostly Not New or Controversial
Nick Bostrom says four central things in the quoted text.
Only the third point is not common among the people I know concerned about existential risk. As stated by Bostrom I think you’d get near universal agreement on #4 - I’d go so far as to say that those who don’t agree aren’t thinking reasonably about this. #1 and #4 are periodically affirmed by for example Eliezer Yudkowsky and the other primaries at MIRI, and I can’t actually think of anyone who explicitly disagrees with either proposition. In case I haven’t done so recently enough, I affirm them.
Point #2 is a matter of probability. Bostrom would never have said 0% here, nor would anyone else thinking clearly, although you might think - and I very much do think - that if the choices are ‘never build a machine in the image of a human mind’ and ‘chances are very high that soon no human will ever again exist in this universe’ I am going to pick Box A.
Indeed, until very recently, talk was more ‘why are you trying to prevent us from building AGI as quickly as possible, that’s impossible,’ which is a strictly easier task than never building it ever, and people like Yudkowsky going ‘yeah, looks very hard, we look pretty doomed on that one, but going to try anyway.’ Many, including Tyler Cowen, have essentially argued that preventing AGI for very long is impossible, the incentives work too strongly against it.
And I continue to think that is a highly reasonable position, that might well be true. Getting everyone to pause seems incredibly hard, and maintaining that indefinitely seems also incredibly hard. But yes, some probability. More than 1%. If we presume that AGI is not so technically difficult, and that we don’t otherwise blow up our civilization for at least a few decades, I’d say less than 10%.
Responses Confirming Many Concerned About Existential Risk Mostly Agree
Here’s one response.
Here’s Dan Elton, who is in a similar place to Bostrom.
More than that, we highly value voicing disagreement on such questions.
To those who would weaponize such statements as Bostrom’s, rather than join into dialogue with them, I would say: You are not making this easy.
My guess is that the crux of the disagreement between Bostrom and Bensinger, in which I mostly agree with Bensinger, is a disagreement about the necessary level of concern to get the proper precautions actually taken. Bostrom says somewhat higher, Bensinger would say much higher and much more precise. This is based most importantly on differences in how hard it will be to actually stop AI development, secondarily on Bensinger having a very high p(doom | AGI soon).
There is also a disagreement I model as being caused by Bostrom being a philosopher used to thinking in terms of lock-in and permanent equilibria - he thinks we might lock ourselves into no AGI ever through fear, and it could well stick indefinitely. I see a lot of similar cultural lock-in arguments in other longtermist philosophy (e.g. Toby Ord and Will MacAskill) and I am skeptical of such long-term path dependence more generally. It also seems likely Bostrom thinks we are ‘on the clock’ more than Bensinger does, due to other existential risks and the danger of civilizational collapse. This is a reason to be more willing to risk undershoot to prevent overshoot.
I also think that Bensinger has gotten the sense that Bostrom’s update is much bigger than it actually was, exactly because of the framing of this discussion. Bostrom says there is a small possibility of this kind of overshoot.
Quoted Text in Detail
First, Bostrom echoes a basic principle almost everyone agrees with, including Eliezer Yudkowsky, who has said it explicitly many times. I agree as well.
The second thing Bostrom said is that there is a small danger we might overshoot, and indeed not create AI, and we should try to avoid that.
I often get whiplash between the ‘AI cannot be stopped and all your attempts to do so only at most slow things down and thereby make everything worse in every way’ and ‘AI could be stopped rather easily, we are in danger of doing that if we worry too much, and that would be the biggest tragedy possible, so we need to move as fast as possible and never worry about the risks lest that happen.’
And yes, some people will switch between those statements as convenient.
This is the Dial of Progress again: the danger that we are incapable of any nuance, here within AI. And yes, I worry about this too. Perhaps all we have, at the end of the day, is the wrecking ball. I will keep fighting for nuance. But if ultimately we must choose, and all we have is the wrecking ball, we do not have the option to not swing it. ‘Goldilocks level of feeling’ is not something our civilization does well.
If it’s actually never, then yes, that is tragic. But I say far less tragic than everyone dying. I once again, if forced to choose, choose Box A. To not make a machine in the image of a human mind. There is some price or level of risk that gets me to choose Box B, it is more than 2%, but it is far less than both 50% and my current estimate of the risks of going forward under the baseline scenario, should we succeed at building AGI.
Perhaps you care to speak directly into the microphone and disagree.
I strongly agree overshoot is looking a lot more possible now than months ago.
So to be clear, Nick Bostrom continues to think we are insufficiently concerned now, but is worried we might have an overshoot if things go too far, as is confirmed next.
Again, yes, many such cases. What changed for Bostrom is that he did not previously believe there was enough chance we would overshoot to be worth worrying about. Now he thinks it is big enough to consider, and a small possibility of a very bad thing is worth worrying about. Quite so.
I asked GPT-4, which said this is an expansion of his previous position from Superintelligence, adding nuance, but it does not contradict it, and could not recall any comments by Bostrom on that question at all.
To be clear, yes, he now says the third thing, that he regrets the focus of his work, although it does not seem from his other beliefs like he should regret it?
That sounds like it was right to raise the level of concern in 2014, and right up until at least mid-2023? I am confused.
The Broader Podcast Context
If one listens to the full context, that which is scarce, one sees a podcast whose first ~80% was almost entirely focused on Bostrom warning about various issues of existential risk from AI. The quoted text was the last ~20% of the podcast. That does not seem like someone that regretful about focusing on that issue.
Around 9:50 Bostrom notes that he still expects fast takeoff, at least relative to general expectations.
At about 12:30 he discusses the debate about open source, noting that any safeguards in open source models will be removed.
Around 14:00 he says AI will increase the power of surveillance by a central power, including over what people are thinking.
Around 17:00 he discusses the potential of AI to reinforce power structures including tyrannical ones.
Around 20:00 he introduces the alignment problem and attempts to explain it.
Around 22:00 they briefly discuss the clash between Western liberal values and the utilitarianism one would expect in an AI, and Bostrom pivots back to talk more about why alignment is hard.
Around 26:00 Flo Read raises the concern about the powerful getting superintelligence first and taking control, then asks about military applications. Bostrom seems not to be getting through the point that the central threat isn’t that power would go to the wrong people.
I worry that much of the discussion was simultaneously covering a lot of basic territory, with explanations too dense and difficult for those encountering it for the first time. It was all very good, but also very rushed.
I do think this did represent a substantial shift in emphasis from this old Q&A, where his response to whether we should build an AI was ‘not any time soon,’ but he does still posit ideas like the long reflection and endorse building the AI once we know how to do so safely.
A Call for Nuance
Taking it all together, it seems to me that Bostrom:
Which all seems great? Also highly miscategorized by all but one of the responses I saw from those who downplay existential risk. The exception was the response from Marc Andreessen, which was, and I quote in its entirety, “FFS.” Refreshingly honest and straightforward for all of its meanings. Wastes no words, 10/10, no notes. He gives no quarter, accepts no compromise.
This all also conflates worries about existential risk with more general FUD about AI, which again is the worry that there is no room for such nuance, that one cannot differentially move one without the other. But Bostrom himself shows that this is indeed possible. Who can doubt that the world without Bostrom would have reduced existential risk concern, and proportionally far less reduction in concerns about AI in general or about mundane harms?
It is coherent to say that while we have not overshot on general levels of AI worry yet, the natural reaction to growing capabilities, and the social and political dynamics involved, will themselves raise the concern level, and that on the margin pushing towards more concern now could be counterproductive. I presume that is Bostrom’s view.
I agree that this dynamic will push concern higher. I disagree that we are on track for sufficient concern, and I definitely disagree that we would be on such track if people stopped pushing for more concern.
I especially worry that the concerns will be the wrong concerns. That we will thus take the wrong precautions, rather than the right ones. But the only way to stop that is to be loud about the right concerns, because the ‘natural’ social and political forces will focus on (often real but) non-central, and thus essentially wrong, concerns.
Again, nuance is first best and I will continue to fight for that.
One way of thinking about this is that the ideal level of concern, C, is a function of the level of nuance N and the quality of response Q. As N and Q go up, C goes down, both because the concern actually required goes down - we’re more likely to get this right - and also because we get to substitute out of blunt reactions and thus require less concern to take necessary countermeasures.
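As a toy formalization of that claim (my notation, purely illustrative):

$$C = f(N, Q), \qquad \frac{\partial f}{\partial N} < 0, \qquad \frac{\partial f}{\partial Q} < 0$$

The concern required is decreasing in both nuance and response quality: better-targeted responses both reduce the underlying danger of getting this wrong and substitute for the blunt instruments that would otherwise need more alarm behind them.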
Here's Rob Miles, nailing it.
Thus, if you want to lower the level of concern and reduce attempts to increase concern? Fight with me for more and better nuance, for responses that might actually work, and for strong implementation details.
The Quoted Text Continued
The fourth thing Bostrom says is that we will eventually face other existential risks, and AGI could help prevent them. No argument here, I hope everyone agrees, and that we are fully talking price.
That seems reasonable. This is a good preference, and would be of greater concern if we seemed close to either of those things, and we had to, as they say above, pick our poison. And talk price. There are those (such as Scott Aaronson) who have said outright they think the price here plausibly justifies the path, that the existential risks of waiting exceed those of not waiting. I strongly disagree. I also expect for what it is worth that it will be decades, unless the risks here are enabled by AI, before I am more worried about nanotechnology or synthetic biology than about nuclear war.
My model here is that current levels of AI raise the risks of these additional dangers rather than lower them, and I expect this to continue roughly until we reach AGI, at which point things get weird and it could go either way, and it mostly doesn’t much matter because we have either bigger problems or good solutions, so effectively that lowers the cost of such risks dramatically.
Conclusion
Nuance is one key to our survival.
It is not sufficient to choose the ‘right level of concern about AI’ by turning the dial of progress. If we turn it too far down, we probably get ourselves killed. If we turn it too far up, it might be a long time before we ever build AGI, and we could lose out on a lot of mundane utility, face a declining economy and be vulnerable over time to other existential and catastrophic risks.
A successful middle path is only viable if we can choose interventions that are well-considered and well-targeted, that actually prevent sufficiently capable AI from being created until we know how to do so safely and navigate what comes next, while also allowing us to enjoy the benefits of mundane AI, and that don’t shut the door to AI completely. Bostrom says such a path is unlikely but possible. It still seems to me to require some things that seem very hard to pull off, even with much larger levels of fear and hatred of AI.
Regulations often find a way to do exactly the wrong thing, as accelerationists commonly point out, but in this case also sometimes lobby for and directly aim at - letting frontier model development continue as fast as possible, while all regulations that do exist take away mundane utility. And there will over time be increasing temptation and ability to train a more powerful AI anyway. So what to do?
Right now, we do not know how to stop AI even bluntly, given the coordination required to do so and all the incentives that must be overcome. We certainly have no idea how to find a middle path. What does that world even look like? No, seriously, what does that timeline look like? How does it get there, what does it do? How does it indefinitely maintain sufficient restraints on the actually dangerous AI training runs and deployments without turning everyone against AI more generally?
I don’t know. I do know that the only way to get there is to take the risks seriously, look at them on a technical level, figure out what it takes to survive acceptably often, and plot the best available path through causal space given all the constraints. Which means facing down the actual arguments, and the real risks. We cannot get there by the right side ‘winning’ a zero-sum conflict.
There are only two ways to react to an exponential such as AI. Too early and too late.
Similarly, there are only two levels of safety precaution when preventing a catastrophe. Too little and too much. If you are not (nontrivially) risking an overreaction, and especially accusations and worries of overreaction, you know you are underreacting. The reverse is also true if the associated costs are so high as to be comparable.
In most of the worlds where we survive, if I am still writing posts at all, I will at various points be saying that many AI regulations and restrictions went too far, or were so poorly chosen as to be negative. Most exceptions are the worlds where we roll the dice in ways I consider very foolish, and then get profoundly lucky. If we get a true ‘soft landing’ and middle path, that would be amazing, but let’s not fool ourselves about how difficult it will be to get there.
It is possible that there is, in practice, no middle path. That our only three available choices, as a planet, are ‘build AGI almost as fast as possible, assume alignment is easy on the first try and that the dynamics that arise after solving alignment can be solved before catastrophe as well,’ ‘build AGI as fast as possible knowing we will likely die because AGI replacing humans is good actually’ or ‘never build a machine in the image of a human mind.’
If that is indeed the case? I believe the choice is clear.