The evidence in this post does not lead me to the conclusion you suggest. It seems consistent with Dario believing in superintelligence, but not preferring to use it to rapidly, radically reshape society (which is good!) and being worried about misuse risks (which is also good). I also find it a bit annoying that you claim he does not have your pointer to superintelligence as if this were self-evident without clarifying either what you believe about "superintelligence" or why you think he's wrong.
> believing in superintelligence, but not preferring to use it to rapidly, radically reshape society
I don't think this is a belief that can be coherently held alongside that preference, and in this case it seems indistinguishable from "not believing in superintelligence".
> I also find it a bit annoying that you claim he does not have your pointer to superintelligence as if this were self-evident without clarifying either what you believe about "superintelligence" or why you think he's wrong.
I did say the post wasn't very well-written, and am slightly annoyed by how much karma it has, but I do provide some color here:
> Several employees confirmed (paraphrasing) that Dario was not as ASI-pilled as they were, and I have yet to hear any employee object that no, Dario does actually expect to live to see strong nanotech and Dyson spheres, and that these concerns are fundamental to how he orients to Anthropic's mission, the potential risks and benefits involved, how to communicate these beliefs to the public, etc.
As for why he's wrong, idk, man, I haven't actually come up with any novel arguments about why vastly smarter-than-human intelligence is possible or about why this implies much crazier shit than the stuff he describes in Machines of Loving Grace. Within-human variation seems quite large; humans are running on extremely constrained, non-optimized hardware and software; our scientific institutions have not managed to figure out relevant details about how extremely common illnesses spread, nor to propagate the details they have figured out; etc., etc.
Your understanding matches mine, based mostly on conversations with Anthropic staff about his beliefs.
Dario has paid substantial political costs by making an enemy of the Trump administration on the subject of chip export controls (and then, later, Acceptable Use Policies, though this one is weaker because failing to hold the line here could plausibly have cost him more in terms of employee morale/turnover than he lost from the DoW's enmity). This is strongly consistent with being quite worried about misuse risks, particularly from authoritarian governments.
This doesn't seem very strongly relevant to whether he believes in qualitative superintelligence. It's definitely the case that he is way more worried about misuse risks than AI takeover.
Yeah, that's mostly trying to point to evidence that he's truly non-trivially worried about misuse risks, which necessarily reduces the share of catastrophic risk likelihood he attributes to misalignment.
Ok. Note that in most groups, worry about misuse risks (for a variety of definitions of misuse risks) is positively correlated rather than negatively correlated with your estimate of misalignment risk and how plausible you think it is that qualitative superintelligence is developed!
Of course you can be worried about multiple things.
If he has 25% on (unconditional) x-risk, whatever share of that is derived from worlds in which we die from "misuse" is therefore not derived from worlds in which we die from misalignment. I guess it's possible that he could be modeling 100% of misuse worlds as also dying to misalignment later in their timelines, so "necessarily" was wrong, but in practice...
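To spell out the arithmetic (toy numbers mine, and assuming doom-worlds are partitioned by proximate cause, which is exactly the assumption that caveat relaxes):

$$P(\text{doom}) = P(\text{doom via misuse}) + P(\text{doom via misalignment}) + \ldots$$

So with $P(\text{doom}) = 0.25$, every point he puts on misuse, say $0.10$, leaves at most $0.15$ for misalignment.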
Oh, I see the confusion. (My fault for finishing this post at 11 pm.)
I think that bullet point requires the same disclaimer as the third bullet point. Might edit that in the morning.
Here is a metaphor; I can't tell if this is insightful to anyone else: if one is worried about defeating Orcs and Ogres, the objective is not to bet on which of them is more dangerous; the objective is to simultaneously beat Orcs and Ogres.
Dario more or less said in his interview with Ross Douthat that he believes in diminishing returns to intelligence because it's bottlenecked by the need to go out and perform experiments to confirm theories.
Edit: His actual quote:
"Amodei: You don’t have to have the full Machine God. And indeed, there are places where I cast doubt on whether the Machine God would be that much more effective at these things than the 100 million geniuses.
I have this concept called the diminishing returns to intelligence. Economists talk about the marginal productivity of land and labor; we’ve never thought about the marginal productivity of intelligence. But if I look at some of these problems in biology, at some level you just have to interact with the world. At some level, you just have to try things. At some level, you just have to comply with the laws or change the laws on getting medicines through the regulatory system. So there’s a finite rate at which these changes can happen.
Now there are some domains, like if you’re playing chess or go, where the intelligence ceiling is extremely high. But I think the real world has a lot of limiters. Maybe you can go above the genius level, but sometimes I think all this discussion of, “Could you use a moon of computation to make an A.I. god?” is a little bit sensationalistic and besides the point, even as I think this will be the biggest thing that ever happened to humanity."
Though later on, he said:
"Another way you could say about it is maybe A.I. itself, if it didn’t have to care about us humans, could just go off to Mars and build all these automated factories and build its own society and do its own thing.
But that’s not the problem we’re trying to solve. We’re not trying to solve the problem of building a Dyson swarm of artificial robots on some other planet. We’re trying to build these systems, not so they can conquer the world, but so that they can interface with our society and improve that society. And there’s a maximum rate at which that can happen if we actually want to do it in a human and humane way."
So maybe he thinks it's not theoretically impossible to have superintelligence in the radical sense, but that in any desirable future, it's not going to be able to make these radical changes as quickly as it otherwise could.
I'm sure this has been discussed elsewhere ad nauseam, but this view always struck me as extremely overconfident: We have no idea what a superintelligence could discover using today's data or experiment throughput capacity, and "close to human level" seems a priori unlikely.
Yeah. Given the level of variance among humans, we know significant variance is possible even within the strict bounds of "minds that run on 3 pounds of meat and 20 watts of sugar, most of which is spent on things other than thought." I find this to be a pretty strong argument that the limits at which we hit diminishing returns should be rather far away.
The other side of my mental model is: if (our best model of) the laws of physics fit on a postcard, what exactly does it mean to need to do an experiment, in principle? You need experiments to nail down the laws. Beyond that, they're convenient for reducing computational requirements, often vastly so, but it's not something that prevents you from getting things right on the first try way more often than humans do.
My charitable musing is that maybe Dario genuinely hasn't met anyone whom he judged as smarter than himself by a wide enough margin to develop this intuition via humans, which I think is a somewhat easier path to really feeling the possibility internally than approaching it abstractly/intellectually.
> if (our best model of) the laws of physics fit on a postcard, what exactly does it mean to need to do an experiment, in principle? You need experiments to nail down the laws. Beyond that, they're convenient for reducing computational requirements
The general point is solid, but you also need experiments to learn contingent things within physics, e.g. how biology works.
Yes. I doubt it would be practical for an ASI to solve biology by simulating the fundamental physics of various 50-100 kg lumps of atoms and seeing what matches available data on humans. I also doubt it would need anywhere remotely close to the number of experiments we need to draw the lessons it needs to solve any particular biology problem.
The laws of physics tell you how things progress, but they don't tell you the starting point. So even if you had the enormous computational power to deduce biology from physics, you would still need to know what a dog is to say something useful about it. So maybe an important question is: would a superintelligence have enough information about Earth and its inhabitants that it would not need any more, because it can just reason from what it has, or would it need to gather more?
I'm sure there is still data it would need to collect. I think it's a mistake to use the amount and type of data humans require as a guide to what that might mean.
I think Dario makes wrong assumptions about the scope of intelligence itself; let's say an agent can improve itself roughly from the level of AlexNet to that of your average coding agent nowadays (e.g. Opus 4.6). The gap between these is staggering; even if there is some upper limit, who's to say it's close?
The human mind itself is much more efficient than 8 H100s; eventually a self-improving agent would top out at that (or become more computationally efficient than us), and by that point I'd argue you couldn't tell the difference between "very superintelligent" and "wildly superintelligent".
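To put a rough number on that gap (my arithmetic, assuming the commonly cited figures of ~700 W per H100 and ~20 W for a human brain):

$$\frac{8 \times 700\ \mathrm{W}}{20\ \mathrm{W}} = 280\times$$

i.e. roughly two orders of magnitude of power efficiency still on the table before an agent even matches the brain, setting aside algorithmic improvements.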
My model of Dario Amodei (and many others) treats superintelligence proper as residing in a separate magisterium, which is importantly different from disbelieving it's possible or near-term likely (shortly after or directly instead of "a country of geniuses in a datacenter"). It's not something he's planning to build, and he feels responsibility for getting the things he's planning to build right. This makes serious theorizing beyond what he's planning to build unvirtuous (in this stance), and perhaps getting too taken with such things seems irresponsible, distracting from the very real and important problems more directly in front of him.
(He might be glad others are working on the other magisterium, and it's virtuous for them to do so. But in this view it's not his area of responsibility, and the lessons they have to teach don't apply to his duties. For the avoidance of doubt in the eyes of others, he'll make sure to distance himself from them. He gets annoyed they're not observing the separation between magisteria, especially when they talk to people relevant to his duties.)
It's the future people and the AIs he's building who have the responsibility to get the longer-term things in order, while the current people need to think about and work on the currently actionable things. AIs doing the alignment homework is not a plan to steer superintelligence (in the stronger senses of the word), not a consequentialist step in a world model that has superintelligence in it. It's just something that will be the responsibility of the world that contains the non-superintelligent AIs he's building, whenever and however that comes up. It's up to that future world to start being consequentialist about superintelligence, not for the current AI companies or governments, which reside in the magisterium of empirically accessible things, while superintelligence proper remains beyond it.
He's expecting to directly build more than others do, and faster than they expect to build even less than that, so there is an impression that he's not making the distinction between the empirically accessible and plannable things, and the things that it would be unvirtuous to take too seriously (enough to take control of the current actions). I think all the heads of AI companies are making this distinction (most blatantly Zuckerberg; though possibly not Musk, who is inhabiting the magisterium of hard science fiction at all times, which doesn't help in this case). They are just drawing the line in different places, at different capability levels, between the AIs in the magisterium of empirically relevant and plannable things, and the AIs in the magisterium of idle and ultimately unactionable worry, if not outright science fiction.
Related: when I had a convo with @Drake Thomas while writing "Anthropic, and taking 'technical philosophy' more seriously", a thing that came up is:
Anthropic epistemic culture is one where you don't make confident claims about speculative things. Your public statements should be things you can back up. (This is part of a more general academic/empiricist culture.)
Whereas LessWrong epistemic culture is more like "take ideas seriously, follow through the implications, hold your guess about it lightly, and do your best to model the consequences."
There's an interpretation of a lot of Dario's commentary where he's, like, doing the LW-style thinking badly, and/or being deceptive about it. But also an interpretation where he's just not trying to do that sort of communication.
...
Meanwhile, @Zac Hatfield-Dodds once said (also as part of preliminary discussions leading to "Anthropic, and taking 'technical philosophy' more seriously"), "yes, obviously building a massively superhuman thing would be extremely dangerous for the reasons you say. So don't do that."
...
My actual guess is that Dario and some other Anthropic founders do earnestly have an alternate set of virtues that make internal sense, and do have some kinds of plans that route through "build a [non-massively-superintelligent] thing they're pretty sure they can figure out how to build and use it to help humanity...
...and also my read is they are following a bunch of local incentives (money, status, "feeling more in-control"), and playing political games, with Cope-y Optimism about how well their plan will work and suspiciously steering away from strategies that would remotely actually compromise the company.
[edit: "remotely" was clearly wrong hyperbole, sorry]
> do have some kinds of plans that route through "build a [non-massively-superintelligent] thing they're pretty sure they can figure out how to build and use it to help humanity
I think they have plans that look like "build the non-massively-superintelligent thing and hope this bails us out somehow, mostly by having it do safety research fast enough (and trying to make sure it has good enough judgment and alignment for this to be fine)". I think to a large degree they think that you (very likely?) don't need huge amounts of progress to align the AIs you get within a year or two of full automation of AI R&D (I disagree).
Idk how much this counts as a plan. I don't expect they've done a bunch of thinking like "Ah, well, XYZ level of capability would suffice for this level of physical tech progress, and that would suffice for slowing things down enough to be fine, and we can probably handle XYZ level of capability".
I don't understand why you express the opinion that I think you're expressing here as
> Dario probably doesn't believe in superintelligence
rather than as
> Dario probably anticipates superintelligence having less impact than I do.
As e.g. Tenoke has said, "a country of geniuses in a datacenter" is, whatever else it may be, definitely something much smarter than a human being.
How do other people use the term? Here's Nick Bostrom, from his book "Superintelligence":
> This chapter identifies three different forms of superintelligence, and argues that they are, in a practically relevant sense, equivalent. [...] As previously indicated, we use the term "superintelligence" to refer to intellects that greatly outperform the best current human minds across many very general cognitive domains. [...] Speed superintelligence: A system that can do all that a human intellect can do, but much faster. [...] Collective superintelligence: A system composed of a large number of smaller intellects such that the system's overall performance across many very general domains vastly outstrips that of any current cognitive system. [...] Quality superintelligence: A system that is at least as fast as a human mind and vastly qualitatively smarter.
The Less Wrong wiki ... actually just quotes Bostrom, in slightly different words from the above:
"An intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills."
The Oxford English Dictionary gives three meanings, none of them quite the one we're after here; I mention it because I looked and don't want to be cherry-picking my sources. Wiktionary is less academic but more up-to-date, and says "Intelligence surpassing the level of a human genius." with a few citations that all roughly match that and don't require that said intelligence confer godlike powers or anything of the kind.
There's probably a definition in Yudkowsky & Soares's recent book but I don't have a copy. I had a look at the transcript of his TED talk from 2023, titled "Will superintelligent AI end the world?"; he doesn't define "superintelligence", but it's there in the title, and the scenario he talks about is: "At some point, the companies rushing to scale AI will cough out something that's smarter than humanity".
All these people are using "superintelligence" to mean some variation on the theme of "something much smarter than we are". Many of them think that such a thing would in fact have vast world-changing impact, but they're not making it part of the definition and I don't understand how it makes sense to say that someone "doesn't believe in superintelligence" merely because their estimate of the likely impact of something much smarter than us is different from yours.
Because I'm trying to advance a claim that Dario's (and, to a substantial degree, Anthropic's) behavior can be more accurately modeled if you take a specific bundle of concepts and beliefs about the world, which I am sloppily calling "superintelligence", and then assume that Dario is much less sold on that bundle than I am. By "much less sold" I mean:
It's possible that the part of the bundle he disagrees with is the "and then the really truly vastly smarter-than-human intelligence can in fact crack strong nanotech faster than a couple of decades" part, in a way that doesn't secretly bottom out in him having a pretty different idea from mine of what "really truly vastly smarter than human" is. If someone wants to come up with a better handle for the idea that vast differences in intelligence can produce magical-seeming outcomes extremely quickly, circumventing things that seemed like fundamental bottlenecks on the speed of progress that humans would be limited to, I'd be happy to use that term in addition to "superintelligence".
Fwiw, I didn't see the quotes as supporting the conclusion that he doesn't believe in superintelligence; I'd want your inference to be spelled out more.
(I track this as a hypothesis with quite substantial weight for roughly all AGI accelerationists.)
> One possibility suggested by this essay is that Dario does not really believe in superintelligence (the hypothesis we are currently examining). Another is that he does, but has chosen to dissemble for strategic purposes. While I don't think Dario is above communicating strategically, I do in fact think this is roughly his mainline worldview, and it follows pretty clearly from his beliefs in 2017. Maybe there are other possibilities apart from those two, though I haven't figured out what they might be.
You have possibly dismissed the "strategic purposes" hypothesis prematurely here.
Saying you want to build "superintelligence" may be really bad for political capital. To me it seems entirely plausible that he has "self-censored" his beliefs around superintelligence since 2017.
Dario spells this out:
> The result often ends up reading like a fantasy for a narrow subculture, while being off-putting to most people.
As far as I know[1], Hassabis has also not explicitly said he aims for superintelligence while being CEO of DeepMind, despite being an early singularitarian.
[1] This is mainly based on asking Gemini 3 Pro, though.
Temporarily assume he was objectively making wrong predictions on this topic, that one had sufficient evidence to show as much, and thus that one could convince him. Would that be a good idea?
And without that assumption: should I update, or should he? Does this mean he knows things you and I don't? Maybe you and I are both wrong to be thinking it's possible, and actually he intuitively understands the landscape of possible AIs in a way we don't. Why or why not?
I will say, I sure think I've encountered ideas that would mean "country of geniuses in a datacenter" is a weirdly early place to stop. But I don't particularly like speculating about cool capabilities ideas in public, in case I have good ones. I might post another hash of some thoughts so I can prove later what I was thinking now.
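(For anyone unfamiliar with the hash-commitment trick being referenced: you publish only a digest now and the text later. A minimal sketch in Python; the message content is a hypothetical placeholder of mine:)

```python
import hashlib

# Kept private for now: the thoughts being committed to.
thoughts = b"(capability ideas, written down privately today)"

# Publish only this hex digest; it reveals nothing practically
# recoverable about the text.
commitment = hashlib.sha256(thoughts).hexdigest()
print(commitment)

# Later: publish `thoughts`. Anyone can recompute sha256(thoughts)
# and check it matches the digest posted earlier.
```

In practice you'd also append a long random nonce to the text before hashing, since a short or guessable message could otherwise be brute-forced from its digest.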
From 2017's "Musings on AI" (35:30):
> AI doesn't have to learn all of human morality.
He follows up at 40:19:
> We may not need to learn all of human morality and build a sovereign, basically build something that decides what the best way to set up human society is. That problem sounds really hard; we may just need to build artifacts that perform particular engineering tasks for us in order to put the world in a better place.
I tracked this down in order to find out what he thinks is sufficient for a good outcome, if not CEV.
My model for Amodei's plan, then, is a "tool AI" scenario, in which the "engineering tasks" to be solved and carried out by the tool AI include some things that are meant to improve the general human condition, and probably a "pivotal act" or two, to ensure that the democratic world remains the leading power in human affairs. (This would be reminiscent of Eric Drexler circa 1986, wanting to ensure that the democracies remain ahead of the socialist bloc.) He may envisage that once this better world is stably established, something like CEV would eventually be solved and give rise to a transhuman civilization in which AIs and humans are on an equal footing; but the important thing is to use tool AI to establish prosperity, democratic politics, etc.
It would be interesting to look at Claude's constitution and Amanda Askell's work from this perspective.
> We could summarize this as a "country of geniuses in a datacenter".
A country of geniuses in a datacenter is pretty clearly "superintelligence", and he pretty clearly believes in it. Rather, he seems to believe that superintelligence wouldn't solve everything quite as quickly as others think.
You can also use more recent sources, e.g. from 2 months ago here, where he discusses the topic with Demis Hassabis; they differ on timelines, how fast it'd achieve things, etc., but clearly they both believe in it.
The interesting accomplishments he expects from that country of geniuses in a data center, in that essay, are not the things you get out of a superintelligence.
Edit: separately, I am not sure what your comment is supposed to be doing. This entire post is about how there are clearly different conceptions of the term "superintelligence", and that, to the extent that he believes in any of them, he certainly doesn't believe in anything like mine. Unless you mean to claim that I am wrong about how the thing he's describing in MoLG is actually compatible with the kind of superintelligence I'm imagining?
> separately, I am not sure what your comment is supposed to be doing.
My comment is a pretty neutral response to the central claim "Dario probably doesn't believe in superintelligence", which you specify you believe in and which isn't just a clickbait headline, and to the arguments for it. Do you react like this to all comments which disagree with you? May I suggest just commenting on the arguments in the comment rather than having such a kneejerk reaction?
> Unless you mean to claim that I am wrong about how the thing he's describing in MoLG is actually compatible with the kind of superintelligence I'm imagining?
Your definition is:
> Roughly speaking, that the returns to intelligence past the human level are large, in terms of the additional affordances they would grant for steering the world, and that it is practical to get that additional intelligence into a system.
As I said, I think "a country of supergeniuses in a datacenter" fits, yes:
> A country of geniuses in a datacenter is pretty clearly "superintelligence", and he pretty clearly believes in it. Rather, he seems to believe that superintelligence wouldn't solve everything quite as quickly as others think.
I'm not sure why we're arguing about whether a country of geniuses in a data center, as described by Dario, would qualify as superintelligence by my definition of it, when I am telling you that it wouldn't. I agree that the definition I gave in my post does not conclusively rule that out, but 1) words are hard, man, 2) the rest of my post really clearly implies that I don't think the thing he's describing counts. So if you have a gripe, it should probably be with my definition. Sorry for being snippy.
Anyways, Dario doesn't believe in my pointer to superintelligence, which I will decline to define further at this hour of the night, but I expect most people here to understand what I mean.
I think this would be very interesting to follow up on (at a more reasonable hour of the day).
So, Dario indeed seems not to believe in a quick ASI takeover, and in this sense his definition does seem to differ from yours.
But the question is how this decomposes into differences on:
1. inherently achievable levels of intelligence
2. inherent resistance of the world to changes induced by super-high levels of intelligence
3. ability to have those super-high levels of intelligence and the presence of necessary affordances for radical changes (including the ASI takeover), but also the ability to agree to voluntarily curtail the extent of those changes (including refraining from a "true takeover")
My guess (which might be incorrect) is that your main differences with Dario’s viewpoint are on 2), and to some extent perhaps on 3), but less so on 1). So I think it’s worth a follow-up.
(Thanks for the post, it’s very interesting.)
Epistemic status: I think the headline claim is true, and that the evidence within is actually quite strong in a Bayesian sense, but don't think the post itself is very well written or particularly interesting. But I had to get 500 words out! I think the 2013 conversation is interesting reading as a piece of history, separate from the top-level question, and recommend reading that.
I think many people have a relationship with Anthropic that is premised on a false belief: that Dario Amodei believes in superintelligence.
What do I mean by "believes" in superintelligence? Roughly speaking, that the returns to intelligence past the human level are large, in terms of the additional affordances they would grant for steering the world, and that it is practical to get that additional intelligence into a system.
There are many pieces of evidence which suggest he does not, going quite far back.
In 2013, Dario was one of two science advisors (along with Jacob Steinhardt) that Holden brought along to a discussion with Eliezer and Luke about MIRI strategy. A transcript of the conversation is here. It is the first piece of public communication I can find from Dario on the subject. Read end-to-end, I don't think it strongly supports my titular claim. However, there is this quote:
This is not the kind of sentence you say if the concept you have loaded in your head is the same concept I have for "superintelligence". I do not think the context particularly rescues it.
In 2016, Dario was first author on Concrete Problems in AI Safety. I understand that this was an academic publication. Nevertheless, I think this passage is suggestive:
The next relevant pieces of evidence are from a panel he was part of at EAG 2017, "Musings on AI" (yt link). Here there are multiple relevant quotes (bolding mine):
Those are not the kinds of sentences you say if you have the same "superintelligence" pointer as me, and you think that is what is actually at the end of the tunnel (as opposed to being a possible but pretty low likelihood outcome).
Those are, admittedly, the sentences of someone who might have the same "superintelligence" pointer as me, but not someone who has asked themselves how to get to a safe point and then stop without accidentally crossing a dangerous threshold (or letting anyone else do so).
And then, alas, we have Machines of Loving Grace (2024).
One possibility suggested by this essay is that Dario does not really believe in superintelligence (the hypothesis we are currently examining). Another is that he does, but has chosen to dissemble for strategic purposes. While I don't think Dario is above communicating strategically, I do in fact think this is roughly his mainline worldview, and it follows pretty clearly from his beliefs in 2017. Maybe there are other possibilities apart from those two, though I haven't figured out what they might be.
The Adolescence of Technology (2026) also contains many relevant details, which I will fail to quote.
Beyond textual evidence, let me include a few other lines of evidence: