Meta: I think these kinds of posts should include some sort of disclaimer acknowledging that you are an OpenAI employee & also mentioning whether or not the post was reviewed by OpenAI staff, OpenAI comms, etc.
I imagine you didn't do this because many people who read this forum are aware of this fact (and it's on your profile; it's not like you're trying to hide it), but I suspect this information could be useful for newcomers who are engaging with this kind of material.
Yeah, this omission felt pretty glaring to me. OpenAI is explicitly aiming to build "the most powerful technology humanity has yet invented." Obviously that doesn't mean Richard is wrong that the AI safety community is too power-seeking, but I would sure have appreciated him acknowledging/grappling with the fact that the company he works for is seeking to obtain more power than any group of people in history by a gigantic margin.
An elephant in the room (IMO) is that moving forward, OpenAI probably benefits from a world in which the AI safety community does not have much influence.
There's a fine line between "play nice with others and be more cooperative" and "don't actually advocate for policies that you think would help the world, and only do things that the Big Companies and Their Allies are comfortable with."
Again, I don't think Richard sat in his room and thought "how do I spread a meme that is good for my company." I think he's genuinely saying what he believes and giving advice that he thinks will be useful to the AI safety community and improve society's future.
But I also think that one of the reasons Richard still works at OpenAI is that he's the kind of agent who genuinely believes things that tend to be pretty aligned with OpenAI's interests, and I suspect his perspective is informed by having lots of friends/colleagues at OpenAI.
Someone who works for a tobacco company can still have genuinely useful advice for the community of people concerned about the health effects of smoking. But I still think it's an important epistemic norm that they add (at least) a brief disclaimer...
I appreciate you adding the note, though I do think the situation is far more unusual than described. I agree it's widely priced in that companies in general seek power, but I think probably less so that the author of this post personally works for a company which is attempting to acquire drastically more power than any other company ever, and that much of the behavior the post describes as power-seeking amounts to "people trying to stop the author and his colleagues from attempting that."
I think that the vast majority of comparative claims (like "the AI Safety community is more X than any other advocacy group") are based more on vibes than facts. Are you sure that the Sierra Club, the Club of Rome, the Mont Pelerin Society, the Fabian Society, the Anti-Defamation League, et cetera are less power-seeking than the AI Safety community? Are you sure that "let's totally reorganize society around ecological sustainability" is less power-seeking than "let's ensure that AGI corporations have management that is not completely blind to the alignment problem"?
Backlash for environmentalism was largely inevitable. The whole point of environmentalism is to internalize externalities in some way, i.e., impose costs of pollution/ecological damage on polluters. Nobody likes to get new costs, so backlash ensues.
My understanding of MIRI's plan was "have a controllable, safe AI that's just powerful enough to take some action that prevents anyone else from building a more powerful and more dangerous AI". I wouldn't call that God-like, or an intention to take over the world. The go-to example [acknowledged as not that plausible] is "melt all the GPUs". Your description feels grossly inaccurate.
Basically any plan of the form "use AI to prevent anyone from building more powerful and more dangerous AI" is incredibly power-grabbing by normal standards: in order to do this, you'll have to take actions that start out as terrorism and then might quickly need to evolve into insurrection (given that the government will surely try to coerce you into handing over control over the AI-destroying systems); this goes against normal standards for what types of actions private citizens are allowed to take.
I agree that "obtain enough hard power that you can enforce your will against all governments in the world including your own" is a bit short of "try to take over the world", but I think that it's pretty world-takeover-adjacent.
I mean, it really matters whether you are suggesting that someone else take that action or whether you are planning to take that action yourself. Asking the U.S. government to use AI to prevent anyone from building more powerful and more dangerous AI is not in any way a power-grabbing action, because it does not in any meaningful way make you more powerful (like, yes, you are part of the U.S., so I guess you end up with a bit more power as the U.S. ends up with more power, but that effect is pretty negligible). Even asking random AI capability companies to do that is also not a power-grabbing action, because you yourself do not end up in charge of those companies as part of that.
Yes, unilaterally deploying such a system yourself would be, but I have no idea what people are referring to when they say that MIRI was planning on doing that (maybe they were, but all I've seen them do is to openly discuss plans about what ideally someone with access to a frontier model should do in a way that really did not sound like it would end up with MIRI meaningfully in charge).
I think they talked explicitly about planning to deploy the AI themselves back in the early days (2004-ish), then gradually transitioned to talking generally about what someone with a powerful AI could do.
But I strongly suspect that in the event that they were the first to obtain powerful AI, they would deploy it themselves or perhaps give it to handpicked successors. Given Eliezer's worldview, I don't think it would make much sense for them to give the AI to the US government (considered incompetent) or to AI labs (negligently reckless).
I think making inferences from that to modern MIRI is about as confused as making inferences from people's high-school essays about what they will do when they become president.
Yeah, but it's not just the old MIRI views, but those in combination with their statements about what one might do with powerful AI, the telegraphed omissions in those statements, and other public parts of their worldview e.g. regarding the competence of the rest of the world. I get the pretty strong impression that "a small group of people with overwhelming hard power" was the ideal goal, and that this would ideally be controlled by MIRI or by a small group of people handpicked by them.
All of that sounds right to me. But this pivot with regards to means isn't much evidence about what Eliezer/MIRI would do if they (as a magical hypothetical) suddenly found themselves with a verifiably-aligned CEV AGI.
I expect that they would turn it on, with the expectation that it would develop a hard power decisive strategic advantage, use that to end the acute risk period, and then proceed to optimize the universe.
Insofar as that's true, I think Oliver's statement above...
and would absolutely definitely not include the ability of whoever builds AGI to just take over the world with it.
...is inaccurate.
MIRI has never said, to my knowledge,
...We used to think that if a small team could build a verifiably-aligned CEV AI, that they should unilaterally turn it on, knowing that that will likely result in the relative disempowerment of many human institutions and existing human leaders. We once planned to do that ourselves.
We now think that was a mistake, not just because building a verifiably-aligned CEV AI is unworkably hard, but because unilaterally seizing a hard power advantage, even in the service of CEV, is an act...
My understanding was there were 4 phases in which the Singularity Institute / MIRI had 4 different plans.
I think the plan implies having the capability such that, if you wanted to, you could take over the world, but having the power to do something and actually doing it are quite different. When you say "MIRI wanted to take over the world", the central meaning that comes to mind for me is "take over all the governments, be in charge of all the laws and decision-making, be world dictator, take possession of all the resources", and probably also "steer humanity's future in a very active way". That is very, very much not their intention, and if someone goes around saying MIRI's plan was to take over the world without any clarification, leaving the reader to think the above, then I think they're being very darn misleading.
Just block all AI projects and maintain your ability to continue doing so in the future via maintaining military supremacy.
That to me is a very very non-central case of "take over the world", if it is one at all.
This is about "what would people think when they hear that description" and I could be wrong, but I expect "the plan is to take over the world" summary would lead people to expect "replace governments" level of interference, not "coerce/trade to ensure this specific policy" - and there's a really really big difference between the two.
I think this whole debate is missing the point I was trying to make. My claim was that it's often useful to classify actions which tend to lead you to having a lot of power as "structural power-seeking" regardless of what your motivations for those actions are. Because it's very hard to credibly signal that you're accumulating power for the right reasons, and so the defense mechanisms will apply to you either way.
In this case MIRI was trying to accumulate a lot of power, and claiming that they were aiming to use it in the "right way" (do a pivotal act) rather than the "wrong way" (replacing governments). But my point above is that this sort of claim is largely irrelevant to defense mechanisms against power-seeking.
(Now, in this case, MIRI was pursuing a type of power that was too weird to trigger many defense mechanisms, though it did trigger some "this is a cult" defense mechanisms. But the point cross-applies to other types of power that they, and others in AI safety, are pursuing.)
The AI safety community is structurally power-seeking.
I don't think the set of people interested in AI safety is even a "community", given how diverse it is (Bengio, Brynjolfsson, Song, etc.), so I think it'd be more accurate to say "the Bay Area AI alignment community is structurally power-seeking."
I am kinda confused by these comments. Obviously you can draw categories at higher or lower levels of resolution. Saying that it doesn't make sense to put Lightcone and MIRI in the same bucket as Constellation and OpenPhil, or Bengio in the same bucket as the Bay Area alignment community, feels like... idk, like a Protestant Christian saying it doesn't make sense to put Episcopalians and Baptists in the same bucket. The differences loom large for insiders but are much smaller for outsiders.
You might be implicitly claiming that AI safety people aren't very structurally power-seeking unless they're Bay Area EAs. I think this is mostly false, and in fact it seems to me that people often semi-independently reason themselves into power-seeking strategies after starting to care about AI x-risk. I also think that most proposals for AI safety regulation are structurally power-seeking, because they will make AI safety people arbitrators of which models are allowed (implicitly or explicitly). But a wide range of AI safety people support these (and MIRI, for example, supports some of the strongest versions of these).
I'll again highlight that just because an action is structurally power-seeking doesn't make it a bad idea. It just means that it comes along with certain downsides that people might not be tracking.
I don't know, I think I'll defend that Lightcone is genuinely not very structurally power-seeking, and neither is MIRI, and also that both of these organizations are not meaningfully part of some kind of shared power-base with most of the EA AI Alignment community in Berkeley (Lightcone is banned from receiving any kind of OpenPhil funding, for example).
I think you would at least have to argue that there are two separate power-seeking institutions here, each seeking power for themselves, but I also do genuinely think that Lightcone is not a very structurally power-seeking organization (I feel a bit more confused about MIRI, though would overall still defend that).
Suppose I'm an atheist, or a muslim, or a jew, and an Episcopalian living in my town came up to me and said "I'm not meaningfully in a shared power-base with the Baptists. Sure, there's a huge amount of social overlap, we spend time at each other's churches, and we share many similar motivations and often advocate for many of the same policies. But look, we often argue about theological disagreements, and also the main funder for their church doesn't fund our church (though of course many other funders fund both Baptists and Episcopalians)."
I just don't think this is credible, unless you're using a very strict sense of "meaningfully". But at that level of strictness it's impossible to do any reasoning about power-bases, because factional divides are fractal. What it looks like to have a power-base is to have several broadly-aligned and somewhat-overlapping factions that are each seeking power for themselves. In the case above, the Episcopalian may legitimately feel very strongly about their differences with the Baptists, but this is a known bug in human psychology: the narcissism of small differences.
Though I am happy to agree that Lightcone is one of the least structurally power-seeking entities in the AI safety movement, and I respect this. (I wouldn't say the same of current-MIRI, which is now an advocacy org focusing on policies that strongly centralize power. I'm uncertain about past-MIRI.)
I think you're making an important point here, and I agree that given the moral valence people here will be quite tempted to gerrymander themselves out of the relevant categories (also, pretending to be the underdog, or participating in bravery debates, is an extremely common pattern in conversations like this).
I do agree that a few years ago things would have been better modeled as a shared power base, but I think a lot of this has genuinely changed post-FTX.
I also think there are really crucial differences in how much different sub-parts of this ecosystem are structurally power-seeking, and that those are important to model (and also, importantly, that the structural power-seeking-ness of some of these parts puts those parts into conflict with the others, in as much as they are not participating in the same power-seeking strategies).
Like, the way I have conceptualized most of my life's work so far has been to try to build neutral non-power-seeking institutions, that inform other people and help them make better decisions, and that generally try to actively avoid plans that route through "me and my friends get powerful and then solve our problems" because I think this kind...
I do think that modeling the AI Safety space as a single power-base is wrong and not really carving reality along structural lines.
This is the thing that feels most like talking past each other. You're treating this as a binary and it's really, really not a binary. Some examples:
I'm not denying that there are crucial differences to model here. But this just seems like the wrong type of argument to use to object to accusations of gerrymandering, because every example of gerrymandering will be defended with "here are the local differences that feel crucial to me".
So how should we evaluate this in a principled way? One criterion: how fierce is the internal fighting? Another: how many shared policy prescriptions do the different groups have? On the former, ...
This is the thing that feels most like talking past each other. You're treating this as a binary and it's really, really not a binary. Some examples:
Yeah, I think this makes sense. I wasn't particularly trying to treat it as just a binary, and I agree that there are levels of abstraction where it makes sense to model these things as one, and this also applies to the whole extended AI-Alignment/EA/Rationality ecosystem.
I do feel like this lens loses a lot of its validity at the highest levels of abstraction (like, I think there is a valid sense in which you should model AI x-risk concerned people as part of big-tech, but also, if you do that, you kind of ignore the central dynamic that is going on with the x-risk concerned people, and maybe that's the right call sometimes, but I think in terms of "what will the future of humanity be" in making that simplification you have kind of lost the plot)
If I'm wrong about this, I'd love to know.
My best guess is you are underestimating the level of adversarialness going on, though I am also uncertain about this. I would be interested in sharing notes some time.
As one concrete example, my guess is we both agree it would not make sen...
I have spent like 40% of the last 1.5 years trying to reform EA. I think I had a small positive effect, but it's also been extremely tiring and painful, and I consider my duty with regards to this done. Buy-in for reform among leadership is very low, and people seem primarily interested in short-term power-seeking and ass-covering.
The memo I mentioned in another comment has a bunch of analysis; I'll send it to you tomorrow when I am at my laptop.
For some more fundamental analysis I also have this post, though it's only a small part of the picture: https://www.lesswrong.com/posts/HCAyiuZe9wz8tG6EF/my-tentative-best-guess-on-how-eas-and-rationalists
I would also be interested in more of your thoughts on this.
I have a memo I thought I had shared with you at one point that I wrote for EA Coordination Forum 2023. It has a bunch of wrong stuff in it, and fixing it has been too difficult, but I could share it with you privately (with disclaimers on what is wrong). Feel free to DM me if I haven't.
@habryka are you able to share details/examples RE the actions you've taken to get the EA community to shut down or disappear?
Sharing my memo at the coordination forum is one such action I have taken. I have also advocated for various people to be fired, and have urged a number of external and internal stakeholders to reconsider their relationship with EA. Most of this has been kind of illegible and flaily, with me not really knowing how to do anything in the space without ending up with a bunch of dumb collateral damage and reciprocal escalation.
The leadership of these is mostly shared. There are many good parts of EA, and reform would be better than shutting down, but reform seems unlikely at this point.
My world model mostly predicts effects on technological development and the long term future dominate, so in as much as the non-AI related parts of EA are good or bad, I think what matters is their effect on that. Mostly the effect seems small, and quibbling over the sign doesn't super seem worth it.
I do think there is often an annoying motte and bailey going on where people try to critique EA for their negative effects in the important things, and those get redirected to "but you can't possibly be against bednets", and in as much as the bednet people are willingly participating in that (as seems likely the case for e.g. Open Phil's reputation), that seems bad.
Lightcone is banned from receiving any kind of OpenPhil funding
Why?
The reason for the ban is pretty crux-y. Is Lightcone banned because OpenPhil dislikes you, because you're so close that it would be a conflict of interest, or something else?
Good Ventures has banned OpenPhil from recommending grants to organizations working in the "rationalist community building" space (including for their non-"rationalist community building" work). I understand this to be because Dustin doesn't believe in that work and feels he suffers a bunch of reputational damage for funding it (IIRC, he said he'd be willing to suffer that reputational damage if he were personally excited by it). Lots more detail in the discussion on this post.
Perhaps the broader point here is that public relations is a complex art, of which we are mostly not even practitioners let alone masters. We should probably learn about it and get better.
I also want to note that there are probably psychological as well as societal defense mechanisms against someone trying to change your worldview. I don't know the name of the phenomenon, but this is essentially why counselors/therapists typically avoid giving advice or stating their opinion plainly; the client is prone to rebel against that advice or worldview. I'd suspect this happens because it's terribly dangerous to just let other people tell you how to think; you'll be taken advantage of rather quickly if you do. Obviously there are multiple routes around these defense mechanisms, since people do convince others to change their minds in both subtle and forceful ways. But we should probably learn the theory of how that happens, prior to triggering a bunch of defense mechanisms by going in swinging with amateur enthusiasm (and the unusual perspective of devoted rationalism).
Waiting to speak while polishing our approach seems foolish when time is short. I find very short timelines entirely plaus...
I'm imagining a future post about how society has defense mechanisms against people trying to focus on legitimacy[1] advising us to stop doing that so much :P
1. Public criticism of people trying to persuade the public.
2. Powerful actors refusing to go along with distributed / cooperative plans for the future.
3. Public criticism of anyone trying to make Our Side give up power over the future.
4. Conspiracy theories about what The Man is trying to persuade you of.
5. The evolution of an accelerationist movement who want to avoid anti-centralization measures.
First, I think that thinking about and highlighting these kinds of dynamics is important.
I expect that, by default, too few people will focus on analyzing such dynamics from a truth-seeking and/or instrumentally-useful-for-safety perspective.
That said:
Given that OP works for OpenAI, this post reads like when Marc Andreessen complains about the "gigantic amount of money in AI safety".
I think I disagree with some of the claims in this post and I'm mostly sympathetic with the points Akash raised in his comments. Relatedly, I'd like to see a more rigorous comparison between the AI safety community (especially EA/Rationality parts) and relevant reference class movements such as the climate change community.
That said, I think it's reasonable to have a high prior on people ending up aiming for inappropriate levels of power-seeking when taking ambitious actions in the world, so it's important to keep these things in mind.
In addition to y...
Claim 2: The world has strong defense mechanisms against (structural) power-seeking.
I disagree with this claim. It seems pretty clear that the world has defense mechanisms against
But it is possible to be power-seeking in other ways. The Gates Foundation has a lot of money and wants other billionaires' money for its cause too. It influences technology development. It has to work with dozens of governments, sometimes lobbying them. Normal think tanks exist to gain influence over govern...
I agree with many of the points expressed in this post, though something doesn't sit right with me about some of the language/phrasing used.
For example, the terms "power-seeking" and "cooperative" feel somewhat loaded. It's not so much that they're inaccurate (when read in a rather precise and charitable way) but moreso that it feels like they have pretty strong connotations and valences.
Consider:
Alice: I'm going to a networking event tonight; I might meet someone who can help me get a job in housing policy!
Bob: That's a power-seeking move.
Alice: Uh....
I can imagine plausible mechanisms for how the first four backlash examples were a consequence of perceived power-seeking from AI safetyists, but I don't see one for e/acc. Does someone have one?
Alternatively, what reason do I have to expect that there is a causal relationship between safetyist power-seeking and e/acc even if I can't see one?
e/acc has coalesced in defense of open-source, partly in response to AI safety attacks on open-source. This may well lead directly to a strongly anti-AI-regulation Trump White House, since there are significant links between e/acc and MAGA.
I think of this as a massive own goal for AI safety, caused by focusing too much on trying to get short-term "wins" (e.g. dunking on open-source people) that don't actually matter in the long term.
e/acc has coalesced in defense of open-source, partly in response to AI safety attacks on open-source. This may well lead directly to a strongly anti-AI-regulation Trump White House
IMO this overstates the influence of OS stuff on the broader e/acc movement.
My understanding is that the central e/acc philosophy is around tech progress. Something along the lines of "we want to accelerate technological progress and AGI progress as quickly as possible, because we think technology is extremely awesome and will lead to a bunch of awesome+cool outcomes." The support for OS is toward the ultimate goal of accelerating technological progress.
In a world where AI safety folks didn't say/do anything about OS, I would still suspect clashes between e/accs and AI safety folks. AI safety folks generally do not believe that maximally fast/rapid technological progress is good for the world. This would inevitably cause tension between the e/acc worldview and the worldview of many AI safety folks, unless AI safety folks decided never to propose any regulations that could cause us to deviate from the maximally-fast pathways to AGI. This seems quite costly.
(Separately, I agree that "dunking on open-...
In a world where AI safety folks didn't say/do anything about OS, I would still suspect clashes between e/accs and AI safety folks.
There's a big difference between e/acc as a group of random twitter anons, and e/acc as an organized political force. I claim that anti-open-source sentiment from the AI safety community played a significant role (and was perhaps the single biggest driver) in the former turning into the latter. It's much easier to form a movement when you have an enemy. As one illustrative example, I've seen e/acc flags that are a version of the libertarian flag saying "come and take it [our GPUs]". These are a central example of an e/acc rallying cry that was directly triggered by AI governance proposals. And I've talked to several principled libertarians who are too mature to get sucked into a movement by online meme culture, but who have been swung in that direction due to shared opposition to SB-1047.
Consider, analogously: Silicon Valley has had many political disagreements with the Democrats over the last decade—e.g. left-leaning media has continuously been very hostile to Silicon Valley. But while the incentives to push back were there for a long time, the organiz...
Presumably, at some point, some groups start advocating for specific policies that go against the e/acc worldview. At that point, it seems like you get the organized resistance.
My two suggestions:
This post is written in a spirit of constructive criticism. It's phrased fairly abstractly, in part because it's a sensitive topic, but I welcome critiques and comments below. The post is structured in terms of three claims about the strategic dynamics of AI safety efforts; my main intention is to raise awareness of these dynamics, rather than advocate for any particular response to them. Disclaimer: I work at OpenAI, although this is a personal post that was not reviewed by OpenAI.
Claim 1: The AI safety community is structurally power-seeking.
By “structurally power-seeking” I mean: tends to take actions which significantly increase its power. This does not imply that people in the AI safety community are selfish or power-hungry; or even that these strategies are misguided. Taking the right actions for the right reasons often involves accumulating some amount of power. However, from the perspective of an external observer, it’s difficult to know how much to trust stated motivations, especially when they often lead to the same outcomes as self-interested power-seeking.
Some prominent examples of structural power-seeking include:
To be clear, you can’t get anything done without being structurally power-seeking to some extent. However, I do think that the AI safety community is more structurally power-seeking than other analogous communities, such as most other advocacy groups. Some reasons for this disparity include:
Again, these are intended as descriptions rather than judgments. Traits like urgency, consequentialism, etc, are often appropriate. But the fact that the AI safety community is structurally power-seeking to an unusual degree makes it important to grapple with another point:
Claim 2: The world has strong defense mechanisms against (structural) power-seeking.
In general, we should think of the wider world as being very cautious about perceived attempts to gain power; and we should expect that such attempts will often encounter backlash. In the context of AI safety, some types of backlash have included:
These defense mechanisms often apply regardless of stated motivations. That is, even if there are good arguments for a particular policy, people will often look at the net effect on overall power balance when judging it. This is a useful strategy in a world where arguments are often post-hoc justifications for power-seeking behavior.
To be clear, it’s not necessary to avoid these defense mechanisms at all costs. It’s easy to overrate the effect of negative publicity; and attempts to avoid that publicity are often more costly than the publicity itself. But reputational costs do accumulate over time, and also contribute to a tribalist mindset of “us vs them” (as seen most notably in the open-source debate) which makes truth-seeking harder.
Note that most big companies (especially AGI companies) are strongly structurally power-seeking too, and this is a big reason why society at large is so skeptical of and hostile to them. I focused on AI safety in this post both because companies being power-seeking is an idea that's mostly "priced in", and because I think that these ideas are still useful even when dealing with other power-seeking actors.
Claim 3: The variance of (structurally) power-seeking strategies will continue to increase.
Those who currently take AGI and ASI seriously have opportunities to make investments (of money, time, social capital, etc) which will lead to much more power in the future if AI continues to become a much, much bigger deal.
But increasing attention to AI will also lead to increasingly high-stakes power struggles over who gets to control it. So far, we’ve seen relatively few such power struggles because people don’t believe that control over AI is an important type of power. That will change. To some extent this has already happened (with AI safety advocates being involved in the foundation of three leading AGI labs) but as power struggles become larger-scale, more people who are extremely good at winning them will become involved. That makes AI safety strategies which require power-seeking more difficult to carry out successfully.
How can we mitigate this issue? Two things come to mind. Firstly, focusing more on legitimacy. Work that focuses on informing the public, or creating mechanisms to ensure that power doesn’t become too concentrated even in the face of AGI, is much less likely to be perceived as power-seeking.
Secondly, prioritizing competence. Ultimately, humanity is mostly in the same boat: we're the incumbents who face displacement by AGI. Right now, many people are making predictable mistakes because they don't yet take AGI very seriously. We should expect this effect to decrease over time, as AGI capabilities and risks become less speculative. This consideration makes it less important that decision-makers are currently concerned about AI risk, and more important that they're broadly competent, and capable of responding sensibly to confusing and stressful situations, which will become increasingly common as the AI revolution speeds up.
EDIT: A third thing, which may be the most important takeaway in practice: the mindset that it's your job to "ensure" that things go well, or come up with a plan that's "sufficient" for things to go well, inherently biases you towards trying to control other people—because otherwise they might be unreasonable enough to screw up your plan. But trying to control others will very likely backfire for all the reasons laid out above. Worse, it might get you stuck in a self-reinforcing negative loop: the more things backfire, the more worried you are, and so the more control you try to gain, causing further backfiring... So you shouldn't be in that mindset unless you're literally the US President (and maybe not even then). Instead, your job is to make contributions such that, if the wider world cooperates with you, then things are more likely to go well. AI safety is in the fortunate position that, as AI capabilities steadily grow, more and more people will become worried enough to join our coalition. Let's not screw that up.