Here is a short post explaining some of my views on responsible scaling policies, regulation, and pauses. I wrote it last week in response to several people asking me to write something. Hopefully this helps clear up what I believe.
I don’t think I’ve ever hidden my views about the dangers of AI or the advantages of scaling more slowly and carefully. I generally aim to give honest answers to questions and present my views straightforwardly. I often point out that catastrophic risk would be lower if we could coordinate to build AI systems later and slower; I usually caveat that doing so seems costly and politically challenging and so I expect it to require clearer evidence of risk.
I think this post is quite misleading and unnecessarily adversarial.
I'm not sure if I want to engage further; I might give examples of this later. (See examples below.)
(COI: I often talk to and am friendly with many of the groups criticized in this post.)
Examples:
Thanks for the response; one quick clarification in case this isn't clear.
On:
For instance, I think that well implemented RSPs required by a regulatory agency can reduce risk to <5% (partially by stopping in worlds where this appears needed).
I assume this would be a crux with Connor/Gabe (and I think I'm at least much less confident in this than you appear to be).
It's worth noting here that I'm responding to this passage from the text:
In a saner world, all AGI progress should have already stopped. If we don’t, there’s more than a 10% chance we all die.
Many people in the AI safety community believe this, but they have not stated it publicly. Worse, they have stated different beliefs more saliently, which misdirect everyone else about what should be done, and what the AI safety community believes.
I'm responding to the "many people believe this" which I think implies that the groups they are critiquing believe this. I want to contest what these people believe, not what is actually true.
Like, many of these people think policy interventions other than a pause reduce x-risk below 10%.
Maybe I think something like (numbers not well considered):
As an aside, I think it's good for people and organizations (especially AI labs) to clearly state their views on AI risk, see e.g., my comment here. So I agree with this aspect of the post.
Stating clear views on what ideal government/international policy would look like also seems good.
(And I agree with a bunch of other misc specific points in the post like "we can maybe push the overton window far" and "avoiding saying true things to retain respectability in order to get more power is sketchy".)
(Edit: from a communication best practices perspective, I wish I had noted where I agree in the parent comment rather than here.)
(Conflict of interest note: I work at ARC, Paul Christiano's org. Paul did not ask me to write this comment. I first heard about the truck (below) from him, though I later ran into it independently online.)
There is an anonymous group of people called Control AI, whose goal is to convince people to be against responsible scaling policies because they insufficiently constrain AI labs' actions. See their Twitter account and website (also anonymous; Edit: now identifies Andrea Miotti of Conjecture as the director). (I first ran into Control AI via this tweet, which uses color-distorting visual effects to portray Anthropic CEO Dario Amodei in an unflattering light, in a way that's reminiscent of political attack ads.)
Control AI has rented a truck that has been circling London's Parliament Square. The truck plays a video of "Dr. Paul Christiano (Made ChatGPT Possible; Government AI adviser)" saying that there's a 10-20% chance of an AI takeover and an overall 50% chance of doom, and of Sam Altman saying that the "bad case" of AGI is "lights out for all of us". The back of the truck says "Responsible Scaling: No checks, No limits, No control". The video of Paul seems to me to be an attack...
The About Us page from the Control AI website has now been updated to say "Andrea Miotti (also working at Conjecture) is director of the campaign." This wasn't the case on the 18th of October.
Thumbs up for making the connection between the organizations more transparent/clear.
The video of Paul seems to me to be an attack on Paul (but see Twitter discussion here).
This doesn't seem right. As the people in the Twitter discussion you link say, it seems to mostly use Paul as a legitimate source of an x-risk probability, with maybe also a bit of critique of him having nevertheless helped build ChatGPT, but neither seems like an attack in a strictly negative sense. It feels like a relatively normal news snippet or something.
I feel confused about the truck. The video seems fine to me and seems kind of decent advocacy. The quotes used seem like accurate representations of what the people presented believe. The part that seems sad is that it might cause people to think the ones pictured also agree with other things that the responsible scaling website says, which seems misleading.
I don't particularly see a reason to dox the people behind the truck, though I am not totally sure. My bar against doxxing is pretty high, though I do care about people being held accountable for large scale actions they take.
Connor/Gabriel -- if you are connected with Control AI, I think it's important to make this clear, for a few reasons. First, if you're trying to drive policy change, people should know who you are, at minimum so they can engage with you. Second, I think this is particularly true if the policy campaign involves attacks on people who disagree with you. And third, because I think it's useful context for understanding this post.
This seems like a general-purpose case against anonymous political speech that contains criticism ("attacks") of the opposition. But put like that, it seems like there are lots of reasons people might want to speak anonymously (e.g. to shield themselves from unfair blowback). And your given reasons don't seem super persuasive - you can engage with people who say they agree with the message (or do broad-ranged speech of your own), reason 2 isn't actually a reason, and the post was plenty understandable to me without the context.
I believe you're wrong on your model of AI risk and you have abandoned the niceness/civilization norms that act to protect you from the downside of having false beliefs and help you navigate your way out of them. When people explain why they disagree with you, you accuse them of lying for personal gain rather than introspect about their arguments deeply enough to get your way out of the hole you're in.
First, this is a minor point where you're wrong, but it's also a sufficiently obvious point that it should hopefully make clear how wrong your world model is: AI safety community in general, and DeepMind + Anthropic + OpenAI in particular, have all made your job FAR easier. This should be extremely obvious upon reflection, so I'd like you to ask yourself how on earth you ever thought otherwise. CEOs of leading AI companies publicly acknowledging AI risk has been absolutely massive for public awareness of AI risk and its credibility. You regularly bring up how CEOs of leading AI companies acknowledge AI risk as a talking point, so I'd hope that on some level you're aware that your success in public advocacy would be massively reduced in the counterfactual case where the leading AI orgs are Google Brain, Meta, and NVIDIA, and their leaders were saying "AI risk? Sounds like sci-fi nonsense!"
Eighth, yes, working in AI capabilities is absolutely a reasonable alignment plan that raises odds of success immensely. I know, you're so overconfident on this point that even reading this will trigger you to dismiss my comment. And yet it's still true - and what's more, obviously so. I don't know how you and others egged each other into the position that it doesn't matter whether the people working on AI care about AI risk, but it's insane.
I agreed with most of your comment until this line. Is your argument that there's a lot of nuance to getting safety right, that we're plausibly in a world where alignment is hard but possible, and that the makers of AGI deeply caring about alignment, being cautious and not racing, etc., could push us over the line of getting alignment to work? I think this argument seems pretty reasonable, but you're overstating the case here, and this strategy could easily be net bad if you advance capabilities a lot. And "alignment is basically impossible unless something dramatically changes" also seems like a defensible position to me.
First, this is a minor point where you're wrong, but it's also a sufficiently obvious point that it should hopefully make clear how wrong your world model is: AI safety community in general, and DeepMind + Anthropic + OpenAI in particular, have all made your job FAR easier. This should be extremely obvious upon reflection, so I'd like you to ask yourself how on earth you ever thought otherwise. CEOs of leading AI companies publicly acknowledging AI risk has been absolutely massive for public awareness of AI risk and its credibility. You regularly bring up how CEOs of leading AI companies acknowledge AI risk as a talking point, so I'd hope that on some level you're aware that your success in public advocacy would be massively reduced in the counterfactual case where the leading AI orgs are Google Brain, Meta, and NVIDIA, and their leaders were saying "AI risk? Sounds like sci-fi nonsense!"
The fact that people disagree with your preferred method of reducing AI risk does not mean that they are EVIL LIARS who are MAKING YOUR JOB HARDER and DOOMING US ALL.
I disagree this is obviously wrong. I think you are not considering the correct counterfactual. From Connor L.'s point of...
Yeah, fair enough.
But I don't think that would be a sensible position. The correct counterfactual is in fact the one where Google Brain, Meta, and NVIDIA led the field. Like, if DM + OpenAI + Anthropic didn't exist - something he has publicly wished for - that is in fact the most likely situation we would find. We certainly wouldn't find CEOs who advocate for a total stop on AI.
(Note: I work with Paul at ARC theory. These views are my own and Paul did not ask me to write this comment.)
I think the following norm of civil discourse is super important: do not accuse someone of acting in bad faith, unless you have really strong evidence. An accusation of bad faith makes it basically impossible to proceed with discussion and seek truth together, because if you're treating someone's words as a calculated move in furtherance of their personal agenda, then you can't take those words at face value.
I believe that this post violates this norm pretty egregiously. It begins by saying that hiding your beliefs "is lying". I'm pretty confident that the sort of belief-hiding being discussed in the post is not something most people would label "lying" (see Ryan's comment), and it definitely isn't a central example of lying. (And so in effect it labels a particular behavior "lying" in an attempt to associate it with behaviors generally considered worse.)
The post then confidently asserts that Paul Christiano hides his beliefs in order to promote RSPs. The post presents very little evidence that this is what's going on, and Paul's account seems consistent ...
Man, I agree with almost all the content of this post, but dispute the framing. This seems like maybe an opportunity to write up some related thoughts about transparency in the x-risk ecosystem.
A few months ago, I had the opportunity to talk with a number of EA-aligned or x-risk-concerned folks working in policy or policy-adjacent roles as part of a grant evaluation process. My views here are informed by those conversations, but I am overall quite far from the action of AI policy stuff. I try to carefully flag my epistemic state regarding the claims below.
I think a lot of people, especially in AI governance, are...
A central example is promoting regulations on frontier AI systems because powerful AI systems could develop bio-weapons that could be misused to wipe out large swaths of humanity.
I think that most of the people promoting that policy agenda with that argumentation do in fact think that AI-developed bioweapons are a real risk over the next 15 years. And, I guess, many to most ...
If a person was asked point-blank about the risk of AI takeover, and they gave an answer that implied the risk was lower than they privately think it is, I would consider that a lie.
[...]
That said, my guess is that many of the people that I'm thinking of, in these policy positions, if they were asked, point blank, might lie in exactly that way. I have no specific evidence of that, but it does seem like the most likely way many of them would respond, given their overall policy about communicating their beliefs.
As a relevant piece of evidence here, Jason Matheny, when asked point-blank in a Senate committee hearing "how concerned should we be about catastrophic risks from AI?", responded with "I don't know", which seems like it qualifies as a lie by the standard you set here (which, to be clear, I don't super agree with; my intention here is partially to poke holes in your definition of a lie, while also sharing object-level relevant information).
See this video 1:39:00 to 1:43:00: https://www.hsgac.senate.gov/hearings/artificial-intelligence-risks-and-opportunities/
Quote (slightly paraphrased because transcription is hard):
...Senator Peters: "The last question be
If his beliefs are what I would have expected them to be (e.g. something like "agrees with the basic arguments laid out in Superintelligence, and was motivated to follow his current career trajectory by those arguments"), then this answer is, at best, misleading and a misrepresentation of his actual models.
Seeing this particular example, I'm on the fence about whether to call it a "lie". He was asked about the state of the world, not about his personal estimates, and he answered in a way that was more about the state of knowable public knowledge rather than his personal estimate. But I agree that seems pretty hair-splitting.
As it is, I notice that I'm confused.
Why wouldn't he say something to the effect of the following?
...I don't know; this kind of forecasting is very difficult, and timelines forecasting is very difficult. I can't speak with confidence one way or the other. However, my best guess from following the literature on this topic for many years is that the catastrophic concerns are credible. I don't know how probable it is, but it does not seem to me that AI leading to human extinction is merely an outlandish sci-fi scenario, and it is not out of the question that
I think your interpretation is fairly uncharitable. If you have further examples of this deceptive pattern from those sympathetic to AI risk, I would change my perspective, but the speculation in the post plus this example weren't compelling:
I watched the video, and firstly, Senator Peters seems to trail off after the quoted part and ends his question by saying "What's your assessment of how fast this is going and when do you think we may be faced with those more challenging issues?". So, straightforwardly, his question is about timelines, not about risk as you frame it. Indeed, Matheny (after two minutes) literally responds "it's a really difficult question. I think whether AGI is nearer or farther than thought ..." (emphasis different from yours), which makes it likely to me that Matheny is expressing uncertainty about timelines, not risk.
Overall I agree that this was an opportunity for Matheny to discuss AI x-risk, and plausibly it wasn't the best use of time to discuss the uncertainty of the situation. But calling this dishonesty doesn't seem well supported.
No, the question was about whether there are apocalyptic risks and on what timeline we should be concerned about apocalyptic risks.
The questioner used the term 'apocalyptic' specifically. Three people answered the question, and the first two both also alluded to 'apocalyptic' risks and sort of said that they didn't really think we need to think about that possibility. Them referring to apocalyptic risks goes to show that it was a key part of what the questioner wanted to understand — to what extent these risks are real and on what timeline we'll need to react to them. My read is not that Matheny actively misled the speaker, but that he avoided answering, which is "hiding" rather than "lying" (I don't agree with the OP that they're identical).
I think the question was unclear so it was more acceptable to not directly address whether there is apocalyptic risk, but I think many people I know would have definitely said "Oh to be clear I totally disagree with the previous two people, there are definitely apocalyptic risks and we are not prepared for them and cannot deal with them after-the-fact (as you just mentioned being concerned about)."
Extra detail on what happened
I agree that it is important to be clear about the potential for catastrophic AI risk, and I am somewhat disappointed in the answer above (though I think calling "I don't know" lying is a bit of a stretch). But on the whole, I think people have been pretty upfront about catastrophic risk, e.g. Dario has given an explicit P(doom) publicly, all the lab heads have signed the CAIS letter, etc.
Notably, though, that's not what the original post is primarily asking for: it's asking for people to clearly state that they agree that we should pause/stop AI development, not to clearly state that they think AI poses a catastrophic risk. I agree that people should clearly state that they think there's a catastrophic risk, but I disagree that people should clearly state that they think we should pause.
Primarily, that's because I don't actually think trying to get governments to enact some sort of a generic pause would make good policy. Analogizing to climate change, I think getting scientists to say publicly that they think climate change is a real risk helped the cause, but putting pressure on scientists to publicly say that environmentalism/degrowth/etc. would solve the problem has substantially hurt the cause (despite the fact that a magic button that halved consumption would probably solve climate change).
I agree with most of this, but I think the "Let me call this for what it is: lying for personal gain" section is silly and doesn't help your case.
The only sense in which it's clear that it's "for personal gain" is that it's lying to get what you want.
Sure, I'm with you that far - but if what someone wants is [a wonderful future for everyone], then that's hardly what most people would describe as "for personal gain".
By this logic, any instrumental action taken towards an altruistic goal would be "for personal gain".
That's just silly.
It's unhelpful too, since it gives people a somewhat legitimate reason to dismiss the broader point.
Of course it's possible that the longer-term altruistic goal is just a rationalization, and people are after power for its own sake, but I don't buy that this is often true - at least not in any clean [they're doing this and only this] sense. (one could have similar altruistic-goal-is-rationalization suspicions about your actions too)
In many cases, I think overconfidence is sufficient explanation.
And if we get into "Ah, but isn't it interesting that this overconfidence leads to power gain", then I'd agree - but then I claim that you should distinguish [con...
The only sense in which it's clear that it's "for personal gain" is that it's lying to get what you want.
Sure, I'm with you that far - but if what someone wants is [a wonderful future for everyone], then that's hardly what most people would describe as "for personal gain".
If Alice lies in order to get influence, with the hope of later using that influence for altruistic ends, it seems fair to call the influence Alice gets 'personal gain'. After all, it's her sense of altruism that will be promoted, not a generic one.
This is not what most people mean by "for personal gain". (I'm not disputing that Alice gets personal gain)
Insofar as the influence is required for altruistic ends, aiming for it doesn't imply aiming for personal gain.
Insofar as the influence is not required for altruistic ends, we have no basis to believe Alice was aiming for it.
"You're just doing that for personal gain!" is not generally taken to mean that you may be genuinely doing your best to create a better world for everyone, as you see it, in a way that many would broadly endorse.
In this context, an appropriate standard is the post's own:
Does this "predictably lead people to believe false things"?
Yes, it does. (if they believe it)
"Lying for personal gain" is a predictably misleading description, unless much stronger claims are being made about motivation (and I don't think there's sufficient evidence to back those up).
The "lying" part I can mostly go along with. (though based on a contextual 'duty' to speak out when it's unusually important; and I think I'd still want to label the two situations differently: [not speaking out] and [explicitly lying] may both be undesirable, but they're not the same thing)
(I don't really think in terms of duties, but it's a reasonable shorthand here)
I'm happy to state on the record that, if I had a magic button that I could press that would stop all AGI progress for 50 years, I would absolutely press that button. I don't agree with the idea that it's super important to trot everyone out and get them to say that publicly, but I'm happy to say it for myself.
I would like to observe to onlookers that you did in fact say something similar in your post on RSPs. Your very first sentence was:
Recently, there’s been a lot of discussion and advocacy around AI pauses—which, to be clear, I think is great: pause advocacy pushes in the right direction and works to build a good base of public support for x-risk-relevant regulation.
How do you feel about "In an ideal world, we'd stop all AI progress"? Or "ideally, we'd stop all AI progress"?
We should shut it all down.
We can't shut it all down.
The consequences of trying to shut it all down and failing, as we very likely would, could actually raise the odds of human extinction.
Therefore we don't know what to publicly advocate for.
These are the beliefs I hear expressed by most serious AI safety people. They are consistent and honest.
For instance, see https://forum.effectivealtruism.org/posts/JYEAL8g7ArqGoTaX6/ai-pause-will-likely-backfire.
That post makes two good points:
A pause would: 2) increase the chance of a “fast takeoff” in which one or a handful of AIs rapidly and discontinuously become more capable, concentrating immense power in their hands; 3) push capabilities research underground, and to countries with looser regulations and safety requirements.
Obviously these don't apply to a permanent, complete shutdown. And they're not entirely convincing even for a pause.
My point is that the issue is complicated.
A complete shutdown seems impossible to maintain for all of humanity. Someone is going to build AGI. The question is who and how.
The call for more honesty is appreciated. We should be honest, and include "obviously we should just not do it". But you don't get many words when speaking publicly, so making those your primary point is a questionable strategy.
hiding your beliefs, in ways that predictably lead people to believe false things, is lying. This is the case regardless of your intentions, and regardless of how it feels.
I think people generally lie WAY more than we realize, and most lies are lies of omission. I don't think deception is usually the immediate motivation; it's more a kind of social convenience. Maintaining social equilibrium is valued over openness or honesty regarding relevant beliefs that may come up in everyday life.
ARC & Open Philanthropy state in a press release “In a sane world, all AGI progress should stop. If we don’t, there’s more than a 10% chance we will all die.”
Could you spell out what you mean by "in a sane world"? I suspect a bunch of people you disagree with do not favor a pause due to various empirical facts about the world (e.g., there being competitors like Meta).
hiding your beliefs, in ways that predictably lead people to believe false things, is lying
I think this has got to be tempered by Grice to be accurate. Like, if I don't bring up some unusual fact about my life in a brief conversation (e.g. that I consume iron supplements once a week), this predictably leads people to believe something false about my life (that I do not consume iron supplements once a week), but is not reasonably understood as the bad type of lie - otherwise to be an honest person I'd have to tell everyone tons of minutiae about myself ...
Upvoted, and thanks for writing this. I disagree on multiple dimensions - on the object level, I don't think ANY research topic can be stopped for very long, and I don't think AI specifically gets much safer with any achievable finite pause, compared to a slowdown and standard of care for roughly the same duration. On the strategy level, I wonder what other topics you'd use as support for your thesis (if you feel extreme measures are correct, advocate for them). US Gun Control? Drug legalization or enforcement? Private capital...
Counterpoint: we are better off using what political/social capital we have to advocate for more public funding in AI alignment. I think of slowing down AI capabilities research as just a means of buying time to get more AI alignment funding - but essentially useless unless combined with a strong effort to get money into AI alignment.
Hmm, I'm in favor of an immediate stop (and of people being more honest about their beliefs) but in my experience the lying / hiding frame doesn't actually describe many people.
This is maybe even harsher than what you said in some ways, but to me it feels more like even very bright alignment researchers are often confused and getting caught in shell games with alignment, postulating that we'll be able to build "human level" AI, which somehow just doesn't do a bunch of bad things that smart humans are clearly capable of. And if even the most technical peopl...
People who think that it's deontologically fine to remain silent might not come out and say it.
Consider what happens when a community rewards the people who gain more influence by lying!
This is widely considered a better form of government than hereditary aristocracies.
I agree with others to a large degree about the framing/tone/specific words not being great, though I agree with a lot of the post itself. But really that's what this whole post is about: that dressing up your words and saying partial, in-the-middle positions can harm the environment of discussion. That saying what you truly believe then lets you argue down from that, rather than doing the arguing down against yourself, and implicitly against all the other people who hold a similar ideal belief as you. I've noticed similar facets of what the post gestures at,...
let us be clear: hiding your beliefs, in ways that predictably lead people to believe false things, is lying. This is the case regardless of your intentions, and regardless of how it feels.
Not only is it morally wrong, it makes for a terrible strategy. As it stands, the AI Safety Community itself cannot coordinate to state that we should stop AGI progress right now!
Some dynamics and gears in world models are protected secrets, when they should be open-sourced and researched by more people, and other gears are open-sourced and researched by too many peopl...
I don't see the practical value of a post that starts off with conjecture rather than reality; i.e., "In a saner world...."
You clearly wish that things were different, that investors and corporate executives would simply stop all progress until ironclad safety mechanisms were in place, but wishing doesn't make it so.
Isn't the more pressing problem what can be done in the world that we have, rather than in a world that we wish we had?
Too many claimed to pursue the following approach:
- It would be great if AGI progress stopped, but that is infeasible.
- Therefore, I will advocate for what I think is feasible, even if it is not ideal.
- The Overton window being what it is, if I claim a belief that is too extreme, or endorse an infeasible policy proposal, people will take me less seriously on the feasible stuff.
- Given this, I will be tactical in what I say, even though I will avoid stating outright lies.
I think if applied strictly to people identified by this list, the post is reasonable. I ...
I think politics often involves bidding for the compromise you think is feasible, rather than what you'd ideally want.
What's maybe different in the AI risk case, and others like it, is how you'll be regarded when things go wrong.
hypothetical scenario
(Co-written by Connor Leahy and Gabe)
We have talked to a whole bunch of people about pauses and moratoriums. Members of the AI safety community, investors, business peers, politicians, and more.
Too many claimed to pursue the following approach:
- It would be great if AGI progress stopped, but that is infeasible.
- Therefore, I will advocate for what I think is feasible, even if it is not ideal.
- The Overton window being what it is, if I claim a belief that is too extreme, or endorse an infeasible policy proposal, people will take me less seriously on the feasible stuff.
- Given this, I will be tactical in what I say, even though I will avoid stating outright lies.
Consider if this applies to you, or people close to you.
If it does, let us be clear: hiding your beliefs, in ways that predictably lead people to believe false things, is lying. This is the case regardless of your intentions, and regardless of how it feels.
Not only is it morally wrong, it makes for a terrible strategy. As it stands, the AI Safety Community itself cannot coordinate to state that we should stop AGI progress right now!
Not only can it not coordinate, the AI Safety Community is defecting, by making it more costly for people who do say it to say it.
We all feel like we are working on the most important things, and that we are being pragmatic realists.
But remember: If you feel stuck in the Overton window, it is because YOU ARE the Overton window.
—
1. The AI Safety Community is making our job harder
In a saner world, all AGI progress should have already stopped. If we don’t, there’s more than a 10% chance we all die.
Many people in the AI safety community believe this, but they have not stated it publicly. Worse, they have stated different beliefs more saliently, which misdirect everyone else about what should be done, and what the AI safety community believes.
To date, in our efforts to inform, motivate, and coordinate with people, people in the AI Safety Community publicly lying has been one of the biggest direct obstacles we have encountered.
The newest example of this is “Responsible Scaling Policies”, with many AI Safety people being much more vocal about their endorsement of RSPs than their private belief that in a saner world, all AGI progress should stop right now.
Because of them, we have been told many times that we are a minority voice, and that most people in the AI Safety community (read: Open Philanthropy adjacent) disagree that we should stop all AGI progress right now.
That actually, there is an acceptable way to continue scaling! And given that this makes things easier, if there is indeed an acceptable way to continue scaling, this is what we should do, rather than stop all AGI progress right now!
Recently, Dario Amodei (Anthropic CEO), has used the RSP to frame the moratorium position as the most extreme version of an extreme position, and this is the framing that we have seen used over and over again. ARC mirrors this in their version of the RSP proposal, describing itself as a “pragmatic middle ground” between a moratorium and doing nothing.
Obviously, all AGI Racers use this against us when we talk to people.
There are very few people that we have consistently seen publicly call for a stop to AGI progress. The clearest ones are Eliezer’s “Shut it All Down” and Nate’s “Fucking stop”.
The loudest silence is from Paul Christiano, whose RSPs are being used to safety-wash scaling.
Proving me wrong is very easy. If you do believe that, in a saner world, we would stop all AGI progress right now, you can just write this publicly.
When called out on this, most people we talk to just fumble.
2. Lying for Personal Gain
We talk to many people who publicly lie about their beliefs.
The justifications are always the same: “it doesn’t feel like lying”, “we don’t state things we do not believe”, “we are playing an inside game, so we must be tactical in what we say to gain influence and power”.
Let me call this for what it is: lying for personal gain. If you state things whose main purpose is to get people to think you believe something else, and you do so to gain more influence and power: you are lying for personal gain.
The results of this “influence and power-grabbing” have many times over materialised as the safety-washing of the AGI race. What a coincidence it is that DeepMind, OpenAI, and Anthropic are all related to the AI Safety community.
The only benefit we see from this politicking is that the people lying gain more influence, while the time we have left to AGI keeps getting shorter.
Consider what happens when a community rewards the people who gain more influence by lying!
—
So many people lie, and they screw not only humanity, but one another.
Many AGI corp leaders will privately state that in a saner world, AGI progress should stop, but they will not state it because it would hurt their ability to race against each other!
Safety people will lie so that they can keep ties with labs in order to “pressure them” and seem reasonable to politicians.
Whatever: they just lie to gain more power.
“DO NOT LIE PUBLICLY ABOUT GRAVE MATTERS” is a very strong baseline. If you want to defect, you need a much stronger reason than “it will benefit my personal influence, and I promise I’ll do good things with it”.
And you need to accept the blame when you're called out. You should not muddy the waters by justifying your lies, covering them up, telling people they misunderstood, and trying to maintain more influence within the community.
We have seen so many people be taken in this web of lies: from politicians and journalists, to engineers and intellectuals, all up until the concerned EA or regular citizen who wants to help, but is confused by our message when it looks like the AI safety community is ok with scaling.
Your lies compound and make the world a worse place.
There is an easy way to fix this situation: we can adopt the norm of publicly stating our true beliefs about grave matters.
If you know someone who claims to believe that in a saner world we should stop all AGI progress, tell them to publicly state their beliefs, unequivocally. Very often, you’ll see them fumbling, caught in politicking. And not that rarely, you’ll see that they actually want to keep racing. In these situations, you might want to stop finding excuses for them.
3. The Spirit of Coordination
A very sad thing that we have personally felt is that it looks like many people are so tangled in these politics that they do not understand what the point of honesty even is.
Indeed, from the inside, it is not obvious that honesty is a good choice. If you are honest, publicly honest, or even adversarially honest, you just make more opponents, you have less influence, and you can help less.
This is typical deontology vs consequentialism. Should you be honest, if from your point of view, it increases the chances of doom?
The answer is YES.
a) Politicking has many more unintended consequences than expected.
Whenever you lie, you shoot potential allies at random in the back.
Whenever you lie, you make it more acceptable for people around you to lie.
b) Your behavior, especially if you are a leader, a funder or a major employee (first 10 employees, or responsible for >10% of the headcount of the org), ripples down to everyone around you.
People lower in the respectability/authority/status ranks do defer to your behavior.
People outside of these ranks look at you.
Our work toward stopping AGI progress becomes easier whenever a leader/investor/major employee at OpenAI, DeepMind, Anthropic, ARC, Open Philanthropy, etc. states their beliefs about AGI progress more clearly.
c) Honesty is Great.
Existential risks from AI are now going mainstream. Academics talk about it. Tech CEOs talk about it. You can now talk about it, not be a weirdo, and gain more allies. Polls show that even non-expert citizens express diverse opinions about superintelligence.
Consider the following timeline:
Whenever you lie for personal gain, you fuck up this timeline.
When you start being publicly honest, you will suffer a personal hit in the short term. But we truly believe that, coordinated and honest, we will have timelines much longer than any Scaling Policy will ever get us.