I think this is missing the most important consideration: centralization would likely massively slow down capabilities progress.
As a point of comparison - do you think the US nuclear programme was substantially slowed down because it was a centralized government programme?
If you mean the Manhattan Project: no. IIUC there were basically zero Western groups and zero dollars working toward the bomb before that, so the Manhattan Project clearly sped things up. That's not really a case of "centralization" so much as doing-the-thing-at-all vs not-doing-the-thing-at-all.
If you mean fusion: yes. There were many fusion projects in the sixties, people were learning quickly. Then the field centralized, and progress slowed to a crawl.
The current boom in fusion energy startups seems to have been set off by deep advances in materials science (e.g. magnets), electronics, and manufacturing. These bottlenecks were likely the main reason fusion energy was not possible in the 60s. On priors it is more likely that centralisation was a result rather than a cause of fusion being hard.
On my understanding, the push for centralization came from a specific faction whose pitch was basically:
... and that faction mostly won the competition for government funding for about half a century.
The current boom accepted that faction's story at face value, but then noticed that new materials allowed the same "scale up the tokamaks" strategy to be executed on a budget achievable with private funding, and therefore they could fund projects without having to fight the faction which won the battle for government funding.
The counterfactual which I think is probably correct is that there exist entirely different designs far superior to tokamaks, which don't require that much scale in the first place, but which were never discovered because the "scale up the tokamaks" faction basically won the competition for funding and stopped most research on alternative designs from happening.
Thanks! Great point.
We do say:
Bureaucracy. A centralised project would probably be more bureaucratic.
But you're completely right that we frame this as a reason that centralisation might not increase the lead over China, and therefore treat it as a point against centralisation.
Whereas you're presumably saying that slowing down progress would buy us more time to solve alignment, and so would treat it as a significant point in favour of centralisation.
I personally don't favour bureaucracy that slows things down and reduces competence in a non-targeted way -- I think competently prioritising work to reduce AI risk during the AI transition will be important. But I think your position is reasonable here.
I was starting to draft a very similar post. I was looking through all of the comments on this short form that posed a similar question.
I stopped writing that draft when I saw and thought about this comment:
Something I'm worried about now is some RFK Jr/Dr. Oz equivalent being picked to lead on AI...
That is pretty clearly what would happen if a US-led effort was launched soon. So, I quit weighing the upsides against that huge downside.
It is vaguely possible that Trump could be persuaded to put such a project into responsible hands. One route to that is working in cooperation with the EU and other allied nations. But Trump isn't likely to cede that much influence over such an important project. The US is far, far ahead of its allies, so cutting them in as equal partners seems unlikely.
I was thinking about writing a post called "an apollo project for AGI is a bad idea for the near future" making the above point. But it seems kind of obvious.
Trump will appoint someone who won't get and won't care about the dangers; they'll YOLO it; we'll die unless alignment turns out to be ridiculously easy. Bad idea.
Now, how to say that in policy debates? I don't know.
But it seems kind of obvious.
Then you should probably write the post, if you wanted to. It is emphatically not obvious to many other people.
In some ways, this would be better if you can get universal buy-in, since there wouldn't be a race for completion. There might be a race for alignment to particular subgroups? Which could be better or worse, depending.
Also, securing it against bringing insights and know-how back to a clandestine single-nation competitor seems like it would be very difficult. Like, if we had this kind of project being built, do I really believe there won't be spies passing everything it learns to underground data centers and teams of researchers in Moscow and Washington? And that governments will consistently put more effort into the shared project than the secret one?
That would seem to be better. As long as Putin and similar don't get root access to an AGI as a result.
Here's a separate comment for a separate point:
I definitely don't find centralization inevitable. I have argued that the US government will very likely take control of AGI projects before they're transformative. But I don't think they'll centralize them. Soft Nationalization: How the US Government Will Control AI Labs lists many legal ways the government could exert pressure and control on AGI labs. I think that still severely underestimates the potential for government control without nationalization. The government can and has simply exerted emergency powers in extreme situations. Developing AGI, properly understood, is definitely an extreme situation. If that were somehow ruled an executive overreach, Congress can simply pass new laws. And prior to or failing all of that, the government can, probably will, and may already have had some federal agent just show up and say "we just need to understand what's happening and how important decisions are being made; nothing formal, we don't want to have to take over and cause you trouble and a slowdown, so just keep us in the loop and there won't be a problem".
Taking control of matters of extreme national importance is the government's job. It will do that as soon as it gets its collective head around how immense a deal AGI will be.
However, I don't think they'll centralize AGI, for two reasons. First, John Wentworth is very likely correct that it would slow down progress, probably a lot. Bureaucracy does that. Second, the incoming administration believes this, whether or not it's true.
A "manhattan project" would probably be soft government involvement, and just throwing more money into the race dynamics. That's what would get us to AGI fastest.
However, see AE Studio's arguments and evidence for conservative lawmakers being actually pretty receptive to x-risk arguments. But I have a hard time imagining Trump either being that cautiously inclined, even if he does believe in the risks to some degree, or keeping his meaty little fingers off of what is starting to look like maybe the most important project of our time.
So unfortunately I think the answer is pretty clearly no, not during a Trump presidency.
"The government can and has simply exerted emergency powers in extreme situations. Developing AGI, properly understood, is definitely an extreme situation. If that were somehow ruled an executive overreach, congress can simply pass new laws."
-> How likely do you think it is that there's clear consensus on AGI being an extreme situation, and at what point in the trajectory? I definitely agree that if there were consensus the USG would take action. But I'm kind of worried that things will be messy and unclear and different groups will have different narratives, etc.
Strong upvote - well laid out, clear explanation of your position and reasoning, I learned things.
Overall I think the lines of thought all make sense, but they seem to me to hinge entirely on your assigning a low probability to AI takeover scenarios, which you point out you have not modeled. I mean this in the sense that power concentration risks, as described, are only meaningful in scenarios where the power resides with the humans that create the AI, rather than the AI. Relatedly, the only way power concentration risks are lower in the non-centralization branch is if multiple projects yield AGI before any of them become particularly powerful, whereas this post assumes China would not be able to catch up to the hypothetical unified US project. I see the graphs showing a longer US lead time in the latter scenario, but I do not know if I agree the effect would be large enough to matter.
In other words, if instead you believed AI takeover scenarios were likely, or that the gap from human to superhuman level were small, then it wouldn't really matter how many projects there are that are close to AGI, only the quality of the one that gets there first. I don't want whoever-is-in-charge-at-the-DOD to be in control of the ultimate fate of humanity forever. I don't particularly want any private corporation to have that power either. I would, however, prefer that almost any human group be in such a position, rather than that humanity unintentionally lose control of its future and be permanently disempowered or destroyed.
Of course, the terms AGI, human level, and superhuman level are abstractions and approximations anyway, I get that. I personally am not convinced there's much difference between human and superhuman, and think that by the time we get robust-human-quality-thinking, any AI will already be sufficiently superhuman in other areas that we'll be well past human level overall.
My take - lots of good analysis, but there are a few crucial mistakes/weaknesses that throw the conclusions into significant doubt:
The USG will be able and willing to either provide or mandate strong infosecurity for multiple projects.
I simply don't buy that the infosec for multiple such projects will be anywhere near the infosec of a single project because the overall security ends up being that of the weakest link.
Additionally, the more projects there are with a particular capability, the more folk there are who can leak information either by talking or by being spies.
The probability-weighted impacts of AI takeover or the proliferation of world-ending technologies might be high enough to dominate the probability-weighted impacts of power concentration.
Comment: We currently doubt this, but we haven’t modelled it out, and we have lower p(doom) from misalignment than many (<10%).
Seems entirely plausible to me that either one could dominate. Would love to see more analysis around this.
Reducing access to these services will significantly disempower the rest of the world: we’re not talking about whether people will have access to the best chatbots or not, but whether they’ll have access to extremely powerful future capabilities which enable them to shape and improve their lives on a scale that humans haven’t previously been able to.
If you're worried about this, I don't think you quite realise the stakes. Capabilities mostly proliferate anyway. People can wait a few more years.
Thanks for the pushback!
Reducing access to these services will significantly disempower the rest of the world: we’re not talking about whether people will have access to the best chatbots or not, but whether they’ll have access to extremely powerful future capabilities which enable them to shape and improve their lives on a scale that humans haven’t previously been able to.
If you're worried about this, I don't think you quite realise the stakes. Capabilities mostly proliferate anyway. People can wait a few more years.
Our worry here isn't that people won't get to enjoy AI benefits for a few years. It's that there will be a massive power imbalance between those with access to AI and those without. And that could have long-term effects.
I maintain my position that you're missing the stakes if you think that's important. Even limiting ourselves strictly to concentration of power worries, risks of totalitarianism dominate these concerns.
I think that massive power imbalance (even over short periods) significantly increases the risk of totalitarianism.
Regulation to reduce racing. Government regulation could temper racing between multiple western projects. So there are ways to reduce racing between western projects, besides centralising.
Can you say more about the kinds of regulations you're envisioning? What are your favorite ideas for regulations for (a) the current Overton Window and (b) a wider Overton Window but one that still has some constraints?
I disagree with some of the claims made here, and I think there several worldview assumptions that go into a lot of these claims. Examples include things like "what do we expect the trajectory to ASI to look like", "how much should we worry about AI takeover risks", "what happens if a single actor ends up controlling [aligned] ASI", "what kinds of regulations can we reasonably expect absent some sort of centralized USG project", and "how much do we expect companies to race to the top on safety absent meaningful USG involvement." (TBC though I don't think it's the responsibility of the authors to go into all of these background assumptions– I think it's good for people to present claims like this even if they don't have time/space to give their Entire Model of Everything.)
Nonetheless, I agree with the bottom-line conclusion: on the margin, I suspect it's more valuable for people to figure out how to make different worlds go well than to figure out which "world" is better. In other words, asking "how do I make Centralized World or Noncentralized World more likely to go well" rather than "which one is better: Centralized World or Noncentralized World?"
More specifically, I think more people should be thinking: "Assume the USG decides to centralize AGI development or pursue some sort of AGI Manhattan Project. At that point, the POTUS or DefSec calls you in and asks you if you have any suggestions for how to maximize the chance of this going well. What do you say?"
One part of my rationale: the decisions about whether or not to centralize will be much harder to influence than decisions about what particular kind of centralized model to go with or what the implementation details of a centralized project should look like. I imagine scenarios in which the "whether to centralize" decision is largely a policy decision that the POTUS and the POTUS's close advisors make, whereas the decision of "how do we actually do this" is something that would be delegated to people lower down the chain (who are both easier to access and more likely to be devoting a lot of time to engaging with arguments about what's desirable.)
I liked various parts of this post and agree that this is an under-discussed but important topic. I found it a little tricky to understand the information security section. Here are a few disagreements (or possibly just confusions).
A single project might motivate more serious attacks, which are harder to defend against.
- It might also motivate earlier attacks, such that the single project would have less total time to get security measures into place.
In general, I think it's more natural to think about how expensive an attack will be and how harmful that attack would be if it were successful, rather than reasoning about when an attack will happen.
Here I am imagining that you think a single project could motivate earlier attacks because US adversaries are overall more concerned about the US's AI ambitions, or because AI progress is faster and it's more useful to steal a model. It's worth noting that stealing AI models while progress is mostly due to scaling, and models are not directly dangerous or automating AI R&D, doesn't seem particularly harmful (in that it's unlikely to directly cause a GCR or significantly speed up the stealer's AGI project). So overall, I'm not sure whether you think the security situation is better or worse in the case of earlier attacks.
A single project could have *more* attack surface, if it's sufficiently big. Some attack surface scales with the number of projects (like the number of security systems), but other kinds of attack surface scale with total size (like the number of people or buildings). If a single project were sufficiently bigger than the sum of the counterfactual multiple projects, it could have more attack surface and so be less infosecure.
I don't really understand your model here. I think naively you should be comparing a central US project to multiple AI labs' projects. My current impression is that for a fixed amount of total AI lab resources the attack surface will likely decrease (e.g. you only need to verify that one set of libraries is secure, rather than 3 somewhat different sets of libraries). If you are comparing just one frontier lab to a large single project then I agree the attack surface could be larger, but that seems like the wrong comparison.
I don't understand the logic of step 2 of the following argument.
- If it's harder to steal the weights, fewer actors will be able to do so.
- China is one of the most resourced and competent actors, and would have even stronger incentives to steal the weights than other actors (because of race dynamics).
- So it's more likely that centralising reduces proliferation risk, and less likely that it reduces the chance of China stealing the weights.
I think that China has stronger incentives than many other nations to steal the model (because it is politically and financially cheaper for them) but making it harder to steal the weights still makes it more costly for China to steal the weights and therefore they are less incentivised. You seem to be saying that it makes them more incentivised to steal the weights but I don't quite follow why.
For national security reasons, it would be strange to assume that there is no coordination among the US firms already. And... are we really sure that China is behind in the AGI race?
Tom did the original thinking; Rose helped with later thinking, structure and writing.
Some plans for AI governance involve centralising western AGI development.[1] Would this actually be a good idea? We don’t think this question has been analysed in enough detail, given how important it is. In this post, we’re going to:
(If at this point you’re thinking ‘this is all irrelevant, because centralisation is inevitable’, we disagree! We suggest you read the appendix, and then consider if you want to read the rest of the post.)
On 2, we’re going to present:
Overall, we think the best path forward is to increase the chances we get to good versions of either a single or multiple projects, rather than to increase the chances we get a centralised project (which could be good or bad). We’re excited about work on:
What are the strategic implications of having one instead of several projects?
What should we expect to vary with the number of western AGI development projects?
At a very abstract level, if we start out with some blobs, and then mush them into one blob, there are a few obvious things that change:
Summary table
| | Implications of a single project | Key uncertainties |
|---|---|---|
| Race dynamics | Less racing between western projects: no competing projects. Unclear implications for racing with China: the US might speed up or slow down, and China might speed up too. | Do ‘races to the top’ on safety outweigh races to the bottom? How effectively can government regulation reduce racing between multiple western projects? Will the speedup from compute amalgamation outweigh other slowdowns for the US? How much will China speed up in response to US centralisation? How much stronger will infosecurity be for a centralised project? |
| Power concentration | Greater concentration of power: no other western AGI projects; less access to advanced AI for the rest of the world; greater integration with USG. | How effectively can a single project make use of market mechanisms and checks and balances? How much will power concentrate anyway with multiple projects? |
| Infosecurity | Unclear implications for infosecurity: fewer systems but not necessarily fewer security components overall; more resources, but USG provision or R&D breakthroughs could mitigate this for multiple projects; might provoke larger, earlier attacks. | How much bigger will a single project be? How strong can infosecurity be for multiple projects? Will a single project provoke more serious attacks? |
Race dynamics
One thing that changes if western AGI development gets centralised is that there are fewer competing AGI projects.
When there are multiple AGI projects, there are incentives to move fast to develop capabilities before your competitors do. These incentives could be strong enough to cause projects to neglect other features we care about, like safety.
What would happen to these race dynamics if the number of western AGI projects were reduced to one?
Racing between western projects
At first blush, it seems like there would be much less incentive to race between western projects if there were only one project, as there would be no competition to race against.
This effect might not be as big as it initially seems though:
Also, competition can incentivise races to the top as well as races to the bottom. Competition could create incentives to:
It’s not clear how races to the top and races to the bottom will net out for AGI, but the possibility of races to the top is a reason to think that racing between multiple western AGI projects wouldn’t be as negative as you’d otherwise think.
Having one project would mean less racing between western projects, but maybe not a lot less (as the counterfactual might be well-regulated projects with races to the top on safety).
Racing between the US and China
How would racing between the US and China change if the US only had one AGI project?
The main lever that could change the amount of racing is the size of the lead between the US and China: the bigger the US’s lead, the less incentive there is for the US to race (and the smaller the lead, the more there’s an incentive).[3]
Somewhat paradoxically, this means that speeding up US AGI development could reduce racing, as the US has a larger lead and so can afford to go more slowly later.
Speeding up US AGI development gives the US a bigger lead, which means they have more time to pause later and can afford to race less.
At first blush, it seems like centralising US AGI development would reduce racing with China, because amalgamating all western compute would speed up AGI development.
However, there are other effects which could counteract this, and it’s not obvious how they net out:
So it’s not clear whether having one project would increase or decrease racing between the US and China.
Why do race dynamics matter?
Racing could make it harder for AGI projects to:
This would increase AI takeover risk, risks from proliferation, and the risk of coups (as mitigating all of these risks takes time and investment).
It might also matter who wins the race, for instance if you think that some projects are more likely than others to:
Many people think that this means it’s important for the US to develop AGI before China. (This is about who wins the race, not strictly about how much racing there is. But these things are related: the more likely the US is to win a race, the less intensely the US needs to race.[4])
Power concentration
If western AGI development gets centralised, power would concentrate: the single project would have a lot more power than any individual project in a multiple project scenario.
There are a few different mechanisms by which centralising would concentrate power:
If multiple projects compete to sell AI services to the rest of the world, the rest of the world will be more empowered.
With multiple projects there would be more independent centres of power (red diamonds).
How much more concentrated would power be if western AGI development were centralised?
Partly, this depends on how concentrated power would become in a multiple project scenario: if power would concentrate significantly anyway, then the additional concentration from centralisation would be less significant. (This is related to how inevitable a single project is - see this appendix.)
And partly this depends on how easy it is to reduce power concentration by designing a single project well.[5] A single project could be designed with:
But these mechanisms would be less robust than having multiple projects at reducing power concentration: any market mechanisms and checks and balances would be a matter of policy, not competitive survival, so they would be easier to go back on.
Having one project might massively increase power concentration, but also might just increase it a bit (if it’s possible to have a well-designed centralised project with market mechanisms and checks and balances).
Why does power concentration matter?
Power concentration could:
Infosecurity
Another thing that changes if western AGI development gets centralised is that there’s less attack surface:
Some attack surface scales with the number of projects.
At the same time, a single project would probably have more resources to devote to infosecurity:
So all else equal, it seems that centralising western AGI development would lead to stronger infosecurity.
But all else might not be equal:
If a single project is big enough, it would have more attack surface than multiple projects (as some attack surface scales with total size).
It’s not clear whether having one project would reduce the chance that the weights are stolen. We think that it would be harder to steal the weights of a single project, but the incentive to do so would also be stronger – it’s not clear how these balance out.
Why does infosecurity matter?
The stronger infosecurity is, the harder it is for:
If we’re right that centralising western AGI development would make it harder to steal the weights, but also increase the incentive to do so, then the effect of centralising might be more important for reducing proliferation risk than for preventing China stealing the weights:
What is the best path forwards, given that strategic landscape?
We’ve just considered a lot of different implications of having a single project instead of several. Summing up, we think that:
So, given this strategic landscape, what’s the best path forwards?
Our overall take
It’s very unclear whether centralising would be good or bad.
It seems to us that whether or not western AGI development is centralised could have large strategic implications. But it’s very hard to be confident in what the implications will be. Centralising western AGI development could:
It’s also unclear what the relative magnitudes of the risks are in the first place. Should we prefer a world where the US is more likely to beat China but also more likely to slide into dictatorship, or a world where it’s less likely to beat China but also less likely to become a dictatorship? If centralising increases AI takeover risk but only by a small amount, and greatly increases risks from power concentration, what should we do? The trade-offs here are really hard.
We have our own tentative opinions on this stuff (below), but our strongest take here is that it’s very unclear whether centralising would be good or bad. If you are very confident that centralising would be good — you shouldn’t be.
Our current best guess
We think that the overall effect of centralising AGI development is very uncertain, but it still seems useful to put forward concrete best guesses on the object-level, so that others can disagree and we can make progress on figuring out the answer.
Our current best guess is that centralisation is probably net bad because of risks from power concentration.
Why we think this:
But because there’s so much uncertainty, we could easily be wrong. These are the main ways we are tracking that our best guess could be wrong:
Overall conclusion
Overall, we think the best path forward is to increase the chances we get to good versions of either a single or multiple projects, rather than to increase the chances we get a centralised project (which could be good or bad).
The variation between good and bad versions of these projects seems much more significant than the variation from whether or not projects are centralised.
A centralised project could be:
A multiple project scenario could be:
It’s hard to tell whether a centralised project is better or worse than multiple projects as an overall category; it’s easy to tell within categories which scenarios we’d prefer.
We’re excited about work on:
For extremely helpful comments on earlier drafts, thanks to Adam Bales, Catherine Brewer, Owen Cotton-Barratt, Max Dalton, Lukas Finnveden, Ryan Greenblatt, Will MacAskill, Matthew van der Merwe, Toby Ord, Carl Shulman, Lizka Vaintrob, and others.
Appendix: Why we don’t think centralisation is inevitable
A common argument for pushing to centralise western AGI development is that centralisation is basically inevitable, and that conditional on centralisation happening at some point, it’s better to push towards good versions of a single project sooner rather than later.
We agree with the conditional, but don’t think that centralisation is inevitable.
The main arguments we’ve heard for centralisation being inevitable are:
These arguments don’t convince us:
So, while we still think that centralisation is plausible, we don’t think that it’s inevitable.
Centralising: either merging all existing AGI development projects, or shutting down all but the leading project. Either of these would require substantial US government (USG) involvement, and could involve the USG effectively nationalising the project (though there’s a spectrum here, and the lower end seems particularly likely).
Western: we’re mostly equating western with US. This is because we’re assuming that:
We don’t think that these assumptions change our conclusions much. If western AGI projects were spread out beyond the US, then this would raise the benefits of centralising (as it’s harder to regulate racing across international borders), but also increase the harms (as centralising would be a larger concentration of power on the counterfactual) and make centralisation less likely to happen.
An uncertainty which cuts across all of these variables is what version of a centralised project/multiple project scenario we would get.
This is more likely to be true to the extent that:
It seems plausible that 2 and 3 just add noise, rather than systematically pushing towards more or less racing.
Even if you don’t care who wins, you might prefer to increase the US lead to reduce racing. Though as we saw above, it’s not clear that centralising western AGI development actually would increase the US lead.
There are also scenarios where having a single project reduces power concentration even without being well-designed: if failing to centralise would mean that US AGI development was so far ahead of China that the US was able to dominate, but centralising would slow the US down enough that China would also have a lot of power, then having a single project would reduce power concentration by default.
There are a lot of conditionals here, so we’re not currently putting much weight on this possibility. But we’re noting it for completeness, and in case others think there are reasons to put more weight on it.
By ‘secret loyalties’, we mean undetected biases in AI systems towards the interests of their developers or some small cabal of people. For example, AI systems which give advice which subtly tends towards the interests of this cabal, or AI systems which have backdoors.
A factor which might make it easier to install secret loyalties with multiple projects is racing: CEOs might have an easier time justifying moving fast and not installing proper checks and balances, if competition is very fierce.
Though these standards might be hard to audit, which would make compliance harder to achieve.
There are a few ways that making it harder for China to steal the model weights might not reduce racing:
We still think that making the weights harder to steal would probably lead to less racing, as the US would feel more secure - but this is a complicated empirical question.
Bostrom defines DSA as “a level of technological and other advantages sufficient to enable it to achieve complete world domination” in Superintelligence. Tom tends to define having a DSA as controlling >99% of economic output, and being able to do so indefinitely.