I agree that centralising to make AI safe would make a difference. It seems a lot less likely to me than centralising to beat China (there's already loads of "beat China" rhetoric, and it doesn't seem very likely to go away).

Should there be just one western AGI project?

rosehadshar4mo10

"it is potentially a lot easier to stop a single project than to stop many projects simultaneously" -> agree.

rosehadshar4mo20

I think I still believe the thing we initially wrote:

- Agree with you that there might be strong incentives to sell stuff at monopoly prices (and I'm worried about this). But if there's a big gap, you can do this without selling your most advanced models. (You sell access to weaker models for a big markup, and keep the most advanced ones to yourselves to help you further entrench your monopoly/your edge over any and all other actors.)
- I'm sceptical of worlds where 5 similarly advanced AGI projects don't bother to sell:
  - Presumably any one of those could defect at any time and sell at a decent price. Why doesn't this happen?
  - Eventually they need to start making revenue, right? They can't just exist on investment forever.

(I am also not an economist though and interested in pushback.)

rosehadshar4mo30

Thanks, I expect you're right that there's some confusion in my thinking here.

Haven't got to the bottom of it yet, but on more incentive to steal the weights:
- partly I'm reasoning in the way that you guess, more resources -> more capabilities -> more incentives
- I'm also thinking "stronger signal that the US is all in and thinks this is really important -> raises p(China should also be all in) from a Chinese perspective -> more likely China invests hard in stealing the weights"
- these aren't independent lines of reasoning, as the stronger signal is sent by spending more resources
- but I tentatively think that it's not the case that at a fixed capability level the incentives to steal the weights are the same. I think they'd be higher with a centralised project, as conditional on a centralised project there's more reason for China to believe a) AGI is the one thing that matters, b) the US is out to dominate

rosehadshar5mo30

Thanks, I agree this is an important argument.

Two counterpoints:

- The more projects you have, the more attempts at alignment you have. It's not obvious to me that more draws are net bad, at least at the margin of going from 1 to 2 or 3.
- I'm more worried about the harms from a misaligned singleton than from one (or multiple) misaligned systems in a wider ecosystem which includes powerful aligned systems.

rosehadshar5mo10

Thanks! Fwiw I agree with Zvi on "At a minimum, let’s not fire off a starting gun to a race that we might well not win, even if all of humanity wasn’t very likely to lose it, over a ‘missile gap’ style lie that we are somehow not currently in the lead."

rosehadshar5mo21

Thanks for these questions!

Earlier attacks: My thinking here is that centralisation might a) cause China to get serious about stealing the weights sooner, and b) therefore allow less time for building up great infosec. So it would be overall bad for infosec. (It's true the models would be weaker, so stealing the weights earlier might not matter so much. But I don't feel very confident that strong infosec would be in place before the models are dangerous, with or without centralisation.)

More attack surface: I am trying to compare multiple projects with a single project. The attack surface of a single project might be bigger if the single project itself is very large. As a toy example, imagine 3 labs with 100 employees each. But then USG centralises everything to beat China and pours loads more resources into AGI development. The centralised project has 1000 staff; the counterfactual was 300 staff spread across 3 projects.

China stealing weights: sorry, I agree that it's harder for everyone including China, and that all else equal this disincentivises stealing the weights. But a) China is more competent than other actors, so for a fixed increase in difficulty China will be less disincentivised than other actors, b) China has bigger incentives to steal the weights to begin with, and c) for China in particular there might be incentives that push the other way (centralising could increase race dynamics between the US and China, and potentially reduce China's chances of developing AGI first without stealing the weights), and those might counteract the disincentive. Does that make more sense?

rosehadshar5mo30

My main take here is that it seems really unlikely that the US and China would agree to work together on this.

rosehadshar5mo10

That seems overconfident to me, but I hope you're right!

To be clear:
- I agree that it's obviously a huge natsec opportunity and risk.
- I agree the USG will be involved and that things other than nationalization are more likely.
- I am not confident that there will be consensus across the US on things like 'AGI could lead to an intelligence explosion', 'an intelligence explosion could lead to a single actor taking over the world', 'a single actor taking over the world would be bad'.

rosehadshar5mo41

Thanks!

I think I don't follow everything you're saying in this comment; sorry. A few things:
- We do have lower p(AI takeover) than lots of folks - and higher than lots of other folks. But I think even if your p(AI takeover) is much higher, it's unclear that centralisation is good, for some of the reasons we give in the post:
  - race dynamics with China might get worse and increase AI takeover risk
  - racing between western projects might not be a big deal in comparison, because of races to the top and being more easily able to regulate
- I'm not trying to assume that China couldn't catch up to the US. I think it's plausible that China could do this in either world via stealing the model weights, or if timelines are long. Maybe it could also catch up without those things if it put its whole industrial might behind the problem (which it might be motivated to do in the wake of US centralisation).
- I think whether a human dictatorship is better or worse than an AI dictatorship isn't obvious (and that some dictatorships could be worse than extinction)
