Some thoughts on open-source AI (copied over from a recent Twitter thread):

1. We should have a strong prior favoring open source. It’s been a huge success driving tech progress over many decades. We forget how counterintuitive it was originally, and shouldn’t take it for granted.

2. Open source has also been very valuable for alignment. It’s key to progress on interpretability, as outlined here.

3. I am concerned, however, that offense will heavily outpace defense in the long term. As AI accelerates science, many new WMDs will emerge. Even if defense of infrastructure keeps up with offense, human bodies are a roughly fixed and very vulnerable attack surface.

4. A central concern about open source AI: it’ll allow terrorists to build bioweapons. This shouldn’t be dismissed, but IMO it’s easy to be disproportionately scared of terrorism. More central risks are eg “North Korea becomes capable of killing billions”, which they aren’t now.

5. Another worry: misaligned open-source models will go rogue and autonomously spread across the internet. Rogue AIs are a real concern, but they wouldn’t gain much power via this strategy. We should worry more about power grabs from AIs deployed inside influential institutions.

6. In my ideal world, open source would lag a year or two behind the frontier, so that the world has a chance to evaluate and prepare for big risks before a free-for-all starts. But that’s the status quo! So I expect the main action will continue to be with closed-source models.

7. If open-source seems like it’ll catch up to or surpass closed source models, then I’d favor mandating a “responsible disclosure” period (analogous to cybersecurity) that lets people incorporate the model into their defenses (maybe via API?) before the weights are released.

8. I got this idea from Sam Marks. Though unlike him I think the process should have a fixed length, since it’d be easy for it to get bogged down in red tape and special interests otherwise.

9. Almost everyone agrees that we should be very careful about models which can design new WMDs. The current fights are mostly about how many procedural constraints we should lock in now, reflecting a breakdown of trust between AI safety people and accelerationists.

10. Ultimately the future of open source will depend on how the US NatSec apparatus orients to superhuman AIs. This requires nuanced thinking: no worldview as simple as “release everything”, “shut down everything”, or “defeat China at all costs” will survive contact with reality.

11. Lastly, AIs will soon be crucial extensions of human agency, and eventually moral patients in their own right. We should aim to identify principles for a shared digital-biological world as far-sighted and wise as those in the US constitution. Here's a start.

12. One more meta-level point: I’ve talked to many people on all sides of this issue, and have generally found them to be very thoughtful and genuine (with the exception of a few very online outliers on both sides). There’s more common ground here than most people think.


To clarify, were you saying here that we should have a strong prior that open source leads to quicker progress?

It reads like you're saying that we should have a prior that it's "good", which seems silly given that the reason it was good before was that it led to quicker progress, and the reason it may start to be bad very soon is also that it will lead to quicker progress.

There was never any evidence one way or the other about whether open-source code is better at avoiding being misapplied by state actors, terrorists, or generally irresponsible experimental projects, but it's reasonable to guess for now that it's not very good at avoiding that.

owencb:

It's not obvious that open source leads to faster progress. Having high-quality open-source products reduces the incentives for private investment. I'm not sure in which regimes that plays out as overall accelerationist, but I sort of guess that it will be decelerationist during an intense AI race (where the investments needed to push the frontier out are enormous and significantly profit-motivated).

Okay, yeah, I meant quicker progress in expectation. I don't believe that people today are capable of the level of coordination under which privatizing science could lead to faster scientific progress.

But if we're talking about mixed regimes, that's a different question. Are we? Some do complain of a tilt towards a regime where frontier models are held only by the private sphere, but that seems unlikely to happen.

> 1. We should have a strong prior favoring open source. It’s been a huge success driving tech progress over many decades. We forget how counterintuitive it was originally, and shouldn’t take it for granted.

We should generally have a strong prior favoring technology in general, but, once we've said "actually this time it's different (for [Reasons])", is there a particular reason to preserve the prior favoring open source?

> We should generally have a strong prior favoring technology in general

Should we? I think it's much more obvious that the increase in human welfare so far has mostly been caused by technology, than that most technologies have net helped humans (much less organisms generally).

I'm quite grateful for agriculture now, but unsure I would have been during the Bronze Age; grateful for nuclear weapons, but unsure in how many nearby worlds I'd feel similarly; net bummed about machine guns, etc.

Well, the prior should weaken how much we believe any given set of reasons for why it's different this time.