Akash

Comments

Akash

it's less clear that a non-centralized situation inevitably leads to a decisive strategic advantage for the leading project

Can you say more about what has contributed to this update?

Akash

Can you say more about scenarios where you envision a later project happening that has different motivations?

I think in the current zeitgeist, such a project would almost definitely be primarily motivated by beating China. It doesn't seem clear to me that it's good to wait for a new zeitgeist. Reasons:

  • A company might develop AGI (or an AI system that is good enough at AI R&D to get to AGI) before a major zeitgeist change.
  • The longer we wait, the more capable the "most capable model that wasn't secured" becomes. So we risk ending up in a scenario where people want to pause, but because China and the US both have GPT-(N-1), both sides feel compelled to race forward (whereas this wouldn't have happened if security efforts had kicked off sooner).

Akash

If you could only have "partial visibility", what are some of the things you would most want the government to be able to know?

Akash

Another frame: If alignment turns out to be easy, then the default trajectory seems fine (at least from an alignment POV; you might still be worried about, e.g., concentration of power).

If alignment turns out to be hard, then the policy decisions we make to affect the default trajectory matter a lot more.

This means that even if misalignment risks are relatively low, a lot of value still comes from thinking about worlds where alignment is hard (or perhaps "somewhat hard but not intractably hard").

Akash

What do you think are the most important factors for determining whether it results in them behaving responsibly later?

For instance, if you were in charge of designing the AI Manhattan Project, are there certain things you would do to try to increase the probability that it leads to the USG "behaving more responsibly later"?

Akash

Good points. Suppose you were on a USG taskforce that had concluded it wanted to go with the "subsidy model" but was willing to ask for certain concessions from industry.

Are there any concessions/arrangements that you would advocate for? Are there any ways to do the "subsidy model" well, or do you think the model is destined to fail even if there were a lot of flexibility regarding how to implement it?

Akash

My own impression is that this would be an improvement over the status quo. Main reasons:

  • A lot of my P(doom) comes from race dynamics.
  • Right now, if a leading lab ends up realizing that misalignment risks are super concerning, they can't do much to end the race. Their main strategy would be to go to the USG.
  • If the USG runs the Manhattan Project (or there's some sort of soft nationalization in which the government ends up having a much stronger role), it's much easier for the USG to see that misalignment risks are concerning & to do something about it.
  • A national project would be more able to slow down and pursue various kinds of international agreements (the national project has more access to POTUS, DoD, NSC, Congress, etc.)
  • I expect the USG to be stricter on various security standards. It seems more likely to me that the USG would, for example, demand strong security requirements to prevent model weights or algorithmic insights from leaking to China. One of my major concerns is that people will want to pause at GPT-X but won't feel able to because China has stolen access to GPT-(X-1) (or maybe even a slightly weaker version of GPT-X).
  • In general, I feel like USG natsec folks are less "move fast and break things" than folks in SF. While I do think some of the AGI companies have tried to be less "move fast and break things" than the average company, I think corporate race dynamics & the general cultural forces have been the dominant factors and undermined a lot of attempts at meaningful corporate governance.

(Caveat that even though I see this as a likely improvement over the status quo, this doesn't mean I think it's the best thing to be advocating for.)

(Second caveat that I haven't thought about this particular question very much and I could definitely be wrong & see a lot of reasonable counterarguments.)

Akash

@davekasten @Zvi @habryka @Rob Bensinger @ryan_greenblatt @Buck @tlevin @Richard_Ngo @Daniel Kokotajlo I suspect you might have interesting thoughts on this. (Feel free to ignore though.)

Akash

Suppose the US government pursued a "Manhattan Project for AGI". At its onset, it's primarily fueled by a desire to beat China to AGI. However, there's some chance that its motivation shifts over time (e.g., if the government ends up thinking that misalignment risks are a big deal, its approach to AGI might change).

Do you think this would be (a) better than the current situation, (b) worse than the current situation, or (c) it depends on XYZ factors?

Akash

We're not going to be bottlenecked by politicians not caring about AI safety. As AI gets crazier and crazier everyone would want to do AI safety, and the question is guiding people to the right AI safety policies

I think we're seeing more interest in AI, but interest in "AI in general" and "AI through the lens of great power competition with China" has vastly outpaced interest in "AI safety". (Especially if we're using a narrow definition of AI safety; note that people in DC often use the term "AI safety" to refer to a much broader set of concerns than AGI safety/misalignment concerns.)

I do think there's some truth to the quote (we are seeing more interest in AI and some safety topics), but I think there's still a lot to do to increase the salience of AI safety (and in particular AGI alignment) concerns.
