Some questions that I have about AI and the overall strategic situation, and why I want to know:
Will automating AI R&D not work for some reason, or will it fail to lead to vastly superhuman superintelligence within 2 years of "~100% automation"?
* Why I want to know:
* Naively, it seems like at (or before) the point when AI models are capable of doing AI research at a human level, we should see a self-reinforcing speedup in AI progress. So AI systems that are substantially superhuman should arrive not long after "human-researcher-level AI" in calendar time, on the default trajectory.
* The fact that an intelligence explosion is possible (and perhaps likely) imposes a major constraint on both technical alignment efforts and policy pushes, because it means that a company might develop dangerously superhuman AI relatively suddenly, and that that AI may have design properties that the human researchers at that company don't understand.
* If I knew that ~as-capable-as-elite-humans AI doesn't lead to an intelligence explosion for some reason, would I do anything different?
* Well, I wouldn't feel like warning the government about the possibility of an intelligence explosion is an urgent priority.
* I would assign much less mass to an acute takeover event in the near term. Without the acceleration dynamics of an intelligence explosion, I don’t think that any one company, or any one AI, would attain a substantial lead over the others.
* In that case, it seems like our main concerns are gradual disempowerment on its own, and gradual disempowerment followed by an abrupt AI coup.
* I haven’t yet seen a good argument for why automating AI R&D wouldn’t lead to a substantial, self-reinforcing speedup in AI progress and a steep climb to superintelligence.
* Notes:
* The strongest reason that occurs to me:
* A conjunction:
* LLMs are much further from full general intelligences than they currently seem. They’ll get increasingly good at e