What’s the short timeline plan?
This is a low-effort post (at least, it was intended as such ...). I mostly want to get other people’s takes and express concern about the lack of detailed and publicly available plans so far. This post reflects my personal opinion and not necessarily that of other members of Apollo Research. I’d like to thank Ryan Greenblatt, Bronson Schoen, Josh Clymer, Buck Shlegeris, Dan Braun, Mikita Balesni, Jérémy Scheurer, and Cody Rushing for comments and discussion.

I think short timelines, e.g. AIs that can replace a top researcher at an AGI lab without losses in capabilities by 2027, are plausible. Some people have posted ideas on what a reasonable plan to reduce AI risk for such timelines might look like (e.g. Sam Bowman’s checklist, or Holden Karnofsky’s list in his 2022 nearcast), but I find them insufficient for the magnitude of the stakes (to be clear, I don’t think these example lists were intended to be extensive plans). If we take AGI seriously, I feel like the AGI companies and the rest of the world should be significantly more prepared, and I think we’re now getting into the territory where models are capable enough that acting without a clear plan is irresponsible.

In this post, I want to ask what such a short-timeline plan could look like. Intuitively, if an AGI lab came to me today and told me, “We really fully believe that we will build AGI by 2027, and we will enact your plan, but we aren’t willing to take more than a 3-month delay,” I want to be able to give the best possible answer. I list some suggestions, but I don’t think they are anywhere near sufficient. I’d love to see more people provide their answers. If a funder is interested, I’d also love to see some sort of “best short-timeline plan prize,” where people can win money for the best plan as judged by an expert panel. In particular, I think the AGI companies should publish their detailed plans (minus secret information) so that governments, academics, and civil society can critique and improve them.
I think it's plausible that good monitors will make consumer applications of AI more capable and effective. In some sense, safety is a relevant blocker for parts of that at the moment.
However, I think it is quite unlikely to push the frontier, and the negative externalities of non-lab developers being faster at coding seem very small. On average, it just seems to increase productivity.
I’d also expect that the monitors we build, which are not directly targeted at making frontier AIs more effective, won’t happen to be more effective at that than the hundreds of employees who push the boundaries of the frontier full-time.
So, on balance, I think the risk is pretty low and the benefits are high. This was one of the considerations we thought through in depth before deciding to build monitors.