These aren't necessarily milestones so much as capabilities that can come on a sliding scale, but:
- Tools to accelerate alignment research (which are also tools to accelerate AGI research)
- Value learning schemes at various stages of both conceptual and technological development
That said, I'm not sure how much governance plans should adapt based on milestones. Maybe we should expect governance to be slow to respond, and therefore require plans that are good when applied broadly and without much awareness of context.
How is that relevant? That's about whether AI risk will be mainstream. I'm thinking about governance interventions by this community, which don't require the rest of the world to appreciate AI risk.
Optimal governance interventions depend on progress in technical AI safety. For example, two rough technical safety milestones are having metrics to determine how scary a model's capabilities are, and having a tool to determine whether a model is trying to deceive you. Our governance plans should adapt based on whether these milestones have been achieved (or when it seems they will be achieved), and, for less binary milestones, based on partial progress.
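To make the dependence concrete, here's a minimal toy sketch (in Python) of what "governance plans adapt based on milestones" could look like as a decision rule. Everything here is hypothetical and invented for illustration: the eval names, the "scariness" aggregation, the threshold, and the policy recommendations are placeholders, not real evals or actual policy positions.

```python
# Hypothetical sketch: a toy "scariness" metric aggregating dangerous-capability
# eval scores, plus a governance recommendation keyed to whether the
# deception-detection milestone has been reached. All names and thresholds
# are invented for illustration only.

from dataclasses import dataclass


@dataclass
class EvalResult:
    name: str      # e.g. "autonomous-replication", "cyber-offense" (hypothetical evals)
    score: float   # 0.0 (no capability demonstrated) to 1.0 (fully demonstrated)


def scariness(results: list[EvalResult]) -> float:
    """Toy aggregate: take the worst (highest) eval score."""
    return max((r.score for r in results), default=0.0)


def recommended_policy(results: list[EvalResult],
                       deception_tool_available: bool,
                       threshold: float = 0.5) -> str:
    """Illustrative decision rule: the right governance ask changes
    depending on which technical milestones have been hit."""
    if scariness(results) < threshold:
        return "monitor: capabilities below threshold, keep running evals"
    if deception_tool_available:
        return "require pre-deployment deception audits for frontier models"
    return "push for deployment limits until better tools exist"


if __name__ == "__main__":
    results = [EvalResult("autonomous-replication", 0.3),
               EvalResult("cyber-offense", 0.7)]
    print(recommended_policy(results, deception_tool_available=False))
```

The point isn't the specific rule; it's that the branch taken (and hence the intervention worth advocating for) changes as each technical milestone is (partially) reached.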
What are more possible milestones in technical AI safety (that might be relevant to governance interventions)?