You Are Underestimating the Likelihood That Convergent Instrumental Subgoals Lead to Aligned AGI
This post is an entry for the Future Fund's "AI Worldview" prize. Namely, I claim that the estimates given for the following probability are too high:

> P(misalignment x-risk|AGI): Conditional on AGI being developed by 2070, humanity will go extinct or drastically curtail its future potential due to loss of...
Fine, replace the agents with rocks. The problem still holds.
There's no closed-form solution to the three-body problem; you can only numerically approximate the system's future, with accuracy that decreases as time goes on. There are far more than three bodies in the universe relevant to the long-term survival of an AGI, and an AGI could die in any number of ways, because it is made of many complex pieces, any of which can break or fail.
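To make the point about numerical approximation concrete, here is a minimal sketch (the initial conditions, masses, and units are made up purely for illustration): two simulations of a planar three-body system that differ by one part in a billion in a single coordinate drift apart as the integration runs, so any forecast of the system's far future degrades no matter how carefully you compute.

```python
import numpy as np

G = 1.0  # gravitational constant in arbitrary units (assumption for illustration)

def accelerations(pos, mass):
    """Pairwise Newtonian gravitational accelerations for n bodies in 2D."""
    acc = np.zeros_like(pos)
    for i in range(len(mass)):
        for j in range(len(mass)):
            if i != j:
                r = pos[j] - pos[i]
                acc[i] += G * mass[j] * r / np.linalg.norm(r) ** 3
    return acc

def simulate(pos, vel, mass, dt, steps):
    """Leapfrog (velocity Verlet) integration; returns the position trajectory."""
    pos, vel = pos.copy(), vel.copy()
    traj = [pos.copy()]
    acc = accelerations(pos, mass)
    for _ in range(steps):
        vel += 0.5 * dt * acc
        pos += dt * vel
        acc = accelerations(pos, mass)
        vel += 0.5 * dt * acc
        traj.append(pos.copy())
    return np.array(traj)

# Three equal-mass bodies in a rough triangle (made-up initial conditions).
mass = np.array([1.0, 1.0, 1.0])
pos0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.8]])
vel0 = np.array([[0.0, -0.3], [0.0, 0.3], [0.3, 0.0]])

dt, steps = 0.001, 20000
base = simulate(pos0, vel0, mass, dt, steps)

# Perturb one coordinate by one part in a billion and re-run.
pos0_p = pos0.copy()
pos0_p[0, 0] += 1e-9
perturbed = simulate(pos0_p, vel0, mass, dt, steps)

# The gap between the two trajectories grows as the horizon lengthens.
for t in range(0, steps + 1, steps // 5):
    gap = np.linalg.norm(base[t] - perturbed[t])
    print(f"t = {t * dt:5.1f}  divergence = {gap:.2e}")
```

The divergence printed at later times is many orders of magnitude larger than the initial 1e-9 perturbation, which is the sense in which long-horizon prediction gets harder, not easier, with time.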