This is a linkpost for https://youtu.be/D5rEMNyfIWw
My first post should be validated soon; it is a proof that the strong form does not hold: in some games, some terminal alignments perform worse than their equivalent non-terminal alignments.
A hypothesis is that most goals, once they become "terminal" ("in themselves", impervious to change), prevent evolution and mutualistic relationships with other agents.
On my side podcast, "The Filan Cabinet", I invited Ronny Fernandez and Divia Eden to talk about the strong orthogonality thesis, and whether it's true. Seems like people here might also be interested. Podcast description below, and you can listen here.
In this episode, Divia Eden and Ronny Fernandez talk about the (strong) orthogonality thesis - that arbitrarily smart intelligences can be paired with arbitrary goals, without additional complication beyond that of specifying the goal - with light prompting from me. Topics they touch on include: