I am after papers you have stumbled across that may be relevant to understanding the core of the existential AI alignment problem, despite having no explicit links to AI alignment.

Here are papers I have found that fit the above criteria to give you a better idea of what I'm after:

While I've used the term "agent foundations", I expect the majority of useful papers will not use terms like agency, optimization, etc.



rorygreig

80

An interesting paper is "The information theory of individuality" by Krakauer et al.
 

martin biehl

40

Interpreting Systems as Solving POMDPs: A Step Towards a Formal Understanding of Agency

https://link.springer.com/chapter/10.1007/978-3-031-28719-0_2

Roman Leventov

20

"General cognitive science": Boyd et al. (2022), Goyal & Bengio (2022), Levin (2022), Fields et al. (2022), Friston et al. (2022), Ma et al. (2022), LeCun (2022) (links from here).

Also, general theories of cognitive development, e.g., Kuchling et al. (2022) and Fields et al. (2022).