MIRI's reading list on corrigibility seems outdated, and I can't find a centralised list. Does anyone have, or know of, one?
As a side note, has MIRI stopped updating their reading list? It seems like that's the case.
EDIT:
Links on corrigibility given in the comment section. I'll try to update this with some summaries as I read them.
https://arbital.com/p/corrigibility/
The CHAI reading list is also fairly out of date (last updated April 2017) but has a few more papers, especially if you go to the top and select [3] or [4] so that it shows the lower-priority ones.
(And in case others haven't seen it, here's the MIRI reading guide for learning agent foundations.)