Searching for Impossibility Results or No-Go Theorems for provable safety.
I am looking for results showing that various approaches to provable safety are impossible, or that producing such proofs falls into a particular complexity class. I have Yampolskiy's paper "Impossibility Results in AI: A Survey," but I am looking for more targeted results that would help guide research into provable safety....
Sep 27, 2024