pjeby comments on Introducing Corrigibility (an FAI research subfield) - LessWrong
Wow. This is the simplest/shortest explanation I've seen yet for how AI can become unfriendly, without reference to Terminator-style outcomes.
Of course, per the illusion of transparency, it may be that it only seems so clear to me because of my long-term exposure to the idea of FAI... Still, it looks like an important step in subdividing the problem, and one that I expect would be more intuitively obvious to outsiders: "we're studying ways to make sure the sorcerer's apprentice can turn the magic mop off." ;-)