jimrandomh comments on Introducing Corrigibility (an FAI research subfield) - LessWrong

29 Post author: So8res 20 October 2014 09:09PM

Comment author: jimrandomh 20 October 2014 11:39:38PM 6 points

Yes. It's not a sure-fire safeguard and it doesn't work against all UFAIs, but corrigibility, if done correctly, can be thought of as granting a saving throw. Note, though, that while this paper is a huge step forward, "how to do corrigibility correctly" is nowhere near a solved problem yet.

(Corrigibility was a topic at the second MIRIx Boston workshop, and we have results building on this paper that we are working on writing up.)