singularitard comments on Introducing Corrigibility (an FAI research subfield) - LessWrong
This sounds familiar. Are you aware of other similar concepts previously communicated elsewhere? I feel certain I've read something along these lines before. By all means, claim it's original though.
Not sure if this is what you're thinking of, but there's a research area called "adjustable autonomy" (it goes by a few other names too). It sounds superficially similar, but it isn't actually getting at the problem described here, which arises from convergent instrumental values in sufficiently advanced agents.