lukeprog comments on Introducing Corrigibility (an FAI research subfield) - LessWrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (28)
Not sure if this is what you're thinking of, but there's a research area called "adjustable autonomy" and a few other names, which superficially sounds similar but isn't actually getting at the problem described here, which comes about due to convergent instrumental values in sufficiently advanced agents.