Eliezer_Yudkowsky comments on Introducing Corrigibility (an FAI research subfield) - LessWrong

Post author: So8res 20 October 2014 09:09PM


Comment author: Eliezer_Yudkowsky 22 October 2014 09:35:12PM 5 points

No, at least not anything like the corrigibility we're currently considering. Everything we've written about so far relies on having the ability to specify the utility function in detail, the utility function being reflectively stable, the utility function being able to contain references to external objects like 'the shutdown button' with the corresponding problems of adapting to new ontologies as the surrounding system shifts representations (see the notion of an 'ontological crisis'), etcetera. It's a precaution for a Friendly AI in the process of being built; you couldn't tack it onto super-Eurisko.
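To make the "utility function containing references to external objects" point concrete, here is a toy sketch (my illustration, not from the comment or the post) loosely following the utility-indifference idea from the corrigibility literature. All names (`combined_utility`, the outcome dictionary keys) are hypothetical; real proposals also need the reflective-stability and incentive-balancing machinery this sketch omits.

```python
# Toy illustration (hypothetical) of a utility function that explicitly
# references an external object -- here, a shutdown button. The agent
# scores outcomes with its normal utility when the button is unpressed
# and with a shutdown utility when it is pressed.

def normal_utility(outcome):
    # Stand-in for the detailed, hand-specified utility function the
    # comment says current corrigibility proposals rely on.
    return outcome.get("paperclips", 0)

def shutdown_utility(outcome):
    # Rewards having actually shut down, and nothing else.
    return 10 if outcome.get("shut_down", False) else 0

def combined_utility(outcome):
    # The 'reference to the shutdown button' lives here: if the agent's
    # ontology shifts so that "button_pressed" no longer picks out the
    # physical button, this definition silently breaks -- the
    # ontological-crisis problem the comment mentions.
    if outcome.get("button_pressed", False):
        return shutdown_utility(outcome)
    return normal_utility(outcome)

print(combined_utility({"paperclips": 3}))                            # normal regime
print(combined_utility({"button_pressed": True, "shut_down": True}))  # shutdown regime
```

The fragility is the point: every piece of this (the button predicate, the two sub-utilities, their combination) must be specified in detail and must survive the system re-representing its world, which is why the comment says it cannot simply be tacked onto an already-running system like super-Eurisko.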