RolfAndreassen comments on The Backup Plan - Less Wrong

1 Post author: Luke_A_Somers 13 October 2011 07:53PM


Comments (35)


Comment author: RolfAndreassen 15 October 2011 02:47:53AM 3 points

I was simplifying the rather complex concept of extrapolated volition to fit it in one sentence.

An AI which not only notices that its friendliness is not invariant, but decides to modify itself in the direction of invariant Friendliness, is already Friendly. An AI which is able to modify itself to invariant Friendliness without unacceptable compromise of its existing goals is already Friendly. You're assuming away the hard work.

Comment author: Luke_A_Somers 15 October 2011 06:54:41PM 2 points

"already friendly"? You're acting as if its state doesn't depend on its environment.

Are there elements of the environment that could determine whether a given AI's successor is friendly or not? I would say 'yes'.

This is for after one has already done the hard work of making an AI that even has the potential to be friendly, but messed up on that one crucial bit. It is a saving throw, a desperate error handler, not the primary way forward. By 'backup plan' I don't mean 'if Friendly AI is hard, let's try this'; I mean 'could this save us from being restrained and nannied for eternity?'

Comment author: RolfAndreassen 15 October 2011 07:19:22PM 2 points

I shudder to think that any AI's final goals could be so finely balanced that random articles on the Web of a Thousand Lies could push it one way or the other. In my opinion that is a failure mode to be avoided at all costs.