Richard_Loosemore comments on Evaluating the feasibility of SI's plan - Less Wrong

25 Post author: JoshuaFox 10 January 2013 08:17AM

Comments (186)

Comment author: Richard_Loosemore 10 January 2013 09:34:21PM 6 points

So ... SI is addressing the question of whether the "friendliness" concept is actually meaningful enough to be formalizable? SI accepts that "friendliness" might not be formalizable at all, and has discussed the possibility that mathematical proof is not even applicable in this case?

And SI has discussed the possibility that the current paradigm for an AI motivation mechanism is so poorly articulated, and so unproven (there being no such mechanism that has been demonstrated to be even approaching stability), that it may be meaningless to discuss how such motivation mechanisms can be proven to be "friendly"?

I do not believe I have seen any evidence of those debates/discussions coming from SI... do you have pointers?

Comment author: Kaj_Sotala 11 January 2013 07:16:14AM 3 points

Well, Luke has asked me to work on a document called "Mitigating Risks from AGI: Key Strategic Questions" which lists a number of questions we'd like to have answers to and attempts to list some preliminary pointers and considerations that would help other researchers actually answer those questions. "Can CEV be formalized?" and "How feasible is it to create Friendly AI along an Eliezer path?" are two of the questions in that document.

I haven't heard explicit discussions about all of your points, but I would expect them all to have been brought up in private discussions (which I have for the most part missed, since my physical location is rather remote from all the other SI folks). Eliezer has said that a Friendly AI in the style that he is thinking of might just be impossible. That said, I do agree with the current general consensus among other SI folk, which is that we should act on the assumption that such a mathematical proof is possible, because humanity's chances of survival look pretty bad if it isn't.

Comment author: hairyfigment 10 January 2013 11:30:05PM 1 point

They're currently working on a formal system for talking about stability, a reflective decision theory. If you wanted to prove that no such system can exist, what else would you be doing?