I intend to plunge into the decision theory of self-modifying decision systems and never look back. (And finish the decision theory and implement it and run the AI, at which point, if all goes well, we Win.) (This Week’s Finds (Week 311))
...
After all, if you had the complete decision process, you could run it as an AI, and I'd be coding it up right now. (Eliezer_Yudkowsky 12 October 2009 06:19:28PM)
Can this be interpreted to mean that Eliezer Yudkowsky believes that he himself, or the SIAI, will not only define friendliness but actually implement it and run a fooming AI to take over the universe? If they really believe that, and if it is likely that they can succeed, then even given a very low probability of them being dishonest, I think one should seriously consider how it can be guaranteed that the AI they run is actually friendly. Let me ask those of you who believe that the SIAI can succeed: are you not worried at all about unfriendly humans? Do you just trust their word? That's really weird. Unless I misunderstand what he is saying in the two quotes above, or he is joking, he's actually saying that he'll run a fooming AI.
I don't know whether there's any way to absolutely prove that SIAI will get it right (though I hope that if they come up with a proof of Friendliness, they make it public), but I trust them more than their most likely competitors, which I think would be governments.
First part: "This Week's Finds (Week 311)".
Second part: "This Week's Finds (Week 312)".