Wei_Dai comments on Q&A with Abram Demski on risks from AI - Less Wrong

Post author: XiXiDu 17 January 2012 09:43AM




Comment author: Wei_Dai 04 March 2012 08:07:08PM 5 points

Interesting! Have you written about this idea in more detail elsewhere? Here are my concerns about it:

  1. The AI has to infer the human's goals. Given its assumed/required cognitive limitations, it may not do a particularly good job of this.
  2. What if the human doesn't fully understand his or her own goals? What does the AI do in that situation?
  3. The AI could, for example, plant a hidden time bomb in its own code, so that its goal system reverts from the post-modification "close to humans" state back to its original goals at some future time, when it is no longer punishable by humans.

Given these problems, and the various requirements the AI must meet for it to be successfully socialized, I don't understand why you assign only a 0.1 probability to the AI failing to be socialized.