MichaelHoward comments on Cryptographic Boxes for Unfriendly AI - Less Wrong

Post author: paulfchristiano 18 December 2010 08:28AM


Comment author: MichaelHoward 18 December 2010 01:26:48PM 2 points

Way in over my head here, but if the project is done in a series of steps, with a series of boxed AIs helping to work on the next step, wouldn't it be better to get something that is at the very least(*)…

  • always trying to be friendly.
  • always trying to be completely honest and non-evasive (to the programmers).
  • pretty damn stable.
  • pretty damn good at avoiding killer-Gandhi pills, lost purposes, etc.
  • will alert & stop itself if it thinks it's drifting on any of the above.

…at a much earlier step than any of…

  • being smart enough to outwit the programmers in dangerous ways.
  • being capable of & allowed to make potentially foomy self-improvements.
  • doing any work on the next generation itself, as opposed to offering advice.
  • being able to talk to anyone who hasn't been vetted.

(*) - if not perfectly/certifiably, then to whatever extent is feasible before doing anything on the second list.

Comment author: paulfchristiano 18 December 2010 07:09:15PM 1 point

I agree completely. I don't think that renders a strong quarantine system useless, though, because we don't really get to decide whether we find FAI or uFAI first; whoever discovers the first AI does.