endoself comments on The Friendly AI Game - Less Wrong

38 Post author: bentarm 15 March 2011 04:45PM


Comment author: endoself 16 March 2011 11:30:40PM 0 points

I had misremembered something: I thought there was a safeguard ensuring that it never tries to learn about its safeguards, rather than just a prior making this unlikely.

Perfect safeguards are possible; in an extreme case, we could have a FAI monitoring every aspect of our first AI's behaviour. Can you give me a specific example of a safeguard so I can find a hole in it? :)