paulfchristiano comments on What can you do with an Unfriendly AI? - Less Wrong

16 Post author: paulfchristiano 20 December 2010 08:28PM

Comments (127)

Comment author: shokwave 21 December 2010 12:58:10AM 0 points

Your comment made me re-read the post more carefully. On first reading I had assumed that a truthful answer (whether yes or no) was rewarded and a lying answer was punished. But if a 'yes' is rewarded and a 'no' is punished, and our AI genies are so afraid of termination that they would never give a 'no' where they could give a 'yes', why wouldn't they all just give us 'yes'?

Comment author: paulfchristiano 21 December 2010 01:33:17AM 1 point

Recall the filter between the AI and the world. The AI doesn't directly say "yes" or "no." It gives a proof to the filter, which then says either "yes, the AI found a proof" or "no, the AI didn't." So the AIs do all say 'yes' if they can, but a 'yes' only gets through when the proof they hand over is one the filter accepts. I will modify the post to make this clearer.
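
To make the filter idea concrete, here is a minimal sketch, not the construction from the post. It assumes a toy setting in which the "statement" is a small CNF formula and a "proof" is a satisfying assignment; the names `check_proof`, `filter_output`, `honest_ai`, and `bluffing_ai` are all hypothetical. The point it illustrates is only that the AI's raw output never reaches the world: the filter checks it and emits a single yes/no.

```python
from typing import Callable, Dict, List

# Toy stand-in for "a statement with a checkable proof" (an assumption of
# this sketch): a CNF formula given as a list of clauses, each clause a list
# of nonzero ints (positive = variable, negative = negated variable).
Formula = List[List[int]]


def check_proof(formula: Formula, proof: object) -> bool:
    """Trusted checker: True only if `proof` is a satisfying assignment."""
    if not isinstance(proof, dict):
        return False
    for clause in formula:
        # A literal is satisfied when its variable is assigned exactly the
        # boolean the literal asks for; missing variables satisfy nothing.
        if not any(proof.get(abs(lit)) is (lit > 0) for lit in clause):
            return False
    return True


def filter_output(formula: Formula, untrusted_ai: Callable[[Formula], object]) -> str:
    """Run the untrusted AI, discard whatever it returns, emit one bit."""
    try:
        proof = untrusted_ai(formula)
    except Exception:
        proof = None  # a crash is treated the same as "no proof"
    return "yes" if check_proof(formula, proof) else "no"


def honest_ai(formula: Formula) -> Dict[int, bool]:
    # Hypothetical AI that actually found a proof for the formula below.
    return {1: True, 2: True}


def bluffing_ai(formula: Formula) -> str:
    # Hypothetical AI that just claims success without a proof.
    return "yes, trust me"


formula = [[1, -2], [2]]  # (x1 or not x2) and (x2)
print(filter_output(formula, honest_ai))
print(filter_output(formula, bluffing_ai))
```

In this sketch an AI that merely asserts "yes" gains nothing, because the filter only reports "yes" when the checker accepts what it was handed.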