
RobbBB comments on A few misconceptions surrounding Roko's basilisk - Less Wrong Discussion

Post author: RobbBB 05 October 2015 09:23PM


Comments (125)


Comment author: RobbBB 14 October 2015 10:05:44PM * 0 points

One way of putting this is that the AI, once it exists, can convincingly trick people into thinking it will cooperate in Prisoner's Dilemmas; but since we know it has this property and we know it prefers (D,C) over (C,C), we know it will defect. This is consistent because we're assuming the actual AI is powerful enough to trick people once it exists; this doesn't require the assumption that my low-fidelity mental model of the AI is powerful enough to trick me in the real world.
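The payoff reasoning above can be sketched in a few lines. This is an illustrative model only (the payoff numbers and function names are assumptions, not from the comment): with standard Prisoner's Dilemma payoffs, an agent that can reliably convince its counterpart to cooperate faces a fixed opponent move, and maximizing over its own moves yields defection, since (D,C) beats (C,C).

```python
# Hypothetical sketch: standard Prisoner's Dilemma payoffs (numbers assumed),
# showing why an agent that can always trick its counterpart into cooperating
# still prefers to defect.

# Payoff matrix from the row player's perspective: (row_move, col_move) -> payoff.
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # sucker's payoff
    ("D", "C"): 5,  # temptation: defect while the other cooperates
    ("D", "D"): 1,  # mutual defection
}

def best_move_against_cooperator():
    """If the AI can convince the other party to cooperate no matter what,
    the opponent's move is fixed at 'C'; the AI just maximizes its own row."""
    return max(["C", "D"], key=lambda move: PAYOFF[(move, "C")])

print(best_move_against_cooperator())  # -> D, because PAYOFF[(D,C)] > PAYOFF[(C,C)]
```

The point carries over regardless of the specific numbers, as long as the temptation payoff exceeds the reward for mutual cooperation, which is part of the definition of a Prisoner's Dilemma.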

For acausal blackmail to work, the blackmailer needs a mechanism for convincing the blackmailee that it will follow through on its threat. 'I'm a TDT agent' isn't a sufficient mechanism, because a TDT agent's favorite option is still to trick other agents into cooperating in Prisoner's Dilemmas while they defect.

Comment author: VoiceOfRa 15 October 2015 08:30:38PM 1 point

> One way of putting this is that the AI, once it exists, can convincingly trick people into thinking it will cooperate in Prisoner's Dilemmas

Except it needs to convince the people who are around before it exists.