lukeprog comments on Towards a New Decision Theory - Less Wrong

50 Post author: Wei_Dai 13 August 2009 05:31AM

Comments (142)

Comment author: Nick_Tarleton 16 August 2009 09:15:40AM * 2 points

So what you're saying is, given two players who can successfully build AIs with their preferences (and that's common knowledge), they will likely (surely?) play cooperate in one-shot PD against each other. Do I understand you correctly?

Yes.

Suppose what you say is correct, that the Winning Thing is to play cooperate in one-shot PD. Then what happens when some player happens to get a brain lesion that causes him to unconsciously play defect without affecting his AI building abilities? He would take everyone else's lunch money. Or if he builds his AI to play defect while everyone else builds their AIs to play cooperate, his AI then takes over the world. I hope that's a sufficient reductio ad absurdum.

Good idea. Hmm. It sounds like this is the same question as: what if, instead of "TDT with defection patch" and "pure TDT", the available options are "TDT with defection patch" and "TDT with tiny chance of defection patch"? Alternatively: what if the abstract computations that are the players have a tiny chance of being embodied in such a way that their embodiments always defect on one-shot PD, whatever the abstract computation decides?

It seems to me that Lesion Man just got lucky. This doesn't mean people can win by giving themselves lesions, because that would be deliberately defecting, i.e. being an abstract computation that defects, which is bad. Whether everyone else should defect (or program their AIs to defect) because of this possibility depends on the situation; I would think they usually shouldn't. (If the payoff matrix is a typical PD one, there are many players, and they care about absolute rather than relative scores, then defecting isn't worth it even if there is guaranteed to be one Lesion Man.)
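To make the parenthetical concrete, here is a minimal sketch under assumptions that are mine, not the thread's: the payoff values (5, 3, 1, 0), the round-robin setup in which every player meets every other player once, and the function names are all hypothetical. It also assumes that the normal players all run the same abstract computation and therefore all make the same move, and that exactly one Lesion Man always defects.

```python
# Hypothetical payoffs for a standard Prisoner's Dilemma (T > R > P > S).
T, R, P, S = 5, 3, 1, 0  # temptation, reward, punishment, sucker's payoff


def payoff(me: str, other: str) -> int:
    """Payoff to `me` in a single one-shot PD; moves are "C" or "D"."""
    table = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    return table[(me, other)]


def normal_player_score(n_players: int, shared_move: str) -> int:
    """Absolute score of one normal player over a round-robin of one-shot PDs.

    Assumes exactly one Lesion Man who always defects, and that all other
    players make the same move because they are the same abstract computation.
    """
    n_normal_opponents = n_players - 2  # everyone except me and Lesion Man
    return (n_normal_opponents * payoff(shared_move, shared_move)
            + payoff(shared_move, "D"))


def lesion_man_score(n_players: int, shared_move: str) -> int:
    """Absolute score of Lesion Man, who defects against every normal player."""
    return (n_players - 1) * payoff("D", shared_move)


if __name__ == "__main__":
    n = 10
    for move in ("C", "D"):
        print(move, normal_player_score(n, move), lesion_man_score(n, move))
```

With these assumed numbers and ten players, a normal player scores 24 if the normal players cooperate and 9 if they defect, so cooperating wins on absolute score even though Lesion Man walks away with 45. With only two players the comparison flips (0 for cooperating versus 1 for defecting), which is one way to read the "many players" condition.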

This still sounds disturbingly like envying Lesion Man's mere choices – but the effect of the lesion isn't really his choice (right?). It's only the illusion of unitary agency, bounded at the skin rather than inside the brain, that makes it seem like it is. The Cartesian dualism of this view (like AIXI dropping an anvil on its own head) is also disturbing, but I suspect the essential argument is still sound, even if it ultimately needs to be made more sophisticated.