Eliezer_Yudkowsky comments on Towards a New Decision Theory - Less Wrong

Post author: Wei_Dai 13 August 2009 05:31AM




Comment author: Eliezer_Yudkowsky 16 August 2009 05:08:53AM 3 points

I don't understand why you want the AIs to defect against each other rather than cooperating with each other.

Are you attached to this particular failure of causal decision theory for some reason? What's wrong with TDT agents cooperating in the Prisoner's Dilemma and everyone living happily ever after?

Comment author: Wei_Dai 16 August 2009 07:22:55AM 1 point

> I don't understand why you want the AIs to defect against each other rather than cooperating with each other.

Come on, of course I don't want that. I'm saying that is the inevitable outcome under the rules of the game I specified. It's just like if I said "I don't want two human players to defect in one-shot PD, but that is what's going to happen."

ETA: Also, it may help if you think of the outcome as the human players defecting against each other, with the AIs just carrying out their strategies. The human players are the real players in this game.
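[The dominance argument behind "that is what's going to happen" can be sketched in a few lines. A minimal one-shot Prisoner's Dilemma with conventional illustrative payoffs (the numbers are an assumption, not from the thread) shows why a causal decision theorist defects regardless of the opponent's move:]

```python
# One-shot Prisoner's Dilemma with conventional illustrative payoffs.
# payoff[(my_move, their_move)] = my utility.
payoff = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # sucker's payoff
    ("D", "C"): 5,  # temptation
    ("D", "D"): 1,  # mutual defection
}

def best_response(their_move):
    """A causal decision theorist holds the opponent's move fixed
    and picks whichever of its own moves yields more utility."""
    return max(["C", "D"], key=lambda my_move: payoff[(my_move, their_move)])

# Defection is the best response whatever the other player does,
# so two such agents end at (D, D) even though (C, C) pays both more.
for their_move in ["C", "D"]:
    print(their_move, "->", best_response(their_move))  # both print "D"
```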

> Are you attached to this particular failure of causal decision theory for some reason?

No, I can't think of a reason why I would be.

> What's wrong with TDT agents cooperating in the Prisoner's Dilemma and everyone living happily ever after?

There's nothing wrong with that, and it may yet happen, if it turns out that the technology for proving what source code you are running can be created. But if you can't prove that your source code is some specific string, and the only thing you have to go on is that you and the other AI must both use the same decision theory due to convergence, that isn't enough.
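[The source-verification condition above is often illustrated with a "quining"-style toy model: each agent cooperates only if it can verify that the other agent's source is the same program as itself. A minimal sketch, assuming an idealized channel over which source can be proven; the names `cliquebot` and `src` are hypothetical, not from the thread:]

```python
def cliquebot(my_source: str, opponent_source: str) -> str:
    """Cooperate iff the opponent's *proven* source exactly matches mine;
    otherwise defect. Without such a proof, as Wei Dai notes, mere
    convergence on the same decision theory gives no guarantee."""
    return "C" if opponent_source == my_source else "D"

# Stand-in for the shared program text that both agents can prove they run.
src = "def agent(opponent_source): ..."

print(cliquebot(src, src))           # two verified copies cooperate: "C"
print(cliquebot(src, "other code"))  # an unverifiable stranger gets: "D"
```

[The design hinge is the verification step: equality of proven source strings is what licenses cooperation, which is exactly the technology whose feasibility is in doubt in the comment above.]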

Sorry if I'm repeating myself, but I'm hoping one of my explanations will get the point across...

Comment author: Vladimir_Nesov 16 August 2009 11:07:57AM 2 points

> Come on, of course I don't want that. I'm saying that is the inevitable outcome under the rules of the game I specified. It's just like if I said "I don't want two human players to defect in one-shot PD, but that is what's going to happen."

I don't believe that is true. It's perfectly conceivable that two human players would cooperate.

Comment author: Wei_Dai 16 August 2009 12:33:36PM 0 points

Yes, I see the possibility now as well, although I still don't think it's very likely. I wrote more about it in http://lesswrong.com/lw/15m/towards_a_new_decision_theory/11lx