Eliezer_Yudkowsky comments on Towards a New Decision Theory - Less Wrong

50 points · Post author: Wei_Dai 13 August 2009 05:31AM




Comment author: Eliezer_Yudkowsky 16 August 2009 07:26:10AM 2 points

Wei, the whole point of TDT is that it's not necessary for me to insert special cases into the code for situations like this. Under any situation in which I should program the AI to defect against the paperclipper, I can write a simple TDT agent and it will decide to defect against the paperclipper.

TDT has that much meta-power in it, at least. That's the whole point of using it.

(Though there are other cases - like the timeless decision problems I posted about, which I still don't know how to handle - where I can't make this statement about the TDT I have in hand; but that is because I can't handle those problems in general.)
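The claim above - that no special case for the paperclipper need be inserted - can be illustrated with a minimal sketch. This is not Eliezer's actual TDT formalism; it is a toy "program equilibrium" style agent that conditions only on whether the opponent runs identical source code. The names `tdt_like_agent`, `AGENT_SRC`, and `PAPERCLIPPER_SRC` are hypothetical stand-ins, not anything from the thread:

```python
# Toy sketch (NOT the real TDT formalism): an agent that cooperates
# exactly when its opponent is its logical mirror, and defects
# otherwise. The defection against a non-mirrored opponent (here,
# the "paperclipper") falls out of the general rule -- there is no
# special case for the paperclipper anywhere in the code.

def tdt_like_agent(own_source: str, opponent_source: str) -> str:
    if opponent_source == own_source:
        # My choice and the mirror's choice are logically linked:
        # choosing C yields (C, C), choosing D yields (D, D),
        # and (C, C) is preferred -- so cooperate.
        return "C"
    # No logical link established with this opponent: defect.
    return "D"

AGENT_SRC = "tdt_like_agent source"    # stand-in for the agent's quined source
PAPERCLIPPER_SRC = "always_defect"     # stand-in for a non-mirrored opponent

print(tdt_like_agent(AGENT_SRC, AGENT_SRC))         # mirror match: C
print(tdt_like_agent(AGENT_SRC, PAPERCLIPPER_SRC))  # paperclipper: D
```

The point of the sketch is only structural: one general decision rule covers both the mirror match and the paperclipper, which is the "meta-power" being claimed.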

Comment author: cousin_it 17 August 2009 03:14:01PM 2 points

TDT has that much meta-power in it, at least.

...How much power, exactly?

Given an arbitrary, non-symmetric, one-shot, two-player game with non-transferable utility (your payoffs are denominated in human lives, the other guy's in paperclips), and given that it's common knowledge to both agents that they're using identical implementations of your "TDT", how do we calculate which outcome gets played?
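To make the question concrete, here is what such a game looks like as data. The payoff numbers are hypothetical, chosen only to exhibit the non-symmetric, non-transferable-utility structure cousin_it describes; the sketch poses the question, it does not answer it:

```python
# A non-symmetric one-shot bimatrix game. Row player's payoffs are
# denominated in lives, column player's in paperclips -- different,
# non-transferable units. All numbers below are made up for
# illustration. cousin_it's question: given common knowledge that
# both players run identical TDT implementations, which cell is played?

PAYOFFS = {  # (row_action, col_action): (lives, paperclips)
    ("C", "C"): (3, 4),
    ("C", "D"): (0, 7),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# Non-symmetric: swapping the players' roles does not preserve payoffs,
# so there is no privileged "symmetric" outcome to point to.
assert PAYOFFS[("C", "D")] != tuple(reversed(PAYOFFS[("D", "C")]))
```

With transferable utility one could bargain over a common surplus; here the two players' units share no common scale, which is what makes "which outcome gets played?" a real open question rather than a calculation.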

Comment author: Wei_Dai 16 August 2009 07:33:06AM 0 points

Under any situation in which I should program the AI to defect against the paperclipper, I can write a simple TDT agent and it will decide to defect against the paperclipper.

So, what is that simple TDT agent? You seem to have ignored my argument that it can't exist, but if you can show me the actual agent (and convince me that it would defect against the paperclipper, if that's not obvious), then of course that would trump my arguments.

ETA: Never mind, I figured this out myself. See step 11 of http://lesswrong.com/lw/15m/towards_a_new_decision_theory/11lx