orthonormal comments on Decision Theories: A Semi-Formal Analysis, Part III - Less Wrong

Post author: orthonormal, 14 April 2012 07:34PM (23 points)




Comment author: orthonormal, 15 April 2012 11:10:57PM, 2 points

Aha, I see now what you mean. Good insight!

[EDIT: The following is false.] A clever CDT would be able to act like TDT if it considered not the choice of whether to output C or D, but the choice of which mathematical object to output: it could output a mathematical object that evaluates to C or D depending on the code of Y, which gives it the option of genuinely acting as TDT would.

This has the interesting conclusion that even without the benefit of self-modification, a CDT agent with a good model of the world ends up acting more like TDT than traditional game theorists would expect. (Another example of this is here.) The version of CDT in the last post, contrariwise, is equipped with a very narrow model of the world and of its options. [End falsehood.]

I think these things are fascinating, but I think it's important to show that you can get TDT behavior without incorporating anthropic reasoning, redefinition of its actions, or anything beyond a basic kind of framework that human beings know how to program.

(By the way, I wouldn't call option 3 CliqueBot, because CliqueBots as I defined them have problems mutually cooperating with anything whose outputs aren't identical to theirs. I think it's better for Option 3 to be the TDT algorithm defined in the post.)
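(To make the mutual-cooperation failure concrete, here is a minimal Python sketch of a CliqueBot in the sense defined above; the names `cliquebot` and `CLIQUEBOT_SOURCE` are illustrative, not from the post.)

```python
# Illustrative sketch of a CliqueBot: it cooperates only with agents
# whose source code is byte-for-byte identical to its own. The constant
# below stands in for the agent's own quoted source.

CLIQUEBOT_SOURCE = "def cliquebot(opponent_source): ..."

def cliquebot(opponent_source):
    """Cooperate iff the opponent is an exact textual copy of this agent."""
    return "C" if opponent_source == CLIQUEBOT_SOURCE else "D"

# Mutual cooperation with an exact copy:
print(cliquebot(CLIQUEBOT_SOURCE))                    # "C"
# But it defects against any agent whose text differs at all, even one
# that provably behaves identically:
print(cliquebot(CLIQUEBOT_SOURCE + " "))              # "D"
```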

Comment author: Will_Newsome, 16 April 2012 06:50:51AM, 4 points

It seems to come up all the time that people aren't aware that CDT with a sufficiently good world model (a sufficiently accurate causal graph) is the same as TDT, even though this has been known for years. If you could address that somewhere in your sequence I think you'd save a lot of people a lot of time—it's the most common objection to standard discourse about decision theory that I've seen.

Comment author: orthonormal, 16 April 2012 03:25:32PM, 1 point

I'll discuss it in the final post.

Comment author: Vaniver, 16 April 2012 01:35:49AM, 0 points

> Good insight!

Thanks!

> This has the interesting conclusion that even without the benefit of self-modification, a CDT agent with a good model of the world ends up acting more like TDT than traditional game theorists would expect.

This is a pretty common feature of comparisons between decision theories: different outcomes generally require different assumptions.

> I think these things are fascinating, but I think it's important to show that you can get TDT behavior without incorporating anthropic reasoning, redefinition of its actions, or anything beyond a basic kind of framework that human beings know how to program.

It's not clear to me what the difference is between the TDT algorithm in your post and the method I've described. You need some method of determining the outcome pair from the strategy pair, and the inference module can (hopefully) do that. The u_f that you use is the utility of X's outcome corresponding to the best Y outcome in row f, and picking the best of those corresponds to finding the best of the Nash equilibria (in the absence of bargaining problems). The only thing I don't mention is the sanity check, but that should just be another run of the inference module.
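(A toy rendering of that table-and-u_f selection, using the standard one-shot Prisoner's Dilemma payoffs 3/2/1/0. Rows are X's candidate strategies, columns Y's, and each cell holds (X's utility, Y's utility); the function and table names are illustrative assumptions, not from either post.)

```python
# For each row f: find Y's best outcome in that row, take X's payoff
# at that cell (this is u_f), then pick the row with the largest u_f.

def choose_row(table, names):
    best = None  # (u_f, row name)
    for f, row in enumerate(table):
        y_best = max(y for _, y in row)               # Y's best response
        u_f = max(x for x, y in row if y == y_best)   # X's payoff there
        if best is None or u_f > best[0]:
            best = (u_f, names[f])
    return best

NAMES = ["DefectBot", "CooperateBot", "MimicBot"]
# Columns: Y plays DefectBot, CooperateBot, MimicBot (in that order).
TABLE = [
    [(1, 1), (3, 0), (1, 1)],   # X = DefectBot
    [(0, 3), (2, 2), (2, 2)],   # X = CooperateBot
    [(1, 1), (2, 2), (2, 2)],   # X = MimicBot
]

print(choose_row(TABLE, NAMES))   # (2, 'MimicBot')
```

Note that deleting all but the first column (Y is a DefectBot) or all but one of the others changes which row wins, which is the point about multiple columns made below.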

> By the way, I wouldn't call option 3 CliqueBot, because CliqueBots as I defined them have problems mutually cooperating with anything whose outputs aren't identical to theirs. I think it's better for Option 3 to be the TDT algorithm defined in the post.

Sure, but does it have a short name? ProofBot?

(Notice that Y running the full TDT algorithm corresponds to there being multiple columns in the table: if you were running X against a CooperateBot, you'd just have the first column, and the Nash equilibrium would be (2,1) or (3,1). If you were running it against CliqueBot without a sanity check, there would just be the third column, and it would think (3,3) was the Nash equilibrium, but would be in for a nasty surprise when CliqueBot rejects it because of its baggage.)

Comment author: orthonormal, 16 April 2012 03:52:13PM, 1 point

> It's not clear to me what the difference is between the TDT algorithm in your post and the method I've described.

If you make sure to include a sanity check, then your description should do the same thing as the TDT algorithm in the post (at least in simple games; there may be a difference in bargaining situations).

> Sure, but does it have a short name? ProofBot?

I understand why you might feel it's circular to name that row TDT, but nothing simpler (unless you count ADT/UDT as simpler) does what it does. It's a layer more complicated than Newcomblike agents, which should also be included in your table: in order to get mutual cooperation with itself and also defection against CooperateBot, it deduces whether a DefectBot or a MimicBot (C if it deduces Y=C, D otherwise) has the better outcome against Y, runs a sanity check, and if that goes through, does what the preferred strategy does.
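(A toy sketch of that layered procedure, not the post's actual algorithm: "deduction" is replaced by direct simulation with a depth cap, and the cap's optimistic "C" stands in for the Löbian step that lets the real provability-based agents cooperate with themselves. The sanity check is omitted, and all names are illustrative.)

```python
# Standard one-shot PD payoffs for (my move, opponent's move).
PAYOFF = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}

def defect_bot(opponent, depth=0):
    return "D"

def cooperate_bot(opponent, depth=0):
    return "C"

def mimic_bot(opponent, depth=0):
    """C if it 'deduces' the opponent plays C against it, D otherwise."""
    if depth > 2:
        return "C"  # optimistic base case, standing in for Lob's theorem
    return "C" if opponent(mimic_bot, depth + 1) == "C" else "D"

def tdt_like(opponent):
    """Deduce whether DefectBot or MimicBot fares better against Y,
    then do what the preferred strategy does (sanity check omitted)."""
    u_defect = PAYOFF[(defect_bot(opponent), opponent(defect_bot, 1))]
    u_mimic = PAYOFF[(mimic_bot(opponent), opponent(mimic_bot, 1))]
    preferred = mimic_bot if u_mimic > u_defect else defect_bot
    return preferred(opponent)

print(tdt_like(cooperate_bot))   # "D" -- it exploits CooperateBot
print(tdt_like(mimic_bot))       # "C" -- mutual cooperation
```

Against cooperate_bot, DefectBot's payoff (3) beats MimicBot's (2), so it defects; against mimic_bot, MimicBot's payoff (2) beats DefectBot's (1), so it cooperates — the two desiderata named above.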