Vladimir_Nesov comments on Decision Theories: A Semi-Formal Analysis, Part III - Less Wrong

23 points · Post author: orthonormal 14 April 2012 07:34PM


Comment author: Vaniver 15 April 2012 05:16:19PM 1 point

TDT sort of behaves like that except that there is no actual communication going on.

Huh? How could X possibly run TDT without having Y's source code communicated to it?

Curious to see your arguments.

My model of TDT is that, rather than looking at action-action-outcome triplets, it looks at strategy-strategy-outcome triplets. The rest of game theory remains the same: once you have a strategy-strategy-outcome table, you find the Nash equilibrium and you're done. (If constructing that table fails, you revert to the Nash equilibrium from the action-action-outcome table.)
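For concreteness, the table-plus-Nash step described above can be sketched in a few lines (a toy illustration of my own, not from the thread; the payoff numbers are the standard Prisoner's Dilemma values, assumed for the example):

```python
# Toy illustration: find pure-strategy Nash equilibria in an
# action-action-outcome table for the Prisoner's Dilemma.
# Payoffs are (row player, column player); "C" = cooperate, "D" = defect.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ["C", "D"]

def pure_nash_equilibria(payoffs, actions):
    """Return cells where neither player gains by deviating unilaterally."""
    equilibria = []
    for a in actions:          # row player's action
        for b in actions:      # column player's action
            row_payoff, col_payoff = payoffs[(a, b)]
            row_ok = all(payoffs[(a2, b)][0] <= row_payoff for a2 in actions)
            col_ok = all(payoffs[(a, b2)][1] <= col_payoff for b2 in actions)
            if row_ok and col_ok:
                equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria(PAYOFFS, ACTIONS))  # [('D', 'D')]
```

At the action level this recovers the usual mutual-defection equilibrium; the claim above is that the same machinery, run on a strategy-level table, can do better.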

The material difference between this and regular game theory is that we now have access to strategies: i.e., we can read our opponent's mind, i.e. telepathy. (Maybe "mind-reading decision theory" is a better term, but it doesn't abbreviate to TDT the way "telepathic decision theory" does.)

You can still run TDT in situations with time (like an iterated prisoner's dilemma), but you can't run TDT in situations where you don't have your opponent's source code. So calling it "timeless" when it can involve time seems odd, as does leaving the necessity of source code out of the name.

Comment author: Vladimir_Nesov 15 April 2012 05:50:19PM * 3 points

My model of TDT is that, rather than looking at action-action-outcome triplets, it looks at strategy-strategy-outcome triplets.

This characterization/analogy doesn't fit, and doesn't seem to help with making the necessary distinction about what each player knows of its opponent. Knowing that performing action A implies that your opponent performs action B is a weaker statement than unconditionally knowing that your opponent performs action B. In standard game theory, you don't know what action your opponent performs, and with TDT you don't know that either. But not knowing something doesn't (automatically) make it not happen. So if there is indeed a dependence of your opponent's action on your own action, it's useful to know it.

The difference between considering opponent's actions in standard game theory and considering opponent's "strategy" (dependence of opponent's action on your action) is that while the former is usually unknown (to both TDT and standard game theory), the latter can in principle be known, and making use of this additional potential knowledge is what distinguishes TDT. So the actions in game theory and "strategies" in TDT are not analogous.
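The point about knowing the dependence rather than the action can be made concrete (a toy sketch of my own, with assumed PD payoffs): if you know the function from your action to the opponent's action, you can optimize *through* it, which is exactly what knowing only an outcome table does not let you do.

```python
# Toy sketch: if the *dependence* of the opponent's action on yours is
# known -- e.g. a mirror, which plays whatever you play -- you can pick
# your action by evaluating each option through that dependence.
PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}  # your payoff only

def best_action_given_dependence(dependence, actions=("C", "D")):
    """Maximize your payoff assuming the opponent's action = dependence(your action)."""
    return max(actions, key=lambda a: PAYOFFS[(a, dependence(a))])

mirror = lambda a: a                         # opponent provably copies your action
print(best_action_given_dependence(mirror))  # 'C': payoff 3 beats mutual defection's 1
```

Against a known mirror, cooperation wins; against a known unconditional defector (`lambda a: "D"`), the same computation correctly returns defection.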

Comment author: Vaniver 15 April 2012 07:11:02PM 0 points

Knowing that performing action A implies that your opponent performs action B is a weaker statement than unconditionally knowing that your opponent performs action B.

Okay. The first looks like a strategy to me, and the second looks like an action. Right?

In standard game theory, you don't know what action your opponent performs, and with TDT you don't know that either.

I agree, and that matches my characterization of TDT.

But not knowing something doesn't (automatically) make it not happen. So if there is indeed a dependence of your opponent's action on your own action, it's useful to know it.

I'm not understanding this, though. Are you just saying that knowing about your opponent's strategy gives you useful information?

the latter can in principle be known,

How do you learn it?

So the actions in game theory and "strategies" in TDT are not analogous.

The analogy is that both of them get put into a table, and then you find the Nash equilibria by examining the favored rows and columns of the table, and then you pick the best of the equilibria. (TDT has known bargaining problems, right? That looks like it maps onto disagreeing over which Nash equilibrium to pick.) Would it help if I made a walkthrough of my model with actual tables?
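One toy version of such a walkthrough (my own illustration, with assumed standard PD payoffs, and a "MirrorBot" whose self-pairing is resolved to cooperation by fiat, standing in for the mutual-provability argument discussed later in the thread):

```python
# Toy strategy-strategy-outcome table. Two strategies: DefectBot always
# defects; MirrorBot plays whatever it predicts its opponent will play.
# Mutual mirroring is resolved to cooperation by assumption in this toy.
def outcome(s1, s2):
    """Resolve a pair of strategies to a pair of actions."""
    if s1 == s2 == "MirrorBot":
        return ("C", "C")        # the mutual-mirroring fixed point (assumed)
    return ("D", "D")            # any pairing involving DefectBot defects

PD = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
STRATS = ["MirrorBot", "DefectBot"]

def strategy_table(strats):
    return {(s1, s2): PD[outcome(s1, s2)] for s1 in strats for s2 in strats}

def pure_nash(table, strats):
    """Pure-strategy Nash equilibria of the strategy-level table."""
    eq = []
    for s1 in strats:
        for s2 in strats:
            r, c = table[(s1, s2)]
            if all(table[(t, s2)][0] <= r for t in strats) and \
               all(table[(s1, t)][1] <= c for t in strats):
                eq.append((s1, s2))
    return eq

table = strategy_table(STRATS)
eqs = pure_nash(table, STRATS)
best = max(eqs, key=lambda e: table[e])  # pick the best of the equilibria
print(best)  # ('MirrorBot', 'MirrorBot') with payoff (3, 3)
```

Both (MirrorBot, MirrorBot) and (DefectBot, DefectBot) come out as equilibria of the strategy-level table; the "pick the best equilibrium" step then selects mutual mirroring, i.e. cooperation.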

Comment author: Vladimir_Nesov 15 April 2012 07:21:25PM * 1 point

Knowing that performing action A implies that your opponent performs action B is a weaker statement than unconditionally knowing that your opponent performs action B.

Okay. The first looks like a strategy to me, and the second looks like an action. Right?

Y doesn't act according to the rule "Let's see what X does. If it does A, I'm going to do B, etc.", and so it's misleading to call that "a strategy". This is only something like what X infers about Y, but this is not how Y reasons, because Y can't infer what X does, and so it can't respond depending on what X does.

The actual strategy is to figure out an action based on the other player's code, not to figure out an action based on the other player's action. This strategy, which doesn't involve responding to actions, can be characterized as establishing a dependence between players' actions, and this characterization is instrumental to the strategy itself, a part of what makes the characterization correct.

Comment author: Vaniver 15 April 2012 07:49:42PM 0 points

Y doesn't act according to the rule "Let's see what X does. If it does A, I'm going to do B, etc.", and so it's misleading to call that "a strategy". This is only something like what X infers about Y, but this is not how Y reasons, because Y can't infer what X does, and so it can't respond depending on what X does.

So "play whatever I think X will play" does count as a strategy, but "play whatever X plays" does not count as a strategy because Y can't actually implement it. Limiting X and Y to the first sort of strategies was meant to be part of my characterization, but I could have made that clearer.

Comment author: Vladimir_Nesov 15 April 2012 08:02:56PM * 2 points

So "play whatever I think X will play" does count as a strategy, but "play whatever X plays" does not count as a strategy because Y can't actually implement it.

It can't implement "play whatever I think X will play" either, because it doesn't know what X will play.

In one statement: in an ADT-like Prisoner's Dilemma (the model of TDT in this post appears to be more complicated), Y could be said to choose the action such that provability of Y choosing that action implies X choosing a good matching action. So Y doesn't act depending on what X does, or on what Y thinks X does, etc. Y acts depending on what X can be inferred to do under the additional assumption that Y does a certain thing, and the thing we additionally assume Y to be doing is a specific action, not a strategy of responding to X's source code or a strategy of responding to X's action. If you describe X's algorithm the same way, you can see that the additional assumption of Y's action is not what X uses in making its decision, for X similarly makes an additional assumption about its own (X's) action and then looks at what can be inferred about Y's action (and not Y's "strategy").

Comment author: Vaniver 15 April 2012 08:09:16PM 0 points

Y acts depending on what X can be inferred to do if we additionally assume that Y is doing a certain thing, and the thing we additionally assume Y to be doing is a specific action, not a strategy of responding to X's source code, or a strategy of responding to X's action.

Can you write the "cooperate iff I cooperate iff they cooperate ... " bot this way? I thought the strength of TDT was that it allowed that bot.

Comment author: Vladimir_Nesov 15 April 2012 08:23:06PM * 3 points

Can you write the "cooperate iff I cooperate iff they cooperate ... " bot this way?

This can be unpacked as an algorithm that searches for a proof of the statement "If I cooperate, then my opponent also cooperates; if I defect, then my opponent also defects", and cooperates if it finds such a proof. Under certain conditions, two players running something like this algorithm will cooperate. As you can see, the agent's decision here depends not on the opponent's decision, but on the opponent's decision's dependence on the agent's own decision (and not on the dependence of the agent's decision on the opponent's decision, etc.).
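A crude runnable caricature of this bot (my own construction, emphatically not the actual proof-search machinery): real versions search for proofs about the opponent's source code; here the proof search is replaced by bounded mutual simulation of callables, and the infinite regress is closed by a depth bound whose base case optimistically assumes cooperation, a stand-in for the role Löb's theorem plays in the provability version.

```python
# Toy simulation of "cooperate iff my cooperation implies theirs".
# Players are callables taking the opponent (a cheat: real agents get
# source code, not callables). The depth bound cuts off mutual simulation,
# with an optimistic base case standing in for the Löbian step.
from functools import partial

def defect_bot(opponent):
    return "D"

def cooperate_bot(opponent):
    return "C"

def fair_bot(opponent, depth=3):
    if depth == 0:
        return "C"   # optimistic base case (stands in for the Löbian step)
    # "Cooperate iff my opponent, playing against me, cooperates."
    return "C" if opponent(partial(fair_bot, depth=depth - 1)) == "C" else "D"

print(fair_bot(defect_bot))                  # 'D' -- it can't be exploited
print(fair_bot(cooperate_bot))               # 'C'
print(fair_bot(partial(fair_bot, depth=3)))  # 'C' -- mutual cooperation
```

The toy exhibits the advertised behavior (defect against DefectBot, cooperate with itself), but the optimistic base case is doing real work here that only the provability formulation justifies.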

Comment author: Vaniver 15 April 2012 08:42:37PM 1 point

Okay. I think that fits with my view: so long as it's possible to go from X's strategy and Y's strategy to an outcome, then we can build a table of strategy-strategy-outcome triplets, and do analysis on that. (I built an example over here.) What I'm taking from this subthread is that the word "strategy" needs to have a particular meaning to be accurate, and so I need to be more careful when I use it so that it's clear that I'm conforming to that meaning.