jimrandomh comments on Explicit Optimization of Global Strategy (Fixing a Bug in UDT1) - Less Wrong

17 Post author: Wei_Dai 19 February 2010 01:30AM

Comment author: jimrandomh 19 February 2010 04:06:58AM, 0 points

Suppose you're choosing a strategy S for a coordination game with some other entity X, which you are told nothing about. Then U(S) = .5 * (S(1)!=X(2)) + .5 * (S(2)!=X(1)). In this case, you have to choose a probability distribution over other entities X, and choose S to maximize expected utility under that distribution. There's no way around that. If we're told that X was given the same utility function and is trying to optimize it, that greatly narrows down the possibilities for what X is: we can assume that X was chosen, by some unspecified but intelligent process, to also optimize U. Fortunately, English culture provides a standard mapping between numbers and letters (A=1, B=2, C=3, ...), so if we assume X has some probability of coming from a similar culture and choosing that mapping for that reason, and will choose an arbitrary random mapping otherwise, then we're better off with the standard mapping.
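To make the argument above concrete, here is a minimal sketch in Python. It assumes strategies are deterministic mappings from the inputs 1 and 2 to the outputs A and B, and it assumes a prior in which X uses the standard mapping with probability `p_standard` and is otherwise uniformly random over all four mappings. The names `culture_prior`, `expected_utility`, and `best_strategy` are hypothetical, chosen only for illustration.

```python
from itertools import product

INPUTS = (1, 2)
OUTPUTS = ("A", "B")

# Every deterministic strategy: a mapping from input (1 or 2) to output (A or B).
ALL_STRATEGIES = [dict(zip(INPUTS, outs)) for outs in product(OUTPUTS, repeat=2)]


def utility(s, x):
    """U(S) = .5 * (S(1)!=X(2)) + .5 * (S(2)!=X(1)) for a known opponent x."""
    return 0.5 * (s[1] != x[2]) + 0.5 * (s[2] != x[1])


def culture_prior(p_standard):
    """Assumed prior over X: the standard mapping (1->A, 2->B) with probability
    p_standard, otherwise a uniformly random mapping."""
    standard = {1: "A", 2: "B"}
    uniform_weight = (1 - p_standard) / len(ALL_STRATEGIES)
    prior = [(x, uniform_weight) for x in ALL_STRATEGIES]
    prior.append((standard, p_standard))
    return prior


def expected_utility(s, prior):
    """Expected value of U(S) under a prior given as (strategy, probability) pairs."""
    return sum(p * utility(s, x) for x, p in prior)


def best_strategy(prior):
    """The strategy with the highest expected utility under the prior."""
    return max(ALL_STRATEGIES, key=lambda s: expected_utility(s, prior))


if __name__ == "__main__":
    prior = culture_prior(0.2)
    for s in ALL_STRATEGIES:
        print(s, round(expected_utility(s, prior), 3))
    print("best:", best_strategy(prior))
```

Under this assumed prior, any positive `p_standard` tips the balance toward the standard mapping, since every strategy scores the same 0.5 against the uniformly random part of the distribution.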

If the other agent has a different utility function, then that changes your probability distribution over what that agent is. If we're told that the other agent is supposed to implement the utility function "1 if it chooses A, 0 if it chooses B", then it will probably be implemented to just return A, so we should always return B.

Now assume that when we enter into the coordination game, we're told something about X, and X is told something about us. Then our utility function is U(S) = .5 * (S(1,X)!=X(2,S)) + .5 * (S(2,X)!=X(1,S)). We still need a probability distribution over Xs, but this time the distribution includes Xs that model S. If we're also told that X is supposed to be optimizing the same utility function, then we can assign some probability to it modeling S with each of various techniques, and to it being model-able with each of various techniques. Not all modeling techniques will work on all functions - some of them lead to infinite regress, some are just bad designs that can't model anything accurately, etc. - so to maximize the probability of successful coordination we should both make S easy for X to model, and make S try to model X.
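The point about modeling and infinite regress can be illustrated with a rough sketch. It assumes strategies are Python functions that receive their own input and the opponent's function, and that modeling means simulating the opponent directly. The shared `depth` argument and the fallback to the standard mapping are assumptions made for this example, as are the names `simple`, `modeler`, and `naive_modeler`; none of them come from the original post.

```python
STANDARD = {1: "A", 2: "B"}


def other_output(output):
    return "B" if output == "A" else "A"


def other_input(my_input):
    return 2 if my_input == 1 else 1


def simple(my_input, opponent, depth=0):
    """Easy to model: ignores the opponent and uses the standard mapping."""
    return STANDARD[my_input]


def modeler(my_input, opponent, depth=0, max_depth=3):
    """Simulates the opponent on the other input and picks the opposite output.
    Assumed convention: once the simulation depth is exhausted, fall back to
    the standard mapping instead of recursing further."""
    if depth >= max_depth:
        return STANDARD[my_input]
    predicted = opponent(other_input(my_input), modeler, depth + 1)
    return other_output(predicted)


def naive_modeler(my_input, opponent, depth=0):
    """Same idea with no depth bound: against a copy of itself this is the
    infinite regress case."""
    predicted = opponent(other_input(my_input), naive_modeler, depth + 1)
    return other_output(predicted)


def utility(s, x):
    """U(S) = .5 * (S(1,X)!=X(2,S)) + .5 * (S(2,X)!=X(1,S))."""
    return 0.5 * (s(1, x) != x(2, s)) + 0.5 * (s(2, x) != x(1, s))


if __name__ == "__main__":
    print(utility(modeler, simple))   # modeler successfully predicts simple
    print(utility(modeler, modeler))  # bounded simulation bottoms out
    try:
        utility(naive_modeler, naive_modeler)
    except RecursionError:
        print("naive_modeler vs. itself: infinite regress")
```

In this sketch the bounded modeler coordinates both with the easy-to-model strategy and with a copy of itself, while the unbounded one never terminates against its own copy, which is one way of reading the advice to make S both easy to model and able to model X.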

Different kinds of games lead to different kinds of likely opponents, hence the field of game theory. A Nash equilibrium is any pair of strategies in which each one maximizes its own utility under the assumption that the other is its opponent.
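As a small check of that definition on the game from this comment, the sketch below enumerates every pair of deterministic strategies and keeps the pairs where neither side can gain by deviating alone. It assumes both players receive the same utility U, and it represents a strategy as a tuple (output on input 1, output on input 2). The helper name `is_nash` is hypothetical.

```python
from itertools import product

# A strategy is (output on input 1, output on input 2).
STRATEGIES = list(product("AB", repeat=2))


def utility(s, x):
    """U for the player using s against x: .5*(S(1)!=X(2)) + .5*(S(2)!=X(1))."""
    return 0.5 * (s[0] != x[1]) + 0.5 * (s[1] != x[0])


def is_nash(s, x):
    """True if neither player can improve their own utility by deviating alone."""
    s_is_best = all(utility(alt, x) <= utility(s, x) for alt in STRATEGIES)
    x_is_best = all(utility(alt, s) <= utility(x, s) for alt in STRATEGIES)
    return s_is_best and x_is_best


equilibria = [(s, x) for s, x in product(STRATEGIES, repeat=2) if is_nash(s, x)]

if __name__ == "__main__":
    for s, x in equilibria:
        print(s, x, utility(s, x))
```

Because each strategy here has a unique best response that always achieves full coordination, the pure equilibria of this game are exactly the pairs that always output different letters; the check does not say which of them two agents will actually land on.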