AlephNeil comments on What is Wei Dai's Updateless Decision Theory? - Less Wrong

37 Post author: AlephNeil 19 May 2010 10:16AM

Comments (63)

Comment author: timtyler 19 May 2010 12:50:02PM * 0 points

Re: "The above procedure is extremely limited. Taking it exactly as stated, it only applies to games with a single player and a single opportunity to act at some stage in the game."

I don't really see what you mean. Your "naive" decision theory updated on sensory input - and then maximised expected utility. That seems like standard decision theory to me - and surely it works fine with multiple actors and iterated interactions.

Comment author: AlephNeil 19 May 2010 01:24:03PM * 0 points

It's not literally true that the procedure I described knows how to deal with multiple actors. For instance, if "Player 2" is going to act after me, then in order to calculate expected utilities for my actions, I need to have some idea of what Player 2 is going to do. Now, given some particular game, it may or may not be true that there's a straightforward way to divine what Player 2 is going to do, but until we're given such a method, the Bayesian procedure I've described as 'NDT' is stuck.
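
To make the point above concrete, here is a minimal sketch in Python. It is an illustration only, not the procedure from the original post: the payoff table `U1`, the action labels, and the functions `always_l`, `always_r`, `expected_utility` and `best_action` are all hypothetical names and numbers invented for this example. It shows that the expected-utility calculation cannot even be run until some model of Player 2 is supplied, and that the recommended action changes with the model.

```python
# Hypothetical payoffs for Player 1, indexed by (Player 1 action, Player 2 action).
# These numbers are invented purely for illustration.
U1 = {("L", "l"): 3, ("L", "r"): 0,
      ("R", "l"): 1, ("R", "r"): 2}
P1_ACTIONS = ["L", "R"]


def always_l(a1):
    """Assumed model: Player 2 plays 'l' whatever Player 1 does."""
    return {"l": 1.0}


def always_r(a1):
    """Assumed model: Player 2 plays 'r' whatever Player 1 does."""
    return {"r": 1.0}


def expected_utility(a1, p2_model, u1):
    """Expected utility of a1, given a distribution over Player 2's replies."""
    return sum(prob * u1[(a1, a2)] for a2, prob in p2_model(a1).items())


def best_action(p1_actions, p2_model, u1):
    """Maximise expected utility; requires p2_model as an input."""
    return max(p1_actions, key=lambda a1: expected_utility(a1, p2_model, u1))


print(best_action(P1_ACTIONS, always_l, U1))  # L
print(best_action(P1_ACTIONS, always_r, U1))  # R
```

The `p2_model` argument is exactly the piece that the bare procedure does not provide.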

You might think "well, if Player 2 is the last person to act then surely we can apply 'NDT' to work out Player 2's best decision and then work back to Player 1's decision." But again, this isn't literally true, because we can't calculate a likelihood function unless we know Player 1's strategy.

(However, if Player 2 knows what Player 1's move was then they can calculate a likelihood function after all. This case is similar to the one I describe where a non-forgetful Player plays several times.)
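
The special case in the parenthesis above, where Player 2 observes Player 1's move, can be sketched as ordinary backward induction. This is a hedged illustration under stated assumptions: both players' payoff tables (`U1`, `U2`) are common knowledge, Player 2 sees Player 1's action before choosing, and there are no ties. The tables and the function names `best_response` and `backward_induction` are hypothetical and not taken from the post.

```python
# Hypothetical payoffs, indexed by (Player 1 action, Player 2 action).
U1 = {("L", "l"): 3, ("L", "r"): 0,
      ("R", "l"): 1, ("R", "r"): 2}
U2 = {("L", "l"): 1, ("L", "r"): 2,
      ("R", "l"): 3, ("R", "r"): 0}
P1_ACTIONS = ["L", "R"]
P2_ACTIONS = ["l", "r"]


def best_response(a1, p2_actions, u2):
    """Player 2's best reply, assuming Player 2 has observed a1."""
    return max(p2_actions, key=lambda a2: u2[(a1, a2)])


def backward_induction(p1_actions, p2_actions, u1, u2):
    """Solve Player 2's problem first, then work back to Player 1's choice."""
    def value(a1):
        # Player 1's payoff once Player 2 replies optimally to a1.
        return u1[(a1, best_response(a1, p2_actions, u2))]

    a1 = max(p1_actions, key=value)
    return a1, best_response(a1, p2_actions, u2)


print(backward_induction(P1_ACTIONS, P2_ACTIONS, U1, U2))  # ('R', 'l')
```

If Player 2 did not observe Player 1's move, `best_response` could not take `a1` as an argument, and Player 2 would instead need some belief about Player 1's strategy, which is the circularity described above.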

Comment author: timtyler 19 May 2010 01:55:24PM 1 point

Player 2 is part of Player 1's environment. Player 1 predicts Player 2's actions in the same way as they predict the response of the rest of the environment to their own actions: by using their model of how the rest of the world behaves.

Comment author: AlephNeil 19 May 2010 02:05:15PM 1 point

OK, but typically we're given no information about how Player 2 thinks and simply told what their utility function is. In other words, our 'model' of Player 2 just says "here is a rational actor who likes these outcomes this much."

Now if we can use that 'model' of Player 2 to work out what they're going to do, then of course we're in great shape, but that just means we're solving Player 2's decision problem. So in order to use 'NDT' we first need (possibly some other) decision theory to predict Player 2's action.

Comment author: timtyler 19 May 2010 02:25:15PM * 1 point

Agents may have even less information about other aspects of the world - and may be in an even worse position to make predictions about them. Basically agents have to decide what to do in the face of considerable uncertainty.

Anyway, this doesn't seem like a problem with conventional decision theory to me.

Comment author: AlephNeil 19 May 2010 02:30:10PM * 1 point

Of course it's not - all that's happened is that I've described a certain procedure that bears a very tenuous relation to 'decision theory', and noted that this procedure is unable to do certain things.