Wei_Dai comments on The Absent-Minded Driver - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (139)
Please go back to what I wrote before (I've changed the numbers to .2/.4/.4 below):
I'll go over the payoff calculation in detail, but if you're still confused after this, perhaps we should take it to private messages to avoid cluttering up the comments.
Your proposed strategy is to CONTINUE upon seeing the hint 'X' and EXIT upon seeing the hint 'Y'. With .4 probability, you'll get 'Y' at both intersections, but you EXIT upon seeing the first 'Y' so you get 0 payoff in that case. With .4 probability, you get 'X' at both intersections, so you choose CONTINUE both times and end up at C for payoff 1. With .2 probability, you get 'X' at X and 'Y' at Y, so you choose CONTINUE and then EXIT for a payoff of 4, thus:
.4 * 0 + .4 * 1 + .2 * 4 = 1.2
I'm not confused; I probably should have stopped you at your original derivation for the partial-knowledge case but didn't want to check your algebra. And setting up problems like these is important and tricky, so this discussion belongs here.
So, I think the problem with your setup is that you don't make the outcome space fully symmetric because you don't have an equal chance of drawing Y at X and X at Y (compared to your chance of drawing X at X and Y at Y).
To formalize it for the general case of partial knowledge, plus probabilistic knowledge given action, we need to look at four possibilities: Drawing XY, XX, YY, and YX, only the first of which is correct. If, as I defined it before, the probability of being right at any given exit is r, the corresponding probabilities are: r^2, r(1-r), r(1-r), and (1-r)(1-r).
So then I have the expected payoff as a function of p, q, and r as:
(r^2)(p*q + 4p*(1 - q)) + r(1 - r)(p^2 + 4p(1 - p)) +r(1 - r)(q^2 + 4q(1 - q)) + (1 - r)(1 - r)(p*q + 4q(1 - p))
This nicely explains the previous results:
The original problem is the case of complete ignorance, r=1/2, which has a maximum 4/3 where p and q are such that they average out to choosing "continue" at one's current intersection 2/3 of the time. (And this, I think, shows you how to correctly answer while explicitly and correctly representing your probability of being at a given intersection.)
The case of (always) continuing on (guessing) X and not continuing on (guessing) Y corresponds to p=1 and q=0, which reduces to r*(3r+1), the equation I originally had.
Furthermore, it shows how to beat the payoff of 4/3 when your r is under 52%. For 51% (the original case you looked at), the max payoff is 1.361 {p=1, q=0.306388} (don't know how to show it on Alpha, since you have to constrain p and q to 0 thru 1).
Also, it shows I was in error about the 52% threshold, and the mixed strategy actually dominates all the way up to about r = 61%, at which point of course p=1 and q has shrunk to zero (continue when you see X, exit when you see Y). This corresponds to 32 millibits, much higher than my earlier estimate of 1.3 millibits.
Interesting!