Lumifer comments on An investment analogy for Pascal's Mugging - Less Wrong Discussion
Good point about Cauchy. If even the mean is undefined, all bets are off :-)
Can I get an example? Say X is a random positive real number. For which distribution will the parameter values that maximize E(X) fail to maximize E(log(X))?
I don't know about that. The Kelly Rule means a specific strategy in a specific setting and diluting and fuzzifying that specificity doesn't seem useful.
That is exactly what the Kelly criterion provides examples of. Let p be the probability of winning some binary bet and k the multiple of your bet that is returned to you if you win. Given an initial bankroll of 1, let theta be the proportion of it you are going to bet. Let the distribution of your bankroll after the bet be X. With probability p, X is 1+theta(k-1), and with probability 1-p, X is 1-theta. theta is a parameter of this distribution. (So are p and k, but we are interested in maximising over theta for given p and k.)
If pk > 1 then theta = 1 maximises E(X), but theta = (pk-1)/(k-1) maximises E(log(X)).
The graphs of E(X) and E(log(X)) as functions of theta look nothing like each other. The first is a linear ascending gradient, and the second rises to a maximum and then plunges to -∞.
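A quick numerical sketch of this (the particular values p = 0.6 and k = 2 and the grid are my choices, not from the comment, picked so that pk > 1):

```python
import math

p, k = 0.6, 2.0  # example win probability and payout multiple, with pk > 1

def e_x(theta):
    # E(X) = p*(1 + theta*(k-1)) + (1-p)*(1 - theta): linear in theta
    return p * (1 + theta * (k - 1)) + (1 - p) * (1 - theta)

def e_log_x(theta):
    # E(log X): concave in theta, dives to -infinity as theta -> 1
    return p * math.log(1 + theta * (k - 1)) + (1 - p) * math.log(1 - theta)

thetas = [i / 1000 for i in range(1, 1000)]  # open interval (0, 1)
best_ex = max(thetas, key=e_x)        # climbs toward theta = 1
best_elog = max(thetas, key=e_log_x)  # Kelly fraction (pk-1)/(k-1) = 0.2
```

With these numbers the E(X)-maximizer sits at the top of the grid (bet everything), while the E(log X)-maximizer is the Kelly fraction (pk-1)/(k-1) = 0.2.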
Yep, I was wrong. Now I need to figure out why I thought I was right...
You may have gotten confused because log is monotonically increasing: e.g. the log likelihood is maximized at the same spot as the likelihood, so log E(X) is maximized at the same spot as E(X). But log and E do not commute (Jensen's inequality is not called Jensen's equality, after all).
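A two-point example (my own, not from the thread) showing the non-commutation concretely:

```python
import math

# X uniform on {1, 4}: compare log E(X) with E(log X)
xs = [1.0, 4.0]
e_x = sum(xs) / len(xs)                            # E(X) = 2.5
log_e_x = math.log(e_x)                            # log E(X) = log 2.5
e_log_x = sum(math.log(x) for x in xs) / len(xs)   # E(log X) = log 2

# Jensen's inequality for the concave log: E(log X) <= log E(X),
# strictly unless X is constant
```
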
That was probably part of it -- I think the internal cheering for the wrong position included the words "But log likelihood!" :-/
Sure. So, just to be clear, the situation is: We have real-valued random variable X depending on a single real-valued parameter t. And I claim it is possible (indeed, usual) that the choice of t that maximizes E(log X) is not the same as the choice of t that maximizes E(X).
My X will have two possible values for any given t, both with probability 1/2. They are t exp(t) and exp(-2t).
E(log X) = 1/2 (log(t exp(t)) + log(exp(-2t))) = 1/2 (log t + t - 2t) = 1/2 (log t - t). This is maximized at t = 1. (It's also undefined for t <= 0; I'll fix that in a moment.)
E(X) is obviously monotone increasing for large positive t, so it's "maximized at t=+oo". (It doesn't have an actual maximum; I'll fix that in a moment.)
OK, now let me fix those two parenthetical quibbles. I said X depends on t, but actually it turns out that t = 100.5 + 100 sin u, where u is an angle (i.e., varies mod 2pi). Now E(X) is maximized when sin u = 1, so for u = pi/2; and E(log X) is maximized when 100 sin u = -99.5, so for two values of u close to -pi/2. (Two local maxima, with equal values of E(log X).)
Okay, I accept that I'm wrong and you're right. Now the interesting part is that my mathematical intuition is not that great, but this is a pretty big fail even for it. So in between googling for crow recipes, I think I need to poke around my own mind and figure out which wrong turn it happily took... I suspect I got confused about the expectation operator, but to confirm I'll need to drag my math intuition into the interrogation room and start asking it pointed questions.
Upvoted for public admission of error :-).
(In the unlikely event that I can help with the brain-fixing, e.g. by supplying more counterexamples to things, let me know.)