kokotajlod comments on New Pascal's Mugging idea for potential solution - Less Wrong Discussion
Update: The conclusion of that article is that the expected utilities don't converge for any utility function that is bounded below by a computable, unbounded utility function. That might not actually be in conflict with the idea I'm grasping at here.
The idea I'm trying to get at here is that even if EU doesn't converge in the sense of assigning a definite finite value to each action, it might nevertheless rank each action as better or worse than the others, by a certain proportion.
Toy model:
The only hypotheses you consider are H1, H2, H3, ... etc. You assign probability 0.5 to H1, and each H(N+1) has half the probability of the previous hypothesis, H(N).
There are only two possible actions: A or B. H1 says that A gives you 2 utils and B gives you 1. Each H(N+1) says that A gives you 10 times as many utils as it did under the previous hypothesis, H(N), and moreover that B gives you 5 times as many utils as it did under the previous hypothesis, H(N).
In this toy model, expected utilities do not converge, but rather diverge to infinity, for both A and B.
Yet clearly A is better than B...
I suppose one could argue that the expected utility of both A and B is infinite and thus that we don't have a good reason to prefer A to B. But that seems like a problem with our ability to handle infinity, rather than a problem with our utility function or hypothesis space.
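As a sanity check on the toy model, here is a short sketch (my own code, with illustrative truncation points) that computes the expected utilities over just the first N hypotheses. Both partial sums grow without bound, but A's exceeds B's at every truncation, and the ratio EU(A)/EU(B) itself diverges:

```python
from fractions import Fraction

def partial_eu(first_payoff, growth, N):
    """Expected utility summed over hypotheses H1..HN.

    P(Hn) = (1/2)**n and the payoff under Hn is first_payoff * growth**(n-1),
    matching the toy model above. Fractions keep the arithmetic exact.
    """
    return sum(Fraction(1, 2)**n * first_payoff * growth**(n - 1)
               for n in range(1, N + 1))

for N in (1, 5, 20):
    eu_a = partial_eu(2, 10, N)   # A: 2 utils under H1, x10 per hypothesis
    eu_b = partial_eu(1, 5, N)    # B: 1 util under H1, x5 per hypothesis
    print(N, float(eu_a), float(eu_b), float(eu_a / eu_b))
```

At every cutoff the truncated expectation of A is strictly larger than that of B, which is one way to cash out "A is better than B" even though neither expectation converges.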
In your example, how much should you spend to choose A over B? Would you give up an unbounded amount of utility to do so?
This was helpful, thanks!
As I understand it, you are proposing modifying the example so that on hypotheses H1 through HN, choosing A gives you less utility than choosing B (because of the fixed cost you pay in every world), but thereafter choosing A is better.
It seems like the math tells us that any price would be worth it, that we should give up an unbounded amount of utility to choose A over B. I agree that this seems like the wrong answer. So I don't think whatever I'm proposing solves this problem.
But that's a different problem than the one I'm considering. (In the problem I'm considering, choosing A is better in every possible world.) Can you think of a way they might be parallel--any way that the "I give up" which I just said above applies to the problem I'm considering too?
The problem there, and the problem with Pascal's Mugging in general, is that outcomes with a tiny amount of probability dominate the decisions. A could be massively worse than B 99.99999% of the time, and still naive utility maximization says to pick B.
One way to fix it is to bound utility. But that has its own problems.
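To make that point concrete, here is a minimal sketch (the numbers are invented for illustration, not from the thread): B is worse than A in all but one case in ten million, yet the astronomically large payoff in that one case makes naive expected-utility maximization favor B.

```python
p = 1e-7                  # probability of the mugger's far-fetched scenario
eu_a = 100.0              # A: a sure, modest 100 utils

# B: lose 1000 utils with probability 1 - p, win 10**12 utils with probability p
eu_b = (1 - p) * (-1000.0) + p * 1e12

# Naive maximization picks B, even though B loses 99.99999% of the time.
print(eu_a, eu_b)
```

The tiny probability contributes 1e-7 * 1e12 = 100,000 utils in expectation, swamping the near-certain loss.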
The problem with your solution is that it's not complete in the formal sense: you can only say some things are better than other things if they strictly dominate them, but if neither strictly dominates the other you can't say anything.
I would also claim that your solution doesn't satisfy framing invariants that all decision theories should arguably follow. For example, what about changing the order of the terms? Let us reframe each utility as its contribution to expected utility (utility times probability), so we can move terms around without changing the totals. E.g. if I write utility 5, p:.01, that really means you're getting utility 500 in that scenario, so it adds 5 total in expectation. Now, consider the following utilities:
1<2 p:.5
2<3 p:.5^2
3<4 p:.5^3
...
n<n+1 p:.5^n
...
So if you're faced with choosing between something that gives you the left side or the right side, choose the right side.
But clearly re-arranging terms doesn't change the expected utility, since that's just the sum of all terms. So the above is equivalent to:
1>0 p:.5
2>0 p:.5^2
3>2 p:.5^3
4>3 p:.5^4
...
n>n-1 p:.5^n
...
So your solution is inconsistent if it satisfies the invariant of "moving around expected utility between outcomes doesn't change the best choice".
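The shift can be checked mechanically. Below is a small sketch (a finite truncation, in my own framing) where each listed number is already an expected-utility contribution: moving the right-hand column down two rows preserves its terms, and hence the total, while flipping every row-by-row comparison.

```python
N = 30
left = list(range(1, N + 1))               # row n: A contributes n in expectation
right = [n + 1 for n in range(1, N + 1)]   # row n: B contributes n + 1

# Move the right-hand column down two rows; the vacated top rows contribute 0.
shifted = [0, 0] + right[:-2]

# Exactly the same terms appear, just in different rows.
assert sorted(shifted)[2:] == sorted(right[:-2])

# Before the shift, B beats A in every row...
assert all(a < b for a, b in zip(left, right))

# ...after the shift, A beats B in every row.
assert all(a > b for a, b in zip(left, shifted))
```

So a row-by-row dominance criterion gives opposite verdicts on two presentations of the same terms, which is the inconsistency being claimed.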
Again, thanks for this.
"The problem with your solution is that it's not complete in the formal sense: you can only say some things are better than other things if they strictly dominate them, but if neither strictly dominates the other you can't say anything."
As I said earlier, my solution is an argument that in every case there will be an action that strictly dominates all the others. (Or, weaker: that within the set of the N most probable hypotheses, for any finite N, one action will strictly dominate all the others, and that this action will be the same action that is optimal under the most probable hypothesis.) I don't know if my argument is sound yet, but if it is, it avoids your objection, no?
I'd love to understand what you said about re-arranging terms, but I don't. Can you explain in more detail how you get from the first set of hypotheses/choices (which I understand) to the second?
I just moved the right-hand column down by two rows. The sum stays the same, but the term-by-term inequality flips.
Why would you think that? I don't really see where you argued for that, could you point me at the part of your comments that said that?