I'll keep this quick:
In general, the problem presented by the Mugging is this: as we examine the utility of a given act in each possible world we could be in, in order from most probable to least probable, the utilities can grow much faster than the probabilities shrink. Thus the standard maxim "maximize expected utility" seems impossible to carry out, since there is no such maximum: when we go down the list of hypotheses, multiplying the utility of the act on each hypothesis by the probability of that hypothesis, the running sum does not converge to anything.
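To make the divergence concrete, here's a toy sketch (the specific numbers are made up purely for illustration): if each successive hypothesis is half as probable as the last but promises three times as much utility, the partial sums of probability times utility grow without bound instead of converging.

```python
def partial_sums(n_terms=59):
    """Yield running sums of p_i * u_i, where p_i = 2**-i (probability
    halves with each hypothesis) and u_i = 3**i (utility triples)."""
    total = 0.0
    for i in range(1, n_terms + 1):
        total += (2.0 ** -i) * (3.0 ** i)  # each term equals (3/2)**i
        yield total

sums = list(partial_sums())
# The partial sums are strictly increasing and never settle down,
# so "maximize expected utility" has no answer here.
```

The same thing happens for any utility sequence that outpaces the prior, which is exactly the Mugging's structure.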
Here's an idea that may fix this:
For every possible world W of complexity N, there's another possible world of complexity N+c that's just like W, except that it has two parallel, identical universes instead of just one. (If it matters, suppose that they are connected by an extra dimension.) (If this isn't obvious, say so and I can explain.)
Moreover, there's another possible world of complexity N+c+1 that's just like W except that it has four such parallel identical universes.
And a world of complexity N+c+X that has R parallel identical universes, where R is the largest number that can be specified in X bits of information.
So, take any given extreme mugger hypothesis like "I'm a matrix lord who will kill 3^^^^3 people if you don't give me $5." Uncontroversially, the probability of this hypothesis will be something much smaller than the probability of the default hypothesis. Let's be conservative and say the ratio is 1 in a billion.
(Here's the part I'm not so confident in)
Translating that into hypotheses with complexity values, that means that the mugger hypothesis has about 30 more bits of information in it than the default hypothesis.
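The arithmetic behind the "about 30 bits" figure: under a Solomonoff-style prior where each extra bit of description length halves the prior probability, a probability ratio of 1 in a billion corresponds to log2 of a billion extra bits.

```python
import math

# Each extra bit of complexity halves the prior probability,
# so a 1-in-10**9 probability ratio costs log2(10**9) bits.
extra_bits = math.log2(10**9)
print(round(extra_bits, 1))  # 29.9
```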
So, assuming c is small (and actually I think this assumption can be done away with), there's another hypothesis, just as likely as the Mugger hypothesis: that you are in a world exactly like the one in the default hypothesis, except with R duplicate universes, where R is the largest number we can specify in 30 bits.
That number is very large indeed. (See the Busy Beaver function.) My guess is that it's going to be way way way larger than 3^^^^3. (It takes less than 30 bits to specify 3^^^^3, no?)
So this isn't exactly a formal solution yet, but it seems like it might be on to something. Perhaps our expected utility converges after all.
Thoughts?
(I'm very confused about all this which is why I'm posting it in the first place.)
This was helpful, thanks!
As I understand it, you are proposing modifying the example so that on hypotheses H1 through HN, choosing A gives you less utility than choosing B, but thereafter choosing A is better, because there is some cost you pay which is the same in each world.
It seems like the math tells us that any price would be worth it, that we should give up an unbounded amount of utility to choose A over B. I agree that this seems like the wrong answer. So I don't think whatever I'm proposing solves this problem.
But that's a different problem than the one I'm considering. (In the problem I'm considering, choosing A is better in every possible world.) Can you think of a way they might be parallel--any way that the "I give up" which I just said above applies to the problem I'm considering too?
The problem there, and the problem with Pascal's Mugging in general, is that outcomes with a tiny amount of probability dominate the decision. A could be massively worse than B 99.99999% of the time, and naive utility maximization could still say to pick A.
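To put toy numbers on that (assumed purely for illustration): an option that loses almost every time can still have the higher expected utility if the rare payoff is big enough.

```python
# Made-up numbers: the risky option is terrible 99.99999% of the time,
# but its one-in-ten-million jackpot dominates the expectation.
p_rare = 1e-7
eu_risky = (1 - p_rare) * (-100) + p_rare * 1e12
eu_safe = 0.0  # a neutral baseline option

# Naive expected-utility maximization picks the risky option anyway.
print(eu_risky > eu_safe)  # True
```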
One way to fix it is to bound utility. But that has its own problems.
The problem with your solution is that it's not complete in the formal sense: you can only say one thing is better than another if it strictly dominates it, and if neither strictly dominates the other, you can't say anything.