A mugger appears and says "For $5 I'll offer you a set of deals from which you can pick any one. Each deal, d(N), will be N bits in length and I guarantee that if you accept d(N) I will run UTM(d(N)) on my hypercomputer, where UTM() is a function implementing a Universal Turing Machine. If UTM(d(N)) halts you will increase your utility by the number of bits written to the tape by UTM(d(N)). If UTM(d(N)) does not halt, I'll just keep your $5. Which deal would you like to accept?"
The expected increase in utility of any deal is p(d(N)) * U(UTM(d(N))), where p(d(N)) is the probability that accepting d(N) actually yields as many utilons as the number of bits a halting UTM(d(N)) writes to its tape. A non-empty subset of the programs of length N will write BB(N) bits to the tape, where BB(X) is the busy-beaver function for programs of bit length X. Since BB(X) is at least the output of any halting X-bit program, and hence eventually dominates every computable function, for every finite agent there is some N at which p(UTM(d(N)) = BB(N)) * BB(N) exceeds the $5 cost of the deal. To paraphrase: even though the likelihood of being offered a deal that actually yields BB(N) utilons is incredibly small, the fact that BB(X) grows at least as fast as any computable function means that, at minimum, an agent that can be emulated on a UTM by a program of M bits cannot assign d(M) a non-zero probability under which the expected utility of accepting d(M) is negative. In practice N can probably be much less than M.
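To make the growth rate concrete, here's a small Python sketch. Caveat: the hard numbers below are Radó's Σ(n) values for n-state, 2-symbol Turing machines (plus one published lower bound for n = 6), not the bit-length convention used above, and the $5-to-utilons comparison is just the deal as stated; the only point is how quickly a busy-beaver-sized payoff outruns a 2^-N prior.

```python
# Known values of Rado's Sigma(n) for n-state, 2-symbol Turing machines,
# used here as a rough stand-in for the post's bit-length BB(N).
SIGMA = {1: 1, 2: 4, 3: 6, 4: 13, 5: 4098}

for n, bb in SIGMA.items():
    expected = 2.0 ** -n * bb   # uniform 2^-N chance of naming a BB champion
    print(f"N={n}: 2^-N * BB(N) = {expected:.4f}")
# Already at N=5 the expected payoff (~128 utilons) dwarfs the $5 cost.

# A published lower bound for Sigma(6) is 3.5 * 10^18267; even after the
# 2^-6 discount the expected payoff is astronomically large:
bb6_lower = 35 * 10**18266
print(f"N=6: 2^-6 * BB(6) is at least a {len(str(bb6_lower // 2**6))}-digit number")
```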
Since p("UTM(d(X)) = BB(X)") >= 2^-X for d(X) with bits selected at random it doesn't make sense for the agent to assign p(d(X))=0 unless the agent has other reasons to absolutely distrust the mugger. For instance, discounting the probability of a deal based on a function of the promised number of utilons won't work; no discounting function grows as fast as BB(X) and an agent can't compute an arbitrary UTM(d(X)) to get a probability estimate without hypercomputational abilities. Any marginal-utility calculation fails in a similar manner.
I'm not sure where to go from here. I don't think it's rational to spend the rest of my life trying to find the largest integer I can think of in order to acausally accept d(biggest-integer) from some Omega. So far the strongest counterargument I've found is to manage the risk of accepting the mugging by buying insurance of some sort. For example, a mugger offering intractably large amounts of utility for $5 shouldn't mind lending the agent $5 (or even $10,000) if the agent can immediately pay it back, with astronomical interest, out of the wealth that would almost certainly become available once the mugger fulfilled the deal. In short, it doesn't make sense to exchange utility now for utility in the future *unless* the mugger will accept what is essentially a counter-mugging that yields more long-term utility for the mugger at the cost of some short-term disutility. The mugger should have some non-zero probability p at which zhe is indifferent between a p chance of "have $10 after fulfilling the deal" and a (1-p) chance of "have $5 now". If the mugger acts as though p = 0 for this lottery, why can't the agent?
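The break-even point of the loan argument is simple arithmetic. A toy sketch, assuming the mugger values dollars linearly (the $5/$10 figures are just the illustrative numbers from above):

```python
repay_later = 10  # what the mugger collects after fulfilling the deal
keep_now = 5      # what the mugger gets by just pocketing the $5

# Indifference between a p chance of "have $10 after fulfilling the deal"
# and a (1-p) chance of "have $5 now": p * 10 = (1 - p) * 5
p = keep_now / (repay_later + keep_now)
print(f"mugger is indifferent at p = {p:.3f}")  # ~0.333

# A mugger who refuses the loan is acting as though p = 0 for this lottery,
# which is exactly the assignment the agent is told zhe may not make.
```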
The scariest version of Pascal's mugging is the mugger-less one.
Very many hypotheses -- arguably infinitely many -- can be formed about how the world works. In particular, some of these hypotheses imply that by doing something counter-intuitive in following them, you get ridiculously awesome outcomes. For example, even in advance of my posting this comment, you could have formed the hypothesis "if I send Kindly $5 by Paypal, he or she will refrain from torturing 3^^^3 people in the matrix and instead give them candy."
Now, usually all such hypotheses are low-probability and that decreases the expected benefit from performing these counter-intuitive actions. But how can you show that in all cases this expected benefit is sufficiently low to justify ignoring it?
Right, this is the real core of Pascal's Mugging (I was somewhat surprised that Bostrom didn't put it into his mainstream writeup). For aggregative utility functions over a model of the environment which e.g. treat all sentient beings (or all paperclips) as having equal value without diminishing marginal returns, and all epistemic models which induce simplicity-weighted explanations of sensory experience, all decisions will be dominated by tiny variances in the probability of extremely unlikely hypotheses because the "model size" of a hypothesis...
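The "model size" point can be illustrated crudely: under a simplicity-weighted prior (probability on the order of 2^-K for a hypothesis of description length K), the length of a hypothesis grows far more slowly than the utilities it can name. A sketch, using expression length as a rough stand-in for Kolmogorov complexity (the hypotheses and numbers are invented for illustration):

```python
import math

# (description, log10 of the number of people at stake)
hypotheses = [
    ("send $5, save 10^2 people", 2),
    ("send $5, save 10^100 people", 100),
    ("send $5, save 10^(10^100) people", 10**100),
]

for desc, log10_utility in hypotheses:
    k = len(desc)                     # crude stand-in for description length
    log10_prior = -k * math.log10(2)  # simplicity-weighted prior ~ 2^-k
    print(f"{desc!r}: log10(prior * utility) ~ {log10_prior + log10_utility:.3e}")

# Description length grows roughly linearly while the utility it names grows
# exponentially (or faster), so prior * utility is unbounded over hypotheses.
```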