Comment author: Kindly
21 February 2013 03:48:08PM
12 points

The scariest version of Pascal's mugging is the mugger-less one.

Very many hypotheses -- arguably infinitely many -- can be formed about how the world works. In particular, some of these hypotheses imply that by doing something counter-intuitive in accordance with them, you get ridiculously awesome outcomes. For example, even in advance of me posting this comment, you could form the hypothesis "if I send Kindly $5 by PayPal, he or she will refrain from torturing 3^^^3 people in the matrix and instead give them candy."

Now, usually all such hypotheses are low-probability, and that decreases the expected benefit from performing these counter-intuitive actions. But how can you show that in all cases this expected benefit is low enough to justify ignoring it?

Right, this is the real core of Pascal's Mugging (I was somewhat surprised that Bostrom didn't put it into his mainstream writeup). For aggregative utility functions over a model of the environment which e.g. treat all sentient beings (or all paperclips) as having equal value without diminishing marginal returns, and all epistemic models which induce simplicity-weighted explanations of sensory experience, all decisions will be dominated by tiny variances in the probability of extremely unlikely hypotheses, because the "model size" of a hypothesis can grow Busy-Beaver faster than its Kolmogorov complexity. (I've deleted a nearby comment from a known troll about how opposite hypotheses ought to cancel out. Unless this is a literal added axiom - and a false one at that, which will produce poorly calibrated probabilities - there is no reason for all the Busy-Beaver sized hypotheses to have consequences which cancel each other out exactly, down to the Busy-Beaverth decimal place, and continue to cancel regardless of what random bits of evidence and associated updates we run into throughout life. This presumes properties of both reality and the utility function which are very unlikely.)
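To make the "payoff outgrows the complexity penalty" point concrete, here is a toy sketch, not anything from the original argument. It assumes, purely for illustration, that a hypothesis with a k-bit description gets prior weight 2^-k and promises a payoff equal to a power tower of 2s of height k. The tower is a computable stand-in that grows vastly slower than Busy Beaver, and the names `tower` and `log2_expected_payoff` are made up here. Everything is done in log2 space so the numbers stay printable.

```python
def tower(height: int) -> int:
    """Power tower of 2s: tower(0) = 1, tower(h) = 2 ** tower(h - 1)."""
    value = 1
    for _ in range(height):
        value = 2 ** value
    return value


def log2_expected_payoff(k: int) -> int:
    """log2 of (prior * payoff) for a hypothetical k-bit hypothesis.

    Assumed prior:  2 ** -k      (simplicity weighting, an assumption)
    Assumed payoff: tower(k)     (a slow stand-in for Busy-Beaver growth)
    Since log2(tower(k)) == tower(k - 1), the result is tower(k - 1) - k.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return tower(k - 1) - k


if __name__ == "__main__":
    # Keep k small: tower(5) already has 65537 binary digits.
    for k in range(1, 6):
        print(f"k={k}: log2(prior * payoff) = {log2_expected_payoff(k)}")
    # Prints 0, 0, 1, 12, 65531 for k = 1..5.
```

Even with this comparatively tame payoff function, the linear penalty in k is irrelevant by k = 5, so the longest hypothesis under consideration contributes essentially all of the expected value.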

Note that this actually happens without allowing that the mugger has a finite probability of possessing a hypercomputer - it follows just from trying to assign non-zero probability to the mugger possessing any Turing machine. Note also that assigning probability literally zero means you will never believe the mugger can do something, regardless of what evidence they show you that they are Matrix Lords. Similarly, if we use Hanson's anthropic solution, we will never believe the mugger can put us into a vastly special position without log(vast) amounts of evidence.
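The zero-probability point is just Bayes' rule in odds form, and a minimal sketch shows it. The function name `posterior` and the specific numbers are illustrative assumptions; exact rational arithmetic is used so that a tiny prior is not silently rounded to zero.

```python
from fractions import Fraction


def posterior(prior: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Posterior P(H | E) from prior P(H) and the ratio P(E | H) / P(E | not H)."""
    if not 0 <= prior <= 1:
        raise ValueError("prior must lie in [0, 1]")
    if prior == 1:
        return Fraction(1)
    odds = prior / (1 - prior) * likelihood_ratio  # posterior odds for H
    return odds / (1 + odds)


if __name__ == "__main__":
    # A prior of exactly zero stays zero, however strong the evidence.
    print(posterior(Fraction(0), Fraction(10**100)))
    # A tiny but non-zero prior can be overcome by enough evidence.
    print(float(posterior(Fraction(1, 10**30), Fraction(10**40))))
```

A prior of exactly zero gives posterior odds of zero whatever the likelihood ratio, while a prior of 10^-30 is pushed close to 1 by a likelihood ratio of 10^40. The evidence needed scales with the log of how small the prior is, which has the same shape as the log(vast) requirement mentioned above.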

## Comments (21)

Ouch, this rules out game-theoretic solutions.