
Tetronian comments on Pascal's Mugging as an epistemic problem - Less Wrong Discussion

3 [deleted] 04 October 2010 05:52PM




Comment author: [deleted] 04 October 2010 07:38:44PM *  0 points [-]

In the dialog you give, Pascal assigns a probability that the mugger will fulfill his promise without hearing what that promise is, then fails to update it when the promise is revealed. But after hearing the number "1000 quadrillion", Pascal would then be justified in updating his probability to something less than 1 in 1000 quadrillion.

I think this might be it, but I'm not sure. Here is the key piece of the puzzle:

Mugger: Wow, you are pretty confident in your own ability to tell a liar from an honest man! But no matter. Let me also ask you, what’s your probability that I not only have magic powers but that I will also use them to deliver on any promise – however extravagantly generous it may seem – that I might make to you tonight?

Pascal: Well, if you really were an Operator from the Seventh Dimension as you assert, then I suppose it’s not such a stretch to suppose that you might also be right in this additional claim. So, I’d say one in 10 quadrillion.

This is why Pascal doesn't update based on the number 1000 quadrillion: he has already stated his probability for "mugger has magic powers" x "given that the mugger has magic, he will deliver on any promise", and the reciprocal of that probability (10 quadrillion) is less than the utility the mugger claims he will deliver (1000 quadrillion). So I suppose we could say that he isn't justified in refusing to update, but I don't know how well-supported that claim would be.

Other known defenses against Pascal's mugging are bounded utility functions, and rounding probabilities below some noise floor to zero. Another strategy that might be less likely to carry adverse side effects would be to combine a sub-linear utility function with a prior that assigns statements involving a number N probability at most 1/N (and the Occamian prior does indeed do this).

Yes, those would definitely work, since Bostrom is careful to exclude them in his essay. What I was looking for was more of a fully general solution.
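The defenses quoted above can be sketched numerically. This is a toy illustration with made-up numbers (the 1e-17 prior, the utility cap, and the noise floor are all hypothetical choices, not anything from Bostrom's essay):

```python
import math

# Toy comparison of the defenses: does the mugger's expected payoff
# p * U(n) grow without bound as the promised amount n grows?

PRIOR = 1e-17  # hypothetical fixed probability that the mugger delivers

def naive(n):
    # Fixed prior, linear utility: expected payoff grows with n -> muggable.
    return PRIOR * n

def bounded_utility(n, cap=1e6):
    # Bounded utility function: U(n) never exceeds the cap,
    # so expected payoff is at most PRIOR * cap.
    return PRIOR * min(n, cap)

def noise_floor(n, floor=1e-10):
    # Probabilities below the noise floor are rounded down to zero.
    p = PRIOR if PRIOR >= floor else 0.0
    return p * n

def sublinear_with_occam_prior(n):
    # Sub-linear utility sqrt(n) with a prior of at most 1/n for a claim
    # involving the number n: expected payoff sqrt(n)/n -> 0 as n grows.
    return (1.0 / n) * math.sqrt(n)

for n in (1e6, 1e12, 1e18):
    print(naive(n), bounded_utility(n), noise_floor(n),
          sublinear_with_occam_prior(n))
```

Only the naive column grows with n; the other three stay bounded (here the noise floor zeroes the payoff outright, because the 1e-17 prior falls below the 1e-10 floor).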

Comment author: jimrandomh 04 October 2010 09:13:27PM 0 points [-]

Suppose you say that the probability that the mugger has magical powers, and will deliver on any promise he makes, is 1 in 10^30. But then, instead of promising you quadrillions of days of extra life, the mugger promises to do an easy card trick. What's your estimate of the probability that he'll deliver? (It should be much closer to 0.8 than to 10^-30).

That's because the statement "the mugger will deliver on any promise he makes" carries with it an implied probability distribution over possible promises. If he promises to do a card trick, the probability that he delivers on it is very high; if he promises to deliver quadrillions of years of life, it's very low. When you made your initial probability estimate, you didn't know which promise he was going to make. After he reveals the details, you have new information, so you have to update your probability. And if that new information includes an astronomically large number, then your new probability estimate ought to be infinitesimally small in a way that cancels out that astronomically large number.
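That updating step can be sketched with toy numbers. The 1/n^2 decay below is a hypothetical choice, picked only so that the posterior falls fast enough to cancel the promised payoff:

```python
# Before the promise is revealed, "delivers on any promise" gets one number;
# conditional on a promise of size n, the probability must be updated.

def p_deliver(n):
    # Hypothetical posterior: starts at 0.8 for a trivial promise (a card
    # trick, n = 1) and decays roughly like 1/n^2 for extravagant ones.
    return 0.8 / (1.0 + (n - 1) ** 2)

def expected_payoff(n):
    return n * p_deliver(n)

print(p_deliver(1))              # 0.8: the card trick stays likely
print(p_deliver(10**18))         # astronomically small
print(expected_payoff(10**18))   # the huge payoff is cancelled out
```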

Comment author: Will_Newsome 04 October 2010 09:50:50PM 4 points [-]

And if that new information includes an astronomically large number, then your new probability estimate ought to be infinitesimally small in a way that cancels out that astronomically large number.

Er, can you prove that? It doesn't seem at all obvious to me that magic power improbability and magic power utility are directly proportional. Any given computation's optimization power isn't bounded in one to one correspondence by its Kolmogorov complexity as far as I can see, because that computation can still reach into other computations and flip sign bits that cause extremely widespread effects without being very complex itself. If you think there's even a small chance that you're in a computation susceptible to intervention by probable but powerful computations like that, then it's not obvious that the improbability and potential utility cancel out.

Comment author: Will_Newsome 06 October 2010 01:43:08AM 0 points [-]

Goddammit Less Wrong the above is a brilliant counterargument and no one realizes it. I hate all of you.

Comment author: jimrandomh 06 October 2010 03:05:18AM *  2 points [-]

Sorry for not responding earlier; I had to think about this a bit. Whether the presence of astronomically large numbers can make you vulnerable to Pascal's Mugging seems to be a property of the interaction between the method you use to assign probabilities from evidence, and your utility function. Call the probability-assignment method P(X), which takes a statement X and returns a probability; and the utility function U(X), which assigns a utility to something (such as the decision to pay the mugger) based on the assumption that X is true.

P and U are vulnerable to Pascal's Mugging if and only if you can construct sets of evidence X(n), which differ only by a single number n, such that for any utility value u, there exists n such that P(X(n))U(X(n)) > u.

Now, I really don't know of any reason apart from Pascal's Mugging why utility function-predictor pairs should have this property. But being vulnerable to Pascal's Mugging is such a serious flaw, I'm tempted to say that it's just a necessary requirement for mental stability, so any utility function and predictor which don't guarantee this when they're combined should be considered incompatible.

Comment author: [deleted] 06 October 2010 06:10:36AM 0 points [-]

But being vulnerable to Pascal's Mugging is such a serious flaw, I'm tempted to say that it's just a necessary requirement for mental stability, so any utility function and predictor which don't guarantee this when they're combined should be considered incompatible.

Is the wording of this correct? Did you mean to say that vulnerability to Pascal's mugging is a necessary requirement for mental stability or the opposite?

Comment author: jimrandomh 06 October 2010 01:59:56PM 1 point [-]

No, I meant to say that immunity to Pascal's mugging is required.

Comment author: Will_Newsome 06 October 2010 05:12:27AM 0 points [-]

I'm interpreting your stance as "the probability that your hypothesis matches the evidence is bounded by the utility it would give you if your hypothesis matched the evidence." Reductio ad absurdum: I am conscious. Tautologically true. Being conscious is to me worth a ton of utility. I should therefore disbelieve a tautology.

Comment author: Will_Newsome 06 October 2010 04:54:33AM *  0 points [-]

u is the integer returned by U for an input X? Just wanted to make sure; I'm crafting my response.

Edit: actually, I have no idea what determines u here, 'cuz if u is the int returned by U then your inequality is tautological. No?

Comment author: jimrandomh 06 October 2010 02:11:39PM *  1 point [-]

Hmm, apparently that wasn't as clearly expressed as I thought. Let's try that again. I said that a predictor P and utility function U are vulnerable to Pascal's mugging if

exists function X of type number => evidence-set
such that X(a) differs from X(b) only in that one number appearing literally, and
forall u exists n such that P(X(n))U(X(n)) > u

The last line is the delta-epsilon definition for limits diverging to infinity. It could be equivalently written as

lim[n->inf] P(X(n))U(X(n)) = inf

If that limit diverges to infinity, then you could scale the probability down arbitrarily far and the mugger will just give you a bigger n. But if it doesn't diverge that way, then there's a maximum amount of expected utility the mugger can offer you just by increasing n, and the only way to get around it would be to offer more evidence that wasn't in X(n).

(Note that while the limit can't diverge to infinity, it is not required to converge. For example, the Pebblesorter utility function, U(n pebbles) = if(isprime(n)) 1 else 0, does not converge when combined with the null predictor P(X)=0.5.)

(The reductio you gave in the other reply does not apply, because the high-utility statement you gave is not parameterized, so it can't diverge.)
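The criterion can be checked numerically. Both predictor/utility pairs below are hypothetical, chosen only to illustrate the two cases:

```python
def muggable_product(n):
    # Fixed probability 1e-30 with linear utility n: P(X(n)) * U(X(n))
    # grows without bound, so the mugger can always pick a bigger n.
    return 1e-30 * n

def safe_product(n):
    # Occam-style prior of at most 1/n with linear utility n: the product
    # never exceeds 1, so there is a ceiling on what the mugger can offer.
    return (1.0 / n) * n

def find_winning_n(product, u, n_max=10**40):
    # Crude search: can the mugger find some n with P(X(n)) * U(X(n)) > u?
    n = 1
    while n <= n_max:
        if product(n) > u:
            return n
        n *= 10
    return None

print(find_winning_n(muggable_product, u=1000.0))  # some huge n succeeds
print(find_winning_n(safe_product, u=1000.0))      # None: the limit is finite
```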

Comment author: [deleted] 04 October 2010 09:49:03PM *  0 points [-]

That's because the statement "the mugger will deliver on any promise he makes" carries with it an implied probability distribution over possible promises.

Agreed, but that's not the whole picture. Let's break this down a slightly different way: we know that p(mugger has magic) is a very small number, and as you point out p(mugger will deliver on any promise) is a distribution, not a number. But we aren't just dealing with p(mugger will deliver on any promise); we are dealing with the conditional probability p(mugger will deliver on any promise|mugger has magic) times p(mugger has magic). Though this might be a distribution depending on what exactly the mugger is promising, it is still different from p(mugger will deliver on any promise), and it might still allow for a Pascal's Mugging.

This is why the card trick example doesn't work: p(mugger performs card trick) is indeed very high, but what we are really dealing with is p(mugger performs card trick|mugger has magic) times p(mugger has magic), so our probability that he does a card trick using actual magic would be extremely low.
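With toy numbers (all hypothetical), that decomposition looks like:

```python
# p(card trick) is high, but p(card trick *by magic*) must route through
# p(mugger has magic), which is tiny.

p_magic = 1e-30              # hypothetical p(mugger has magic)
p_trick_given_magic = 0.99   # p(delivers the trick | has magic)
p_trick_mundane = 0.8        # p(delivers the trick by ordinary sleight of hand)

p_trick_by_magic = p_trick_given_magic * p_magic
print(p_trick_by_magic)   # ~1e-30: extremely low, as argued above
print(p_trick_mundane)    # 0.8: the unconditional estimate stays high
```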