From https://www.gwern.net/mugging:

One way to try to escape a mugging is to unilaterally declare that all probabilities below a certain small probability will be treated as zero. With the right pair of lower limit and mugger’s credibility, the mugging will not take place. But such an ad hoc method violates common axioms of probability theory, and thus we can expect there to be repercussions.

It turns out to be easy to turn such a person into a money pump, if not by the mugging itself. Suppose your friend adopts this position, and he says specifically that any probabilities less than or equal to 1/20 are 0. You then suggest a game; the two of you will roll a d20 die, and if the die turns up 1-19, you will pay him one penny and if the die turns up 20, he pays you one dollar - no, one bazillion dollars. Your friend then calculates: there is a 19/20 chance that he will win a little money, and there is a 1/20 chance he will lose a lot of money - but wait, 1/20 is too small to matter! It is zero chance, by his rule. So, there is no way he can lose money on this game and he can only make money. 

He is of course wrong, and on the 5th roll you walk away with everything he owns. (And you can do this as many times as you like.)
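
A minimal sketch of the arithmetic in the quoted game, assuming a fair d20 and taking "one bazillion" to be $10^12 purely for illustration:

```python
# Expected value of one round of the d20 game, from the friend's perspective.
p_win, p_lose = 19 / 20, 1 / 20
win, lose = 0.01, 1e12        # he wins a penny, or pays the "bazillion" (assumed $1e12)

true_ev = p_win * win - p_lose * lose
his_ev = p_win * win - 0 * lose   # his rule: any probability <= 1/20 is treated as 0

print(f"true EV per round:     ${true_ev:,.2f}")   # hugely negative
print(f"EV under his rounding: ${his_ev:.4f}")     # looks like free money
```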

Of course, that's not how a sane street-rational person would think! They would not play for "one bazillion dollars" no matter the odds. In general, detecting a sufficiently intelligent adversarial entity tends to result in avoiding the interaction altogether (if you are inured enough to Nigerian princes offering billions in an email). And yet I cannot find any LW discussion on when and if to engage and when to not engage, except in an occasional comment.

gwern

The usual response to this "haha I would just refuse to bet, your Dutch book arguments are powerless over me" is that bets can be labeled as actions or inactions at will, somewhat like how we might model all of reality as the player 'Nature' in a decision tree/game, and so you can no more "refuse to bet" than you can "refuse to let time pass". One can just rewrite the scenario to make your default action equivalent to accepting the bet; then what? 'Refusing to bet' is a vacuous response.

In that specific example, I used the setup of bets because it's easy to explain, but it is isomorphic to many possible scenarios: for example, perhaps it is actually about hurricanes and 'buying homeowner's insurance'. "Hurricanes happen every 20 years on average; if they happen, they will obliterate your low-lying house which currently has no insurance; you can choose between paying for homeowner's insurance at a small cost every year and being paid for the possible loss of your house, or you can enjoy the premium each year but should there be a hurricane you will lose everything. You buy homeowner's insurance, but your friend reasons that since all probabilities <= 1/20 equal 0 (his rule for avoiding muggings), the insurance is worthless, so he doesn't get insurance for his house. 5 years later, Hurricane Sandy hits..." You cannot 'refuse to bet', you can only either get the insurance or not, and the default in this has been changed to 'not', defeating the fighting-the-hypothetical.
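
A minimal sketch of the arithmetic in that reframing, with made-up numbers (the 1-in-20 annual hurricane chance is from the scenario; the $300k house and $5k premium are my assumptions for illustration):

```python
# Hedged sketch: annual expected value of insuring vs. not insuring, and how the
# friend's "probabilities <= 1/20 are 0" rule flips the comparison.
p_hurricane = 1 / 20          # stated annual hurricane probability
house_value = 300_000         # assumed value of the house
premium = 5_000               # assumed yearly insurance premium

ev_insured = -premium                         # pay the premium; losses are covered
ev_uninsured = -p_hurricane * house_value     # keep the premium; bear the expected loss
ev_uninsured_his_rule = -0 * house_value      # his rule rounds the 1/20 risk down to 0

print(ev_insured, ev_uninsured, ev_uninsured_his_rule)  # -5000, -15000.0, 0
```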

(Which scenario, for someone really determined to fight the hypothetical and wiggle out of defending the 1/20 = 0 part which we are trying to discuss, may just trigger more evasions: "insurance is by definition -EV since you get back less than you pay in due to overhead and profits! so it's rational to not want it" and then you can adjust the hypothetical payoff - "it's heavily subsidized by the federal government, because, uh, public choice theory reasons" - and then he'll switch to "ah, but I could just invest the saved premiums in the stock market, did you think about opportunity cost, smartypants?", and so on and so forth.)

And similarly, in real life, people cannot simply say "I refuse to have an opinion about whether hurricanes are real or how often they happen! What even is 'real', anyway? How can we talk about the probability of hurricanes given the interminable debate between long-run frequency and subjective interpretations of 'probability'..." That's all well and good, but the hurricane is still going to come along and demolish your house at some point, and you either will or will not have insurance when that happens. Either option implies a 'bet' on hurricanes which may or may not be coherent with your other beliefs and information, and if you are incoherent and so your choice of insurance depends on whether the insurance agent happened to frame the hurricane risk as being 1-in-20-years or 1-in-2-decades, probably that is not a good thing and probably the coherent bettors are going to do better.

I would go further and note that I think this describes x-risks in general. The ordinary person might 'refuse to bet' about anything involving one bazillion dollars, and whine about being asked about such things or being expected to play at some odds for one bazillion dollars - well, that's just too f—king bad, bucko, you don't get to refuse to bet. You were born at the roulette table with the ball already spinning around the wheel and the dealer sitting with stacks of bazillion dollar chips. Nuclear weapons exist. Asteroids exist. Global pandemics exist. AGI is going to exist. You don't get to pretend they neither do nor can exist and you can just ignore the 'bets' of investing or not investing in x-risk related things. You are required to take actions, or choose to not take actions: "red or black, monsieur?" Maybe you should not invest in them or worry about them, but that is, in fact, a choice.

curious why this is down-voted—any ideas?

lc
I downvoted it because there's obviously a real art of "disabling interfaces from which others may be able to exploit you" and that's what OP is gesturing at. The answerer is unhelpfully dismissing the question in a way that I think is incorrect.
gwern
And I, of course, disagree with that, because I think the adversarial/game-theory framing is deeply unhelpful, because it is both a different problem and a much more trivial boring problem than the real problem; and in fact, is exactly the sort of clever excuse people use to not actually deal with any of the real issues while declaring victory over Pascal's mugging, and I rephrased it in a way where adversarial dynamics were obviously irrelevant to try to draw out that, if you think it's just about 'disabling interfaces others may exploit you with', you have missed the mark. The hurricane does not, and cannot, care what cute stories you tell about "I have to precommit to ignore such low absolute-magnitude possibilities for ~game-theoretic raisins~". Your house is going to be flooded and destroyed, or it won't be; do you buy the insurance, or not?
lc
You can obviously modify these problems to be born from some obviously natural feature of the environment, so that it's unlikely for your map to be the result of an adversarial opponent looking for holes in your risk assessment algorithm, at which point refusing to buy insurance out of fear of being exploited is stupid. Alas, OP is talking about a different class of hypotheticals, so he requires a different answer. The correct response in the case quoted by OP, and the one he is alluding to, is that, given that you're a human with faulty probabilistic reasoning faculties, you should rationally refuse weird trades where Dark Rationalists are likely to money-pump you. As a proof that you are a nonideal agent, Dutch book arguments are fine, but that's as far as it goes, and sapient individuals have ways of getting around being a nonideal agent in adversarial environments without losing all their money. I find those means interesting and non-"trivial" even if you don't, and apparently so does the OP.
gwern
https://www.lesswrong.com/posts/6XsZi9aFWdds8bWsy/is-there-any-discussion-on-avoiding-being-dutch-booked-or?commentId=h2ggSzhBLdPLKPsyG
lc
No, I explained why it was stupid, it's right there in the post, and you pretending you don't see it is getting on my nerves. I said: In other words, it's stupid because a naturally produced hurricane or rock falling down a cliff does not have a brain, and is unlikely to be manipulating you into doing something net negative, and so you should just reason naturally. Humans have brains and goals in conflict with yours, so when humans come up to you after hearing about your decision-making algorithm asking to play weird bets, you may rationally ignore those offers on the principle that you don't want to be tricked somehow. You know this, I know you know I know you know this, I think you're just being profoundly silly at this point.

lc

The information security term is "limiting your attack surface". In circumstances where you expect other bots to be friendly, you might be more open to unusual or strange inputs and compacts that are harder to check for exploits but seem net positive on the surface. In circumstances where you expect bots to be less friendly, you might limit your dealings to very simple, popular, and transparently safe interactions, and reject some potential deals that appear net-positive but are harder to verify. In picking a stance you have to make a tradeoff between being able to capitalize on actually good but ~novel/hard-to-model/dangerous trades and interactions, vs. being open to exploits, and the human brain has a simple (though obviously not perfect) model for assessing the circumstances to see which stance is appropriate.

I think part of why we are so resistant to accepting the validity of Pascal's muggings is that people see it as inappropriate to be so open to such a novel trade with complete strangers, cultists, or ideologues (labeled the 'mugger') who might not have our best interests in mind. But this doesn't have anything to do with low probability extremely negative events being "ignorable". If you change the scenario so that the 'mugger' is instead just a force of nature, unlikely to have landed on a glitch in your risk assessment cognition by chance, then it becomes a lot more ambiguous what you should actually do. Other people here seem to take the lesson of Pascal's mugging as a reason against hedging against large negatives in general, to their own peril, which doesn't seem correct to me.

Richard_Kennaway

If you apply the Solomonoff prior, amounts of money offered grow far faster than their probabilities decrease, because there are small programs that compute gigantic numbers. So a stipulation that the probabilities must decrease faster has that obstacle to face, and is wishful thinking anyway.
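
A rough illustration of that point (the up-arrow function below is a standard definition, not anything from this answer): the program that pins down a number like 3^^^^3 is only a few lines long, so its description-length penalty under a Solomonoff-style prior grows nowhere near as fast as the number itself.

```python
def up(a, n, b):
    """Knuth's up-arrow: a ↑^n b. Only tiny inputs are actually computable."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up(a, n - 1, up(a, n, b - 1))

print(up(3, 1, 3))   # 27
print(up(3, 2, 3))   # 3^(3^3) = 7625597484987
# up(3, 3, 3) and up(3, 4, 3) (i.e. 3^^^^3) are already far too large to evaluate,
# yet the code describing them is no longer than the code above.
```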

Perhaps a better response is to consider it from the viewpoint of some sort of timeless decision theory, together with game theory. If I am willing to pay the mugger, that means I have a policy of paying the mugger. If this is known to others, it leaves me open to anyone who has no gigantic amount to offer making the same offer and walking off with all my money. This is a losing strategy, therefore it is wrong.

There must be a mathematical formulation of this.

If you apply the Solomonoff prior, amounts of money offered grow far faster than their probabilities decrease, because there are small programs that compute gigantic numbers. So a stipulation that the probabilities must decrease faster has that obstacle to face, and is wishful thinking anyway.

Is there a thing called "adversarial prior"?

Richard_Kennaway
Maybe there should be. I have an intuition that if the game theory is done right, the Solomonoff argument is neutralised. Who you will face in a game depends on your strategy for playing it. A mugger-payer will face false muggers. More generally, the world you encounter depends on your strategy for interacting with it. This is not just because your strategy determines what parts you look at, but also because the strategies of the agents you meet depend on yours, and yours on theirs, causally and acausally. The Solomonoff prior describes a pure observer who cannot act upon the world. But this is just a vague gesture towards where a theory might be found.
Yitz
There absolutely should be if there isn't already. Would love to work with an actual mathematician on this....

eva_

You can dodge it by having a bounded utility function, or, if you're utilitarian and good, a function that is at most linear in anthropic experience.

If the mugger says "give me your wallet and I'll cause you 3^^^^3 units of personal happiness" you can argue that's impossible because your personal happiness doesn't go that high.

If the mugger says "give me your wallet and I'll cause 1 unit of happiness to 3^^^^3 people who you altruistically care about" you can say that, in the possible world where he's telling the truth, there are 3^^^^3 + 1 people only one of which gets the offer and the others get the payout, so on priors it's at least 1/3^^^^3 against for you to experience recieving an offer, and you should consider it proportionally unlikely.

I don't think people realise how much astronomically more likely it is to truthfully be told "God created this paradise for you and your enormous circle of friends to reward an alien for giving him his wallet with zero valid reasoning whatsoever" than to be truthfully asked by that same Deity for your stuff in exchange for the distant unobservable happiness of countless strangers.

More generally, you can avoid most flavours of adversarial muggings with 2 rules: first, don't make any trade that an ideal agent wouldn't make (because that's always some kind of money pump), and second, don't make any trade that looks dumb. Not making trades can cost you in terms of missed opportunities, but you can't adversarially exploit the trading strategy of a rock with "no deal" written on it.

lc

don't make any trade that looks dumb

Ah, well, there you go then.

I don't think people realise how much astronomically more likely it is to truthfully be told "God created this paradise for you and your enormous circle of friends to reward an alien for giving him his wallet with zero valid reasoning whatsoever" than to be truthfully asked by that same Deity for your stuff in exchange for the distant unobservable happiness of countless strangers.

Why is this? I'm not immediately seeing why this is necessarily the case.

eva_
You're far more likely to be a background character than the protagonist in any given story, so a theory claiming you're the most important person in a universe with an enormous number of people has an enormous rareness penalty to overcome before you should believe it instead of concluding that you're just insane or being lied to. Being in a utilitarian high-leverage position for the lives of billions can be overcome by reasonable evidence, but for the lives of 3^^^^3 people the rareness penalty is basically impossible to overcome. Even if the story is true, most of the observers will be witnessing it from the position of tied-to-the-track, not holding the lever, so if you'd assign a low prior expectation to being in the tied-to-the-track part of the story, you should assign an enormously lower one to being in the decision-making part of it.
lc
Sounds like you're trying to argue from the anthropic principle that very important games are unlikely, but that's some really fallacious reasoning that asserts a lot of things about what your utility function is like. "Protagonist" is a two-piece word. A very pain averse and unempathetic person might reasonably subjectively consider themselves the most important person in the universe, and assign negative ${a lot} points to them getting tortured to death, but that doesn't mean they're not getting tortured to death.
Mitchell_Porter
The a priori unlikelihood of finding oneself at the crux of history (or in a similarly rare situation) is a greatly underrated topic here, I suppose because it works corrosively against making any kind of special effort. If they had embraced a pseudo-anthropic expectation of personal mediocrity, the great achievers of history would presumably have gotten nowhere. And yet the world is also full of people who tried and failed, or who hold a mistaken idea of their own significance; something which is consistent with the rarity of great achievements. I'm not sure what the "rational" approach here might be.

TekhneMakre

I feel like my real rejection is less about it being huge-number-(H)-unlikely to get H utilons from a random person. The Solomonoff argument seems to hold up: there are many H such that the description of H plus the code for a person who goes around granting utilons is a lot shorter than H is big.

My rejection is just... IDK how to affect that. I have literally no good reason to think that paying the mugger affects whether I get H utilons, and I make my decisions based on how they affect outcomes, not based on "this one possible consequence of this one action would be Huge". I think this strongly argues that one should spend one's time figuring out how to affect whether one gets Huge utilons, but that just seems correct? 

Maybe there's also a time-value argument here? Like, I have to keep my $5 for now, because IDK how to affect getting H utilons, but I expect that in the future I'll be better at affecting getting H utilons, and therefore I should hang on to my resources so I'll have opportunities to affect H utilons later. 

If I do have good reason to expect that paying $5 gets me the H utilons more than not paying, it's not a mugging, it's a good trade. For humans, simply saying "If you do X for me I'll do Y for you" is evidence of that statement being the case... but not if Y is Yuge. It doesn't generalize like that (waves hands, but this feels right). 

10 comments

I continue to not understand why Pascal's Mugging seems like a compelling argument. The more money the mugger offers, the less likely I think he is to deliver the goods. If I met a real-world Pascal's Mugger on the street, there is no amount of money he could offer me that would make me think it was a positive expected value deal.

A Pascal's Mugger accosts you on the street.

Mugger: "Please give me $1. If I promise in exchange to bring you $2 tomorrow, how likely do you think it is that I'll follow through?"

You: "30%. The expected value is -$0.40, so no."

Mugger: "What if I promise to bring you $3 tomorrow? How likely do you think it is that I'll follow through then?"

You: "20%. The expected value is still -$0.40, so no."

Mugger: "What if I promise to bring $4?"

You: "Let's cut to the chase. I think the probability of you bringing me D dollars is 0.6/D, and so the expected value is always going to be -$0.40. I'm never giving you my dollar."

Mugger: "Phooey." [walks away to accost somebody else]

That would be a convenient resolution to the Mugging, but seems unlikely to in fact be true?  By the time you get up to numbers around $1 million, the probability of you being paid is very low, but most of it is in situations like 'Elon Musk is playing a prank on me,' and in many of these situations you could also get paid $2 million.  

It seems likely that 'probability of payment given offer of $2 million' is substantially more than half of 'probability of payment given offer of $1 million'.

Pascal's Mugging arguments are used to address two questions. One is "why can't the mugger extract money from people by offering them arbitrarily large sums of money tomorrow in exchange for a small amount of money today?" This is the situation I have sketched.

The other is "why, when offered two propositions of equal expected value, do we prefer the one with a lower payoff and higher probability?" I think the situation you have articulated is more relevant to this question. What do you think?

Thanks! That sums up my intuition almost exactly (though I'd probably lower the probability drastically with every new attempt). There should be something out there that formalizes that part of rationality.

For smaller amounts of money (/utility), this works. But think of the scenario where the mugger promises you one trillion $ and you say no, based on the expected value. He then offers you two trillion $ (let's say your marginal utility of money is constant at this level, because you're an effective altruist and expect to save twice as many lives with twice the money). Do you really think that the mugger being willing to give you two trillion is less than half as likely as him being willing to give you one trillion? It seems to me that anyone willing and able to give a stranger one trillion for a bet is probably also able to give twice as much money.

I do. You’re making a practical argument, so let’s put this in billions, since nobody has two trillion dollars. Today, according to Forbes, there is one person with over $200 billion in wealth, and 6 people (actually one is a family, but I’ll count them as unitary) with over $100 billion in wealth.

So at a base rate, being offered a plausible $200 billion by a Pascalian mugger is about 17% as likely as being offered $100 billion.

This doesn’t preclude the possibility that in some real world situation you may find some higher offers more plausible than some lower offers.

But as I said in another comment, there are only two possibilities: your evaluation is that the mugger’s offer is likely enough that it has positive expected utility to you, or that it is too unlikely and therefore doesn’t. In the former case, you are a fool not to accept. In the latter case, you are a fool to take the offer.

To be clear, I am talking about expected utility, not the expected payoff. If $100 is not worth twice as much to you as $50 in terms of utility, then it’s worse, not neutral, to go from a 50% chance at a $50 payoff to a 25% chance of a $100 payoff. This also helps explain why people are hesitant to accept the mugger’s offers. Not only might they become less likely, and perhaps even exponentially less likely, to receive the payoff, but the marginal utility per dollar may decrease at the same time.
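
A minimal sketch of that comparison, assuming an illustrative square-root utility function (the concrete function is my assumption, not the commenter's):

```python
import math

def u(dollars):
    return math.sqrt(dollars)   # concave: $100 is worth less than twice $50

eu_small = 0.50 * u(50)    # 50% chance of $50  -> about 3.54 "utils"
eu_large = 0.25 * u(100)   # 25% chance of $100 -> 2.50 "utils"
print(eu_small, eu_large)  # the doubled payoff at half the probability is worse
```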

This is a practical argument though, and I don’t think it’s possible to give a conclusive account of what our likelihood or utility function ought to be in this contrived and hypothetical scenario.

I agree with what you're saying; the reason I used trillions was exactly because it's an amount nobody has. Any being which can produce a trillion dollars on the spot is likely (more than 50%, is my guess) powerful enough to produce two trillion dollars, while the same cannot be said for billions.

As for expected utility vs expected payoff, I agree that under conditions of diminishing marginal utility the offer is almost never worth taking. I am perhaps a bit too used to the more absurd versions of Pascal's Mugging, where the mugger promises to grant you utility directly, or disutility in the form of a quadrillion years of torture.

Probably the intuition against accepting the money offer does indeed lie in diminishing marginal utility, but I find it interesting that I'm not tempted to take the offer even if it's stated in terms of things with constant marginal utility to me, like lives saved or years of torture prevented.

I find it interesting that I'm not tempted to take the offer even if it's stated in terms of things with constant marginal utility to me, like lives saved or years of torture prevented.

My instant response is that this strongly suggests that lives saved and years of torture prevented do not in fact have constant marginal utility to you. Or more specifically, the part of you that is in control of your intuitive reactions. I share your lack of temptation to take the offer.

My explanations are either or both of the following:

  • My instinctive sense of "altruistic temptation" is badly designed and makes poor choices in these scenarios, or else I am not as altruistic as I like to think.
  • My intuition for whether Pascalian Muggings are net positive expected value is correctly discerning that they are not, no matter the nature of the promised reward. Even in the case of an offer of increasing amounts of utility (defined as "anything for which twice as much is always twice as good"), I can still think that the offer to produce it is less and less likely to pay off the more that is offered.

That is indeed somewhat similar to the "Hansonian adjustment" approach to solving the Mugging, when larger numbers come into play. Hanson originally suggested that, conditional on the claim that 3^^^^3 distinct people will come into existence, we should need a lot of evidence to convince us we're the one with a unique opportunity to determine almost all of their fates. It seems like such claims should be penalized by a factor of 1/3^^^^3. We can perhaps extend this so it applies to causal nodes as well as people. That idea seems more promising to me than bounded utility, which implies that even a selfish agent would be unable to share many goals with its future self (and technically, even a simple expected value calculation takes time.)
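
Roughly, the adjustment says: if the claim involves N people (here N = 3^^^^3) and an ordinary-sized utility u per person, then conditional on the claim being true you should assign only about a 1/N chance of being the pivotal decision-maker, so (this is my paraphrase of the idea, not a quote from Hanson):

$$\mathbb{E}[\text{payoff}] \;\approx\; \frac{1}{N} \times N \cdot u \;=\; u,$$

and the astronomical stakes cancel against the anthropic penalty, leaving an ordinary-sized expected payoff.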

Your numbers above are, at least, more credible than saying there's a 1/512 chance someone will offer you a chance to pick between a billion US dollars and one hundred million.

At the extreme end, if you know that producing good enough probabilities to solve the decision problem causes more disutility than getting the situation right would win, then the right move is not to model it.

How many digits can you get into your probabilities in a day? How many in a month? Whatever your answer, there are highly speculative hypotheses that deal in probabilities at finer granularities than that.

So what happens is some rounding, or a catch-all "something weird happens, 0.3%" bucket. And unknown unknowns loom ever-present.

While the 1/20 example illustrates how the inconsistency can be exploited, it does a disservice to the rule's reasonableness. Do you think about the chance of death by meteor every time you step outside? If you gave similar kinds of risks even a sliver of thought every time, you would be a neurotic nervous wreck, which would have a non-trivial impact on the most likely course of events regardless of how the probabilities nudge with the inclusion of more scenarios. So you act as if the sky could not fall on your head, and maybe take a different stance weekly/monthly/yearly when you do wonder whether the sky will fall.

At the other extreme from being a precommitted simpleton is being vulnerable to a denial-of-service attack by the slightest suggestion that raises any thoughts. Say "Pascal's mugging is a possible scenario" to such a person and watch them do nothing else for 10 years while the cognitive wheels spin (instead of what you would expect, like AGI research or money-making). It might not be possible to determine beforehand whether the field is such that it is important to get the ball rolling, or to have enough insight to tap into critical phenomena. Because not everybody is at that self-brooding pole, there are things that are not thought through. And at times not thinking it through can be justified.