# Pascal's Muggle (short version)

**Shortened version of**: Pascal's Muggle: Infinitesimal Priors and Strong Evidence

One proposal which has been floated for dealing with Pascal's Mugger is to penalize hypotheses that let you affect a large number of people, in proportion to the number of people affected - what we could perhaps call a "leverage penalty" instead of a "complexity penalty". This isn't just for Pascal's Mugger in particular; it seems required for expected utilities in general to converge when the 'size' of scenarios can grow much faster than their algorithmic complexity.
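To make the convergence worry concrete, here is a toy sketch under made-up assumptions: hypothesis k takes k bits to describe (a complexity prior of 2^-k) but claims to affect 2^(2^k) people, and utility is simply the number of people affected. The hypothesis family and all function names are illustrative and do not come from the original post.

```python
from fractions import Fraction

def size(k: int) -> int:
    """Hypothetical scenario size: people affected grows doubly exponentially in k."""
    return 2 ** (2 ** k)

def complexity_prior(k: int) -> Fraction:
    """Complexity penalty only: probability halves with each extra bit of description."""
    return Fraction(1, 2 ** k)

def leverage_prior(k: int) -> Fraction:
    """Complexity penalty with an extra leverage penalty of 1/size (unnormalized)."""
    return complexity_prior(k) / size(k)

def partial_sum(prior, n_terms: int) -> Fraction:
    """Sum of prior(k) * size(k) for k = 1..n_terms, with utility = people affected."""
    return sum((prior(k) * size(k) for k in range(1, n_terms + 1)), Fraction(0))

for n in (2, 4, 6):
    print(n,
          float(partial_sum(complexity_prior, n)),   # keeps growing without bound
          float(partial_sum(leverage_prior, n)))     # stays below 1
```

In this toy family the complexity-only sum is dominated by its last term and blows up, while each leverage-penalized term contributes only 2^-k, so the partial sums stay below 1.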

Unfortunately this potentially leads us into a different problem, that of *Pascal's Muggle.*

Suppose a poorly-dressed street person asks you for five dollars in exchange for doing a googolplex's worth of good using his Matrix Lord powers - say, saving the lives of a googolplex other people inside computer simulations he's running.

"Well," you reply, "I think that it would be very improbable that I would be able to affect so many people through my own, personal actions - who am I to have such a great impact upon events? Indeed, I think the probability is somewhere around one over googolplex, maybe a bit less. So no, I won't pay five dollars - it is unthinkably improbable that I could do so much good!"

"I see," says the Mugger.

A wind begins to blow about the alley, whipping the Mugger's loose clothes about him as they shift from ill-fitting shirt and jeans into robes of infinite blackness, within whose depths tiny galaxies and stranger things seem to twinkle. In the sky above, a gap edged by blue fire opens with a horrendous tearing sound - you can hear people on the nearby street yelling in sudden shock and terror, implying that they can see it too - and displays the image of the Mugger himself, wearing the same robes that now adorn his body, seated before a keyboard and a monitor.

"That's not actually me," the Mugger says, "just a conceptual representation, but I don't want to drive you insane. Now give me those five dollars, and I'll save a googolplex lives, just as promised. It's easy enough for me, given the computing power my home universe offers. As for why I'm doing this, there's an ancient debate in philosophy among my people - something about how we ought to sum our expected utilities - and I mean to use the video of this event to make a point at the next decision theory conference I attend. Now will you give me the five dollars, or not?"

"Mm... no," you reply.

"*No?*" says the Mugger. "I understood earlier when you didn't want to give a random street person five dollars based on a wild story with no evidence behind it whatsoever. But surely I've offered you evidence now."

"Unfortunately, you haven't offered me *enough* evidence," you explain.

"Seriously?" says the Mugger. "I've opened up a fiery portal in the sky, and that's not enough to persuade you? What do I have to do, then? Rearrange the planets in your solar system, and wait for the observatories to confirm the fact? I suppose I could also explain the true laws of physics in the higher universe in more detail, and let you play around a bit with the computer program that encodes all the universes containing the googolplex people I would save if you just gave me the damn five dollars -"

"Sorry," you say, shaking your head firmly, "there's just no *way* you can convince me that I'm in a position to affect a googolplex people, because the prior probability of that is one over googolplex. If you wanted to convince me of some fact of merely 2^{-100} prior probability, a mere decillion to one - like that a coin would come up heads and tails in some particular pattern of a hundred coinflips - then you could just show me 100 bits of evidence, which is within easy reach of my brain's sensory bandwidth. I mean, you could just flip the coin a hundred times, and my eyes, which send my brain a hundred megabits a second or so - though that gets processed down to one megabit or so by the time it goes through the lateral geniculate nucleus - would easily give me enough data to conclude that this decillion-to-one possibility was true. But to conclude something whose prior probability is on the order of one over googolplex, I need on the order of a googol bits of evidence, and you can't present me with a sensory experience containing a googol bits. Indeed, you can't *ever* present a mortal like me with evidence that has a likelihood ratio of a googolplex to one - evidence I'm a googolplex times more likely to encounter if the hypothesis is true, than if it's false - because the chance of all my neurons spontaneously rearranging themselves to fake the same evidence would always be higher than one over googolplex. You know the old saying about how once you assign something probability one, or probability zero, you can't update that probability regardless of what evidence you see? Well, odds of a googolplex to one, or one to a googolplex, work pretty much the same way."
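As an aside, the bits-of-evidence arithmetic in that speech can be checked with a short sketch. It works with base-10 logarithms of the odds, since a googolplex itself is far too large to write out; the function name and the one-megabit-per-second figure used for the waiting time are illustrative assumptions taken loosely from the speech.

```python
import math

def bits_needed(log10_odds_against: float) -> float:
    """Bits of evidence needed to lift odds of 1 : 10**log10_odds_against up to even."""
    return log10_odds_against * math.log2(10)

# A particular pattern of a hundred coinflips: odds of 1 : 2**100.
coin_bits = bits_needed(100 * math.log10(2))

# One over googolplex: odds of 1 : 10**(10**100), so log10 of the odds is 10**100.
googolplex_bits = bits_needed(1e100)

# Assumed bandwidth after visual processing, as in the speech: about one megabit per second.
seconds_at_one_megabit = googolplex_bits / 1e6

print(round(coin_bits))                  # about 100 bits
print(f"{googolplex_bits:.2e}")          # a few googol bits
print(f"{seconds_at_one_megabit:.2e}")   # seconds of perfect evidence needed
```

The second figure is a few times a googol, which matches "on the order of a googol bits".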

"So no matter what evidence I show you," the Mugger says - as the blue fire goes on crackling in the torn sky above, and screams and desperate prayers continue from the street beyond - "you can't ever notice that you're in a position to help a googolplex people."

"Right!" you say. "I can believe that you're a Matrix Lord. I mean, I'm not a *total* Muggle, I'm psychologically capable of responding in *some* fashion to that giant hole in the sky. But it's just completely forbidden for me to assign any significant probability whatsoever that you will actually save a googolplex people after I give you five dollars. You're lying, and I am absolutely, absolutely, absolutely confident of that."

"So you weren't *just* invoking the leverage penalty as a plausible-sounding way of getting out of paying me the five dollars earlier," the Mugger says thoughtfully. "I mean, I'd understand if that was just a rationalization of your discomfort at forking over five dollars for what seemed like a tiny probability, when I hadn't done my duty to present you with a corresponding amount of evidence before demanding payment. But you... you're acting like an AI would if it was actually programmed with a leverage penalty on hypotheses!"

"Exactly," you say. "I'm forbidden *a priori* to believe I can ever do that much good."

"Why?" the Mugger says curiously. "I mean, all I have to do is press this button here and a googolplex lives will be saved." The figure within the blazing portal above points to a green button on the console before it.

"Like I said," you explain again, "the prior probability is just too infinitesimal for the massive evidence you're showing me to overcome it -"

The Mugger shrugs, and vanishes in a puff of purple mist.

The portal in the sky above closes, taking with it the console and the green button.

(The screams go on from the street outside.)

A few days later, you're sitting in your office at the physics institute where you work, when one of your colleagues bursts in through your door, seeming highly excited. "I've got it!" she cries. "I've figured out that whole dark energy thing! Look, these simple equations retrodict it exactly, there's no way that could be a coincidence!"

At first you're also excited, but as you pore over the equations, your face configures itself into a frown. "No..." you say slowly. "These equations may look extremely simple so far as computational complexity goes - and they do exactly fit the petabytes of evidence our telescopes have gathered so far - but I'm afraid they're far too improbable to ever believe."

"What?" she says. "Why?"

"Well," you say reasonably, "if these equations are actually true, then our descendants will be able to exploit dark energy to do computations, and according to my back-of-the-envelope calculations here, we'd be able to create around a googolplex people that way. But that would mean that we, here on Earth, are in a position to affect a googolplex people - since, if we blow ourselves up via a nanotechnological war or *(cough)* make certain other errors, those googolplex people will never come into existence. The prior probability of us being in a position to impact a googolplex people is on the order of one over googolplex, so your equations must be wrong."

"Hmm..." she says. "I hadn't thought of that. But what if these equations are right, and yet somehow, everything I do is exactly balanced, down to the googolth decimal point or so, with respect to how it impacts the chance of modern-day Earth participating in a chain of events that leads to creating an intergalactic civilization?"

"How would *that* work?" you say. "There's only seven billion people on today's Earth - there's probably been only a hundred billion people who ever existed total, or will exist before we go through the intelligence explosion or whatever - so even before analyzing your exact position, it seems like your leverage on future affairs couldn't reasonably be less than one in a ten trillion part of the future or so."

"But then given this physical theory which seems obviously true, my acts might imply expected utility differentials on the order of 10^{10^{100}-13}," she explains, "and I'm not allowed to believe that no matter how much evidence you show me."

This problem may not be as bad as it looks; a leverage penalty may lead to more reasonable behavior than depicted above, after taking into account Bayesian updating:

Mugger: "Give me five dollars, and I'll save 3↑↑↑3 lives using my Matrix Powers."

You: "Nope."

Mugger: "Why not? It's a really large impact."

You: "Yes, and I assign a probability on the order of 1 in 3↑↑↑3 that I would be in a unique position to affect 3↑↑↑3 people."

Mugger: "Oh, is that really the probability that you assign? Behold!"

*(A gap opens in the sky, edged with blue fire.)*

Mugger: "Now what do you think, eh?"

You: "Well... I can't actually say this has a likelihood ratio of 3↑↑↑3 to 1. No stream of evidence that can enter a human brain over the course of a century is ever going to have a likelihood ratio larger than, say, 10^{10^{26}} to 1 at the *absurdly most*, assuming one megabit per second of sensory data, for a century, each bit of which has at least a 1-in-a-trillion error probability. You'd probably start to be dominated by Boltzmann brains or other exotic minds well before then."

Mugger: "So you're not convinced."

You: "Indeed not. The probability that you're telling the truth is so tiny that God couldn't find it with an electron microscope. Here's the five dollars."

Mugger: "Done! You've saved 3↑↑↑3 lives! Congratulations, you're never going to top that, your peak life accomplishment will now always lie in your past. But why'd you give me the five dollars if you think I'm lying?"

You: "Well, because the evidence you *did* present me with had a likelihood ratio of at least a billion to one - I would've assigned less than 10^{-9} prior probability of seeing this when I woke up this morning - so in accordance with Bayes's Theorem I promoted the probability from 1/3↑↑↑3 to at least 10^{9}/3↑↑↑3, which when multiplied by an impact of 3↑↑↑3, yields an expected value of at least a billion lives saved for giving you five dollars."

I confess that I find this line of reasoning a bit suspicious - it seems overly clever - but at least on the level of intuitive-virtues-of-rationality it doesn't seem completely stupid in the same way as Pascal's Muggle. This muggee is at least *behaviorally* reacting to the evidence. In fact, they're reacting in a way *exactly* proportional to the evidence - they would've assigned the same net importance to handing over the five dollars if the Mugger had offered 3↑↑↑4 lives, so long as the strength of the evidence seemed the same.
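The muggee's arithmetic can be sketched in log space, since 3↑↑↑3 cannot be represented directly. The sketch assumes a leverage prior of exactly 1/N, uses the small-probability approximation that the posterior is roughly prior times likelihood ratio, and stands in much smaller values of N for 3↑↑↑3. All names are illustrative.

```python
SECONDS_PER_CENTURY = 100 * 365.25 * 24 * 3600   # about 3.16e9 seconds
BITS_PER_SECOND = 1e6                            # one megabit per second, as in the dialogue
LOG10_LR_PER_BIT = 12                            # each bit at best a trillion to one

def log10_max_likelihood_ratio() -> float:
    """log10 of the largest likelihood ratio a century of sensory data could carry
    if every bit counted at its full trillion-to-one reliability."""
    return SECONDS_PER_CENTURY * BITS_PER_SECOND * LOG10_LR_PER_BIT

def log10_expected_lives(log10_n: int, log10_lr: int) -> int:
    """log10 of (posterior probability * lives at stake) under a 1/N leverage prior.

    Integer exponents keep the cancellation exact even for astronomically large N.
    Only valid while log10_lr is far below log10_n.
    """
    log10_prior = -log10_n
    log10_posterior = log10_prior + log10_lr
    return log10_posterior + log10_n

print(f"{log10_max_likelihood_ratio():.2e}")
for log10_n in (100, 10 ** 100, 10 ** 1000):     # stand-ins for 3↑↑↑3
    print(log10_expected_lives(log10_n, 9))
```

With these inputs the straightforward product gives a likelihood-ratio exponent near 3.8 × 10^16, comfortably under the ceiling quoted in the dialogue. The expected-value exponent comes out as 9 for every N tried, which is the sense in which the reaction is proportional to the evidence rather than to the size of the promise.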

But I still feel a bit nervous about the idea that Pascal's Muggee, after the sky splits open, is handing over five dollars while claiming to assign probability on the order of 10^{9}/3↑↑↑3 that it's doing any good. My own reaction would probably be more like this:

Mugger: "Give me five dollars, and I'll save 3↑↑↑3 lives using my Matrix Powers."

Me: "Nope."

Mugger: "So then, you think the probability I'm telling the truth is on the order of 1/3↑↑↑3?"

Me: "Yeah... that probably has to follow. I don't see any way around that revealed belief, given that I'm not actually giving you the five dollars. I've heard some people try to claim silly things like, the probability that you're telling the truth is counterbalanced by the probability that you'll kill 3↑↑↑3 people instead, or something else with a conveniently exactly equal and opposite utility. But there's no way that things would balance out that neatly in practice, if there was no *a priori* mathematical requirement that they balance. Even if the prior probability of your saving 3↑↑↑3 people and killing 3↑↑↑3 people, conditional on my giving you five dollars, *exactly* balanced down to the log(3↑↑↑3) decimal place, the likelihood ratio for your telling me that you would "save" 3↑↑↑3 people would not be exactly 1:1 for the two hypotheses down to the log(3↑↑↑3) decimal place. So if I assigned probabilities much greater than 1/3↑↑↑3 to your doing something that affected 3↑↑↑3 people, my actions would be overwhelmingly dominated by even a tiny difference in likelihood ratio elevating the probability that you saved 3↑↑↑3 people over the probability that you did something equally and oppositely bad to them. The only way this hypothesis can't dominate my actions - really, the only way my expected utility sums can converge at all - is if I assign probability on the order of 1/3↑↑↑3 or less. I don't see any way of escaping that part."

Mugger: "But can you, in your mortal uncertainty, truly assign a probability as low as 1 in 3↑↑↑3 to any proposition whatever? Can you truly believe, with your error-prone neural brain, that you could make 3↑↑↑3 statements *of any kind* one after another, and be wrong, on average, about once?"

Me: "Nope."

Mugger: "So give me five dollars!"

Me: "Nope."

Mugger: "Why not?"

Me: "Because even though I, in my mortal uncertainty, will eventually be wrong about all sorts of things if I make enough statements one after another, this fact can't be used to increase the probability of arbitrary statements beyond what my prior says they should be, because then my prior would sum to more than 1. There must be some kind of required condition for taking a hypothesis seriously enough to worry that I might be overconfident about it -"

Mugger: "Then behold!"

*(A gap opens in the sky, edged with blue fire.)*

Mugger: "Now what do you think, eh?"

Me *(staring up at the sky):* "...whoa." *(Pause.)* "You turned into a cat."

Mugger: "What?"

Me: "Private joke. Okay, I think I'm going to have to rethink a *lot* of things. But if you want to tell me about how I was wrong to assign a prior probability on the order of 1/3↑↑↑3 to your scenario, I will shut up and listen very carefully to what you have to say about it. Oh, and here's the five dollars, can I pay an extra twenty and make some other requests?"

*(The thought bubble pops, and we return to two people standing in an alley, the sky above perfectly normal.)*

Mugger: "Now, in this scenario we've just imagined, you were taking my case seriously, right? But the evidence there couldn't have had a likelihood ratio of more than 10^{10^{26}} to 1, and probably much less. So by the method of imaginary updates, you must assign probability at least 10^{-10^{26}} to my scenario, which when multiplied by a benefit on the order of 3↑↑↑3, yields an unimaginable bonanza in exchange for just five dollars -"

Me: "Nope."

Mugger: "How can you possibly say that? You're not being logically coherent!"

Me: "I agree that I'm being incoherent in a sense, but I think that's acceptable in this case, since I don't have infinite computing power. In the scenario you're asking me to imagine, you're presenting me with evidence which I currently think Can't Happen. And if that actually *does* happen, the sensible way for me to react is by questioning my prior assumptions and reasoning which led me to believe I shouldn't see it happen. One way that I handle my lack of logical omniscience - my finite, error-prone reasoning capabilities - is by being willing to assign infinitesimal probabilities to non-privileged hypotheses so that my prior over all possibilities can sum to 1. But if I actually see strong evidence for something I previously thought was super-improbable, I don't just do a Bayesian update, I should also question whether I was right to assign such a tiny probability in the first place - whether the scenario was really as complex, or unnatural, as I thought. In real life, you are not ever supposed to have a prior improbability of 10^{-100} for some fact distinguished enough to be written down, and yet encounter strong evidence, say 10^{10} to 1, that the thing has actually happened. If something like that happens, you don't do a Bayesian update to a posterior of 10^{-90}. Instead you question both whether the evidence might be weaker than it seems, *and* whether your estimate of prior improbability might have been poorly calibrated, because rational agents who actually have well-calibrated priors should not encounter situations like that until they are ten billion days old. Now, this may mean that I end up doing some non-Bayesian updates: I say some hypothesis has a prior probability of a quadrillion to one, you show me evidence with a likelihood ratio of a billion to one, and I say 'Guess I was wrong about that quadrillion to one thing' rather than being a Muggle about it. 
And then I shut up and listen to what *you* have to say about how to estimate probabilities, because on my worldview, I wasn't *expecting* to see you turn into a cat. But for me to make a super-update like that - reflecting a posterior belief that I was logically incorrect about the prior probability - you have to really actually show me the evidence, you can't just ask me to imagine it. This is something that only logically incoherent agents ever say, but that's all right because I'm not logically omniscient."
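The numbers in that speech can be laid out in a small sketch. The first two functions reproduce the arithmetic as stated (a 10^{-100} prior with 10^{10}-to-1 evidence, and the "ten billion days" figure, assuming roughly one opportunity per day to be misled). The third is only one hypothetical way to encode the "question your prior" trigger; neither the rule nor its name comes from the post.

```python
def naive_log10_posterior(log10_prior: int, log10_lr: int) -> int:
    """Plain Bayesian update in the small-probability regime: add the exponents."""
    return log10_prior + log10_lr

def days_until_expected(log10_lr: int) -> int:
    """Rough waiting time in days before a calibrated agent should expect to meet
    misleading evidence this strong, assuming (hypothetically) one chance per day."""
    return 10 ** log10_lr

def should_question_prior(log10_prior: int, log10_lr: int, age_days: int) -> bool:
    """Hypothetical trigger for a 'super-update': the evidence arrived far earlier
    than a calibrated agent should expect, yet the naive posterior is still tiny."""
    still_tiny = naive_log10_posterior(log10_prior, log10_lr) < 0
    too_early = age_days < days_until_expected(log10_lr)
    return still_tiny and too_early

print(naive_log10_posterior(-100, 10))                   # exponent of the naive posterior
print(days_until_expected(10))                           # ten billion days
print(should_question_prior(-100, 10, age_days=30 * 365))
```

For a thirty-year-old agent the trigger fires in the speech's example, which corresponds to saying "guess I was wrong about the prior" instead of settling on a posterior of 10^{-90}.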

When I add up a complexity penalty, a leverage penalty, and the "You turned into a cat!" logical non-omniscience clause, I get the best candidate I have so far for the correct decision-theoretic way to handle these sorts of possibilities while still having expected utilities converge.

As mentioned in the longer version, this has very little in the way of relevance for optimal philanthropy, because we don't really need to consider these sorts of rules for handling small large numbers on the order of a universe containing 10^{80} atoms, and because most of the improbable leverage associated with x-risk charities is associated with discovering yourself to be an Ancient Earthling from before the intelligence explosion, which improbability (for universes the size of 10^{80} atoms) is easily overcome by the sensory experiences which tell you you're an Earthling. For more on this see the original long-form post. The main FAI issue at stake is what sort of prior to program into an AI.

## Comments (55)

Shouldn't it be "asks you for five dollars"?

I'd be even more wary of Pascal's gifting than his mugging.

Hmm, I wonder if there is a version of Pascal gifting worth pondering. "The anti-mugger says 'I will pay you $5 if...'" what?

Fixed.

Since a human mind really can't naturally conceive of the difference between huge numbers like these, wouldn't it follow that our utility functions are bounded by a horizontal asymptote? And shouldn't that solve this problem?

I mean, if the amount of utility gained from saving x amount of people is no longer allowed to increase boundlessly, you don't need such improbable leverage penalties. You'd still of course have the property that it's better to save more people, just not linearly better.

I find that unsatisfactory for the following reasons - first, I am a great believer in life and love without bound; second, I suspect that the number of people in the multiverse is already great enough to max out that sort of asymptote and yet I still care; third, if this number is not already maxed out, I find it counterintuitive that someone another universe over could cause me to experience preference reversals in this universe by manipulating the number of people who already exist inside a box.

Ok, I might have formulated myself badly. My argument is that any agent of bounded computational power is forced to use two utility functions. The one they wish they had (i.e. the unbounded linear version) and the one they are forced to use in their calculations because of their limitations (i.e. an asymptotically bounded approximation).

For those agents capable of self-modification, just add a clause to increase their computational power (and thereby increasing the bound of their approximation) whenever the utilities of the "scales they're working on" differ by more than some small specified number.

So, my answer to this person would be "stick around until I can safely modify myself into dealing with your request", or alternatively, if he wants an answer right now after seeing his evidence, "here's 5 dollars".

Why can't you increase your asymptote with new evidence? If, for instance, your utility was bounded at 2^160 utilons before the mugger opened the sky then just increase your bound according to that evidence and then shut up and multiply to decide whether to pay $5. You can't update to a bound of 3^^^3 in one step since you can't receive enough evidence at once, which is a handy feature for avoiding muggings, but your utility at a distant point in the future is essentially unbounded given enough evidential updates over time.

Useful utility bounds should be derivable from our knowledge of the universe. If we can theoretically create 10^80 unique, just-worth-living lives with the estimated matter and energy in the universe then that provides a minimum bound, although it's probably desirable to choose the bound large enough that the 10^80th life is worth nearly as much as the 1st or 10^11th life. When we have evidence for a change in our estimate of the available matter and energy or a change in the efficiency of turning matter and energy into utility we scale the bound appropriately.

I think having a leverage penalty, to the extent that such a thing is reasonable, is more about correcting for biases in how humans generate hypotheses than anything else. For example, many mental patients claim to be important historical figures like Jesus who had a large amount of leverage. It's not completely unreasonable to guess that this comes from some psychological mechanism that might still be in place in non-mental patients, and so it's not completely unreasonable to discard claims of large amounts of leverage in the name of combating this particular way in which humans might privilege hypotheses.

I wouldn't go so far as to claim that the size of the penalty should have any obvious relationship to the size of the leverage claimed, though.

Then this is an entirely different factor which doesn't have to do with having our EU sums converge.

Five bucks is easy to give. How much would you be willing to give, maximum, given this evidence? $500? $5000? All you have or can borrow? Everything your real or potential descendants will earn during their lifetimes? In how many generations?

When the evidence is a huge hole opening in the sky? I think common sense allows you to take fairly large actions after seeing a huge hole opening in the sky. I mean, in practice, I would tend to try and take certain actions intended to do something about the rather high posterior probability that I was hallucinating and be particularly wary of actions that sound like the sort of thing psychotic patients hallucinate, but this is an artifact of the odd construction of the scenario and wouldn't apply to the more realistic and likely-to-be-actually-encountered case of the physics theory which implied we could use dark energy for computation or whatever.

Just wondering how you would go about estimating the "optimal" amount to give in such a situation. After all, it's a fairly common occurrence, someone claiming a scientific breakthrough providing free energy and showing an apparent working perpetual motion machine, or a new herbal medicine which cures cancer and presenting dozens of convincing testimonies.

If you really, honestly can't tell the difference between that and an xrisk reduction charity, you're probably not reading LW in the first place. Or if you mean something else by this question, could you ask with a different example instead?

Sorry, I'm just an amateur asking what is probably a stupid question. Given the leverage N and the prior probability 1/n, where N>>n>>1, what is the optimal investment amount, in whichever units you prefer? Say, in percent of the resources available to you.

I can't offhand see how to translate the given numbers into a Kelly betting criterion. My own heuristic is something more along the lines of "Find the best thing that looks like it might actually work and do it." Things that won't actually work are not done anyway even if the current state of the search calls them "best", but I try to avoid Enrico-Fermi-style "ten percent" underestimates about what might actually have an impact. No holes have actually appeared in the sky, and I'm not presently working with any N larger than 10^80, and there's no reason to worry about small probabilities of affecting that when it's easy to find several different medium-sized candidates. I'd only want to complicate my reasoning any further if I ended up in a more complicated situation than that.

(I also don't think that perpetual motion machines have N>>n.)

But as gwern and other commenters showed, your characterization of accessible chain reactions as an easily drawn implication of then-known physics was wrong.

The best case for bias on Fermi's part in that post is the partially (but not very) independent claim that:

But you didn't give a citation or quote for that.

And you suggest that

But you only present evidence that they cared about a 10% probability (and Szilard was selected from history in part because of his right-tail estimate).

I'm still suspicious that Fermi could truly not have done better than "ten percent", and wonder if people are trying a little too hard not to give in to hindsight bias and overfitting, at the cost of failing to learn heuristics that could indeed generalize. Agreed that if +chain reaction implied a new fact of physics in the sense that it tells you about a previously unknown heavy element which emits free neutrons and is splittable, the standard heuristic "does the failure of this prediction tell us a new fact of physics" does not work in the vanilla sense. This doesn't mean that a fair estimate of the probability of at least one not-yet-examined element having the desired properties would have been ten percent. Chain reactions were not just barely possible for a large barely-critical fission plant 50 years later, rather they were soon achieved at a prompt supercritical grade adequate for nuclear explosions by two distinct pathways of U-235 refinement and Pu-239 breeding, both of which admittedly required effort, but was the putting-in of that effort unpredictable? But this should be continued in the other post rather than here.

They could realistically only breed enough Pu239 by starting with a U235-fueled reactor. Everything that you can do in 1945 goes through U235, which we have only because it has an unusually long half-life (350x the next stablest fissile isotope, Np-237). On top of that, they didn't even know that fission released prompt secondary neutrons at all - those could have simply remained in the fission products and converted to protons via beta decay.

I know they got a critical reaction with a big heap of unrefined uranium. This makes no mention of uranium needing to be isotopically refined for plutonium production on the Manhattan Project. As you are generally a great big troll, I am afraid I cannot trust anything you say about isotopic refinement having been used or required without further references, but I will not actually downvote yet in case you're not lying. Got a cite?

Depends on your opportunity costs.

I think both this and the longer version are excellent, but I wonder whether it would be better to have the long version in Discussion and the short version in Main rather than the other way around.

That was my first thought, but then I realized that I mainly want discussion on the longer version and that it's the longer version that has the FAI consequences, etc.

Yes, but which version do you want the higher-quality discussion on, and which one do you want more casual readers to see?

Which one is which, and how can you tell?

Context: I read Main much more than Discussion, due to time constraints and my perception that Discussion is less interesting in bulk. I tend to comment rarely, and thus I pay attention to those comments. Which means that something in Main is more likely to have comments from users like me than something in Discussion.

I’m not sure exactly what you meant, but I thought you meant Main gets more “casual” users and Discussion gets better, well, discussions because the very active (non-casual) users are more predominant there.

My impression differs with regard to the latter, as I'd expect the very active posters to be mostly everywhere and the high-quality lurkers (which are "knowledgeable insiders" but comment rarely, and provide diversity to a discussion) are mostly in Main. It's probably true that we get more “newbies” in Main, but maybe the voting works better at hiding them for me.

My probable reaction before reading this, now hopefully replaced by the OP if I can remember it:

Mugger: "Give me five dollars, and I'll save 3↑↑↑3 lives using my Matrix Powers."

Me: "Here you-- Wait a minute, I could give these to the MIRI, but I can't do that if I give them to you. There's, like, 1/1 000 000 000 probability you're speaking the truth, and the probability of them making the difference between a negative and positive singularity, which'd make a difference for way more people like that all over the multiverse, maybe."

Mugger: "... you are *terrible* at pulling numbers out of your ass. Are you even trying? Unimaginable amounts of lives are on the line here!"

Me: "I can't deny any of that. Bye!"

*me continues never getting around to actually donating and spends the money on something frivolous*

The utility function *is* up for grabs.

Utility functions perfectly represent preferences that follow the VNM axioms. We work with utilities more often than preferences and can forget that preferences are the fundamental concept. Agents don't have preferences because they have utility functions -- agents have utility functions because they have preferences.

We use shortcuts that allow us to quickly manage small (conceivable) utilities and we understand the consequences intuitively. But shortcuts don't have to remain valid with superexponentially large utilities. Running superexponentially large utilities backwards through the VNM axioms results in preferences that border on insanity.

Assigning a utility of $2^googol to an outcome means preferring any possibility of its fruition over a certainty of $1,000,000 even if every bit of information in the observable universe unanimously tells you that you are wrong. This preference in the face of overwhelming counter-evidence is not a perverse artifact; it is the intrinsic meaning of $2^googol. If that meaning truly represents your preferences, fine. If not, your utility function is wrong.

Tiny probabilities can be sliced into conditional sequences of large probabilities, superexponential changes can be sliced into sequences of exponential changes. See e.g. the Lifespan Dilemma. If you reject the final conclusion, you must reject some particular step along the way, on pain of circular preference. This is the larger significance of VNM.

Also: You're not complaining about VNM per se, you're complaining that you think your preferences correspond to a bounded utility function.

I read Lifespan Dilemma (which I had not before, thank you for pointing it out [and writing it]). I choose torture on specks vs. torture without so much as a second thought, but don't give the Omega a penny for life extension for this reason:

Following a super-exponential utility function requires super-polynomial amounts of memory. I just don't know what it would mean to have more experiences than can be counted using all the matter in the universe. Existing outside of PSPACE is so...alien.
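One way to see the memory claim is to count bits. This is a rough sketch, assuming experiences are tracked with a plain binary counter and using `k` as a stand-in for the complexity of an outcome; both the helper name `bits_to_count_to` and that framing are assumptions made for illustration.

```python
def bits_to_count_to(n_max: int) -> int:
    """Bits needed to store any integer in 0..n_max as a plain binary counter."""
    return max(1, n_max.bit_length())


# Compare an exponential count (2^k) with a doubly exponential count (2^(2^k)).
rows = []
for k in range(1, 8):
    exponential_bits = bits_to_count_to(2**k)
    double_exponential_bits = bits_to_count_to(2 ** (2**k))
    rows.append((k, exponential_bits, double_exponential_bits))

for k, exp_bits, dexp_bits in rows:
    print(f"k={k}: 2^k needs {exp_bits} bits, 2^(2^k) needs {dexp_bits} bits")
```

Counting up to 2^k takes k + 1 bits, which is linear in k, while counting up to 2^(2^k) takes 2^k + 1 bits, which is exponential in k.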

If I wanted to keep count of how many experiences I've had, it would be morally impermissible to simply increment natural numbers. Some natural numbers encode universes of unspeakable torment, and I'd hate for my IP register to accidentally create a hell dimension.

Is it really so disheartening to believe that the marginal utility I'd place on the *N*th experience\* would eventually decrease to 0?

\*Where *N* is a number that can encode a universe-timeline unfathomably larger than ours.

**ETA:** Yes, I agree. I have no grief with the VNM axioms. I think that agents should use only a polynomial amount of memory, and therefore desire that they follow utility functions that grow at most exponentially against the complexity of outcomes.

"A polynomial amount of memory ought to be enough for anyone." -- solipsist

Isn't it also reasonable in the last case to reevaluate the relative utility of five dollars and of a googolplex lives?

Is it inconsistent in the first two cases that you would believe him (after he shows the extraordinary evidence) if instead of a googolplex lives, he offered to save more than 10?

You live in a matrix universe, and you also know that matrix lords occasionally come down to make strange requests of individuals. One of these matrix lords asks you to give him 5 dollars, or else he will press a red button that has a one in a googol chance of killing a googolplex people. The red button's randomness is not determined by quantum effects. Do you hand over the five dollars in that case?

Intuitively? Yes, of course I do. I don't trust that intuition too strongly, but this thought experiment does make me update a lot about the value of a lot of ideas about Pascal's mugging.
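For the red-button question, the naive expected-value arithmetic (with no leverage penalty applied) can be done exactly by tracking base-10 exponents as integers. This is only a sketch of that one calculation; the function name is made up for illustration.

```python
GOOGOL_EXPONENT = 100  # googol = 10^100

# googolplex = 10^googol, so its base-10 exponent is googol itself.
GOOGOLPLEX_EXPONENT = 10**100


def log10_expected_deaths(log10_chance: int, log10_victims: int) -> int:
    """log10 of (chance * victims), using exact integer exponents."""
    return log10_chance + log10_victims


# Chance of one in a googol, victims numbering a googolplex.
log10_deaths = log10_expected_deaths(-GOOGOL_EXPONENT, GOOGOLPLEX_EXPONENT)

# The exponent shrinks by only 100 out of 10^100.
exponent_reduction = GOOGOLPLEX_EXPONENT - log10_deaths

print("exponent reduced by:", exponent_reduction)
print("expected deaths still exceed a googol:", log10_deaths > GOOGOL_EXPONENT)
```

The expected number of deaths is 10^(10^100 - 100), so a one-in-a-googol chance barely dents the exponent. That is why the intuition to pay is so strong once the button's existence is taken as given.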

I'm really confused. Like, really, *really* confused. Hopefully someone can illuminate this topic for me, because right now, I'm not seeing where this "leverage penalty" comes from. Complexity penalties are pretty obviously a consequence of formalizations of Occam's Razor, in particular Solomonoff Induction, but why does the idea of a "leverage penalty" even *exist*? It seems like a *post hoc* justification tacked on in order to somehow deal with the original Pascal's Mugging situation. If I started from the basics of probability theory and computational theory, it seems conceivable to me that given enough time, I might be able to independently arrive at the idea of complexity penalties. It does *not*, on the other hand, seem likely that I would ever be able to derive this concept of a "leverage penalty" from first principles; it seems like a clever after-the-fact justification.

I do realize, however, that the leverage penalty was proposed by a very smart person (Robin Hanson), and then later discussed by another very smart person (Eliezer), both of whom are much smarter than I am, so it is much more likely that I am the one confused here than that they are *actually* engaging in after-the-fact rationalization. So my question right now is this: *where do "leverage penalties" come from?* Could someone take the time to humor an aspiring student of mathematics and explain? Thanks in advance!

(Right now, I'm not sure where leverage penalties come from, but if they do come from somewhere, as opposed to being pulled out of thin air, my bet is on anthropics. If this is true, it wouldn't be surprising, because I find anthropics hellishly confusing most of the time, so it seems reasonable that I would be confused about a concept derived from that area.)

Anthropics would be one way of reading it, yes. Think of it as saying, in addition to wanting all of our Turing machines to add up to 1, we also want all of the computational elements inside our Turing machines to add up to 1 because we're trying to guess which computational element 'we' might be. This might seem badly motivated in the sense that we can only say "Because our probabilities have to add up to 1 for us to think!" rather than being able to explain why magical reality fluid ought to work that way a priori, but the justification for a simplicity prior isn't much different - we have to be able to add up all the Turing machines in their entirety to 1 in order to think. So Turing machines that use lots of tape get penalties to the probability of your being any *particular* or *special* element inside them. Being able to affect lots of other elements is a kind of specialness.

I'm confused because I had always thought it would be the exact opposite. To predict your observational history given a description of the universe, Solomonoff induction needs to find you in it. The more special you are, the easier you are to find and thus the easier it is to find your observational history.
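As a toy rendering of the leverage penalty discussed in this exchange, suppose the prior probability of being the one element able to affect N others is exactly 1/N. That exact form is an assumption for illustration; the thread only says the penalty is proportional to the number affected. Exact fractions keep the arithmetic honest even for googol-sized N.

```python
from fractions import Fraction


def leverage_penalized_prior(num_affected: int) -> Fraction:
    """Assumed prior of being the one element that can affect num_affected others."""
    return Fraction(1, num_affected)


def expected_impact(num_affected: int) -> Fraction:
    """Number affected times the penalized prior of holding that position."""
    return num_affected * leverage_penalized_prior(num_affected)


# The claimed stakes grow without bound, but the penalized expectation does not.
impacts = [expected_impact(n) for n in (10, 10**6, 10**100)]

for n, impact in zip((10, 10**6, 10**100), impacts):
    print(f"stakes with {len(str(n)) - 1} zeros -> expected impact {impact}")
```

With this assumed prior the product is always 1, so claiming larger stakes never raises the expected payoff. Whether the prior should take this form is exactly what the comments here are disputing.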

Here's a point of consideration: if you take Kurzweil's solution, then you can avoid Pascal's mugging when you are an agent, and your utility function is defined over similar agents. However, this solution wouldn't work on, for example, a paperclip maximizer, which would still be vulnerable - anthropic reasoning does not apply over paperclips.

While it might be useful to have Friendly-style AIs be more resilient to P-mugging than simple maximizers, it's not exactly satisfying as an epistemological device.

Isn't there a leverage penalty built into a Kolmogorov complexity prior if the bit string you're trying to generate is a particular agent's sense data? Because the more stuff there is, the more bits are required to locate that agent? And does this solve a problem with just using normal anthropics, where the leverage penalty doesn't help a paperclip maximizer deal with the potential universe that is just it and a bunch of paperclips that could be destroyed, because paperclips aren't anthropic reasoners?

In all examples but the physicist one, what I would believe is that I'm dreaming (counting being a Boltzmann brain or similar as ‘dreaming’). But then again, if I'm dreaming I don't actually lose anything by giving the mugger five dollars, so...

I would have posted the short version to Main and the long version to Discussion.

Version 2 is the way I had approached Pascal's Mugging until I read this post, and I like the logical uncertainty-based answer. But does this mean I'm not getting flimple utility?

I'll give it to you for five dollars.