Polymeron comments on Pascal's Mugging: Tiny Probabilities of Vast Utilities - Less Wrong

39 Post author: Eliezer_Yudkowsky 19 October 2007 11:37PM

Comment author: Polymeron 04 January 2011 09:32:37AM *  5 points [-]

[Late edit: I have since retracted this solution as wrong, see comments below; left here for completeness. The ACTUAL solution that really works I've written in a different comment :) ]

I do believe I've solved this. Don't know if anyone is still reading or not after all this time, but here goes.

Eliezer speaks of the symmetry of Pascal's wager; I'm going to use something very similar here to solve the issue. The number of things that could happen next - say, in the next nanosecond - is infinite, or at the very least incalculable. A lot of mundane things could happen, or a lot of unforeseen things could happen. It could happen that a car would go through my living room and kill me. Or it could happen that the laws of energy conservation were violated and the whole world would turn into bleu cheese. Each of these possibilities could, in theory, have a probability assigned to it, given our priors.

But! We only have enough computing power to calculate a finite number of outcomes at any given moment. That means that we CANNOT go around assigning probabilities by calculation. Rather, we're going to need some heuristic to deal with all the probabilities we do NOT calculate.

Suppose our AI is very good at predicting things. It manages to assign SOME probability to what will happen next about 99% of the time (Note: My solution works equally well for anything from 0% to 100% minus epsilon - and I shouldn't have to explain why a Bayesian AI should never be 100% certain that it got an answer right). That means that 1% of the time, something REALLY surprises it; it just did not assign any probability at all. Now, because the number of things that could be in that category is infinite, they cancel out. Sure, we could all turn to cheese if it says "abracadabra". Or we could turn to cheese UNLESS it says so. The utility functions will always end in 0 for the uncalculated mass of probabilities.

That means that the AI always works under the assumptions that "or something I didn't see coming will happen; but I must be neutral regarding such an outcome until I know more about it".

Now. Say the AI manages to consider 1 million possibilities per prediction it makes (how it still gets 1% of them wrong is beyond me but again, the exact number doesn't matter for my solution). So any outcome that has NOT been calculated could, in fact, be considered to have a probability of 1% / 1 million (that is, 10^-8) - not because there are only a million possibilities the AI hasn't considered, but because that is how many it could TRY to consider.

This number is your cutoff. Before you multiply a probability with a utility function, you subtract this number from the probability, first. So now if someone comes up to you and says it'll kill 3^^^^3 people and you decide to actually spend the cycles to consider how likely that is, and you get 1/googol, that number is LESS than the background noise of everything you don't have time to calculate. You round it down to zero, not because it is arbitrarily small enough, but because anything you have not considered for calculation must be considered to have higher probability - and like in Pascal's wager, those options' utility is infinite and can counter any number that Pascal's Mugger can throw at me. You subtract, not an arbitrary number, but rather a number depending on how long the AI is thinking about the problem; how many possibilities it takes into account.
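The cutoff rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in this comment (which the author later retracts), not an endorsed decision procedure. The function names are hypothetical, and `mugger_utility` is a stand-in assumption, since 3^^^^3 is far too large to represent.

```python
from fractions import Fraction

def noise_floor(unexplained_mass: Fraction, considered: int) -> Fraction:
    """Probability credited to any single outcome that was never evaluated."""
    return unexplained_mass / considered

def discounted_probability(p: Fraction, floor: Fraction) -> Fraction:
    """Subtract the floor from p; anything below the floor rounds down to zero."""
    return max(p - floor, Fraction(0))

def discounted_expected_utility(p: Fraction, utility: int, floor: Fraction) -> Fraction:
    """Expected utility of one outcome after applying the cutoff."""
    return discounted_probability(p, floor) * utility

# Numbers from the comment: 1% of surprises, 1 million possibilities considered.
floor = noise_floor(Fraction(1, 100), 10**6)

mugger_p = Fraction(1, 10**100)   # the "1/googol" estimate
mugger_utility = -(10**1000)      # assumption: stand-in for 3^^^^3 deaths

print(floor)                                                        # 1/100000000
print(discounted_expected_utility(mugger_p, mugger_utility, floor)) # 0
```

Under this rule the mugger's threat contributes nothing, however large the stand-in utility is made, because the probability is zeroed before the multiplication happens.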

Does this solve the problem? I think it does.

(By the way: ChrisA's way also works against this problem, except that coding your AI so that it may disregard value and morality if certain conditions are met seems like a pretty risky proposition).

Comment author: Will_Sawin 04 January 2011 04:23:39PM 1 point [-]

The problem is one of rational behavior, not of bounded-rational hacks.

Are you saying that it's a good thing that the AI uses this rounding system and goes against its values this particular time?

If so, how did you tell that it was a good thing?

Can you mathematically formalize that intuition?

If you cannot do so, there is probably some other conflict between your intuitions and your AI code.

Comment author: Polymeron 04 January 2011 05:17:26PM 1 point [-]

Actually, I think I made a mistake there.

Don't get me wrong, in my suggestion the AI is NOT going against its values nor being irrational, and this was not meant as a hack. Rather I'm claiming that the basic method of doing rationality as described needs revision that accounts for practicality, and if you disagree with that then your next rational move should DEFINITELY be to send me 50$ RIGHT NOW because I TOTALLY have a button that kicks 4^^^^4 puppies if I press it RIGHT HERE.

Having said that, I do think I might have made an error of intuition in there, so let's rethink it. Just because we should rethink what constitutes rational behavior does not mean I got it right.

Suppose I am an omnipotent being and have created a button that does something, once, if pressed. I truthfully tell you that there are several possible outcomes:

  1. You receive 10$. This has a 45% chance of happening.

  2. You lose 5$. This, too, has a 45% chance of happening.

  3. Something else happens.

You should be pretty interested in what this "something else" might be before you press the button, since I've put absolutely no bounds on it. You could win 1000$. Or you could die. The whole world could die. You would wake up in a protein bath outside the Matrix. etc. etc. Some of these things you might be able to prepare for, if you know about them in advance.

If you're rational and you get no further information, you should probably press the button. The expected gain from the two known outcomes is positive (0.45 × 10$ − 0.45 × 5$ = 2.25$); as in Pascal's Wager, the infinity of possibilities that stem from the third option cancel each other out.

Now, suppose that beforehand I tell you that you get 10 guesses as to what the third thing is. Every time you guess, I tell you the precise probability that this thing will happen. Furthermore, the third option could do at least 12 different things, so no matter what you guessed, you would not be able to tell exactly what the button might do.

So you start guessing. One of your guesses is "3^^^^3 people will die horribly". I rate that one as a 10^-100 chance.

You've reached the end of the guesses and still a full 5% of probability remains - half of the third option's share.

So. Now do we press the button?

My claim was that you should ignore every outcome with less than a 1% chance in this case, regardless of its utility. This now seems to me like a mistake. In theory, when we add up the utility of all known options, it comes out extremely negative, because the remaining 5% of unknowns still have effectively zero chance of happening each, and they STILL cancel each other out.
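A rough sketch of why the sum "comes out extremely negative": the two stated outcomes are worth a couple of dollars in expectation, but a single guessed outcome at probability 10^-100 swamps them once its disutility is large enough. The value `HUGE_HARM` is an assumption standing in for 3^^^^3 deaths (which cannot be represented), and putting dollars and deaths on one utility scale is likewise a simplification for illustration.

```python
from fractions import Fraction

def expected_utility(outcomes):
    """Sum of probability * utility over a list of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

# The two outcomes stated outright: win 10$ or lose 5$, each at 45%.
known = [(Fraction(45, 100), 10), (Fraction(45, 100), -5)]

# Assumption: a stand-in magnitude, vastly smaller than 3^^^^3.
HUGE_HARM = -(10**1000)
guessed = [(Fraction(1, 10**100), HUGE_HARM)]

ev_known = expected_utility(known)           # 9/4, i.e. 2.25$
ev_all = expected_utility(known + guessed)   # 9/4 - 10**900

print(ev_known)
print(ev_all < 0)   # True
```

The unguessed 5% is left out on purpose, matching the assumption in the comment that its contributions cancel to zero.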

I think I even know where my mathematical error was: I was assuming that anything less than 1% is a waste of a guess and therefore we should have guessed something else, which quite possibly has a higher chance - this establishes a cutoff for "a calculation that was not worth doing". However in this new example there are at least 12 things the button can do; essentially the number is infinite as far as I know. I should count myself VERY lucky to get 1% or more for anything I guess. In fact I should expect to get an answer of zero or epsilon for pretty much everything. That means that no guess is truly wasted or trivial.

Of course, if we don't press the button the Pascal Muggers will have won...

Back to the drawing board, I guess? :-/

Comment author: Will_Sawin 04 January 2011 05:56:44PM 2 points [-]

If the injured parties are humans, I should be very skeptical of the assertion because a very small fraction, (1/3^^3)*1/10^(something), of people have the power of life and death over 3^^^3 other people, whereas 1/10^(something smaller) hear the corresponding hoax.

That's the only answer that makes sense because it's the only answer that works on a scale of 3^^^3.

I think.
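One way to make this anthropic argument concrete, as a sketch only: let the fraction of people who genuinely hold such power be (1/N) × 10^-a for N potential victims, and let the fraction who hear the hoax be 10^-b with b smaller than a. The exponents `A` and `B` below are hypothetical, and the values of N are stand-ins, since 3^^^3 itself cannot be represented. The expected number of victims then stays below 10^(b-a) no matter how large N gets.

```python
from fractions import Fraction

def expected_victims(n_victims: int, a: int, b: int) -> Fraction:
    """Expected victims given that you hear the threat.

    genuine: fraction of people who truly control n_victims lives.
    hoax:    fraction of people who hear a false version of the threat.
    """
    genuine = Fraction(1, n_victims) * Fraction(1, 10**a)
    hoax = Fraction(1, 10**b)
    p_genuine = genuine / (genuine + hoax)   # Bayes: P(genuine | threat heard)
    return n_victims * p_genuine

A, B = 20, 6   # hypothetical exponents, with B < A as in the comment
bound = Fraction(1, 10**(A - B))

results = [expected_victims(n, A, B) for n in (10**9, 10**100, 10**1000)]
bounds_hold = all(ev < bound for ev in results)
print(bounds_hold)   # True
```

The 1/N factor in the prior cancels the N in the payoff, which is why the claimed size of the threat stops mattering under these assumptions.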

Comment author: Polymeron 04 January 2011 06:23:59PM 0 points [-]

"If the injured parties are humans, I should be very skeptical of the assertion because a very small fraction, (1/3^^3)*1/10^(something)"

You don't know that. Or rather, you know it only with some degree of uncertainty, which, if I thought I had a lot on the line, I might not take lightly.

I'm trying to think up several avenues. One is that the higher the claimed utility, the lower the probability (somehow); another tries to use the implications that accepting the claim would have on other probabilities in order to cancel it out.

I'll post a new comment if I manage to come up with anything good.

Comment author: Will_Sawin 04 January 2011 10:10:53PM 2 points [-]

I know because of anthropics. It is a logical impossibility for more than 1/3^^^3 of individuals to have that power. You and I cannot both have power over the same thing, so the total amount of power is bounded, hopefully by the same population count we use to calculate anthropics.

Comment author: endoself 04 January 2011 10:41:14PM *  2 points [-]

Not in the least convenient possible world. What if someone told you that 3^^^3 copies of you were made before you must make your decision and that their behaviour was highly correlated as applies to UDT? What if the beings who would suffer had no consciousness, but would have moral worth as judged by you(r extrapolated self)? What if there was one being who was able to experience 3^^^3 times as much eudaimonia as everyone else? What if the self-indication assumption is right?

<troll> If you're going to engage in motivated cognition at least consider the least convenient possible world. </troll>

Comment author: Will_Sawin 05 January 2011 02:07:59AM 0 points [-]
  1. Am I talking to Omega now, or just some random guy? I don't understand what is being discussed. Please elaborate?

  2. Then my expected utility would not be defined. There would be relatively simple worlds with arbitrarily many of them. I honestly don't know what to do.

  3. Then my expected utility would not be defined. There would be relatively simple agents with arbitrarily sensitive utilities.

  4. Then I would certainly live in a world with infinitely many agents (or I would not live in any worlds with any probability), and the SIA would be meaningless.

My cognition is motivated by something else - by the desire to avoid infinities.

Comment author: endoself 05 January 2011 04:28:19AM *  0 points [-]

1) Sorry, I confused this with another problem; I meant some random guy.

2/3) Isn't how your decision process handles infinities rather important? Is there any theorem corresponding to the Von Neumann–Morgenstern utility theorem but without using either version of axiom 3? I have been meaning to look into this and depending on what I find I may do a top-level post about it. Have you heard of one?

edit: I found Fishburn, 1971, A Study of Lexicographic Expected Utility, Management Science. It's behind a paywall at http://www.jstor.org/pss/2629309. Can anyone find a non-paywall version or email it to me?

4) Yeah, my fourth one doesn't work. I really should have known better.

Sometimes, infinities must be made rigorous rather than eliminated. I feel that, in this case, it's worth a shot.

Comment author: Will_Sawin 05 January 2011 12:13:16PM 3 points [-]

What worries me about infinities is, I suppose, the infinite Pascal's mugging - whenever there's a single infinite broken symmetry, nothing that happens in any finite world matters to determine the outcome.

This implies that all our thought should be devoted to infinite rather than finite worlds. And if all worlds are infinite, it looks like we need to do some form of SSA dealing with utility again.

This is all very convenient and not very rigorous, I agree. I cannot see a better way, but I agree that we should look. I will use university library powers to read that article and send it to you, but not right now.