St. Petersburg Mugging Implies You Have Bounded Utility

TimFreeman

This post describes an infinite gamble that, under some reasonable assumptions, will motivate people who act to maximize an unbounded utility function to send me all their money. In other words, if you understand this post and it doesn't motivate you to send me all your money, then you have a bounded utility function, or perhaps even upon reflection you are not choosing your actions to maximize expected utility, or perhaps you found a flaw in this post.

Briefly, I take the St. Petersburg Paradox and convert it to a mugging along the lines of Pascal's Mugging, then tweak it to extract all of the money instead of just a fixed sum.

I have always wondered if any actual payments have resulted from Pascal's Mugging, so I intend to track payments received for this variation. If anyone does have unbounded utility and wants to prove me wrong by sending money, send it with PayPal to tim at fungible dot com. Annotate the transfer with the phrase "St. Petersburg Mugging", and I'll edit this article periodically to say how much money I received. In order to avoid confusing the experiment, and to exercise my spite, I promise I will not spend the money on anything you will find especially valuable. SIAI would be a better charity, if you want to do charity, but don't send that money to me.

Here's the hypothetical (that is, false) offer to persons with unbounded utility:

Let's call your utility function "UTILITY". We assume it takes a state of the universe as an argument.
Define DUT to be UTILITY(the present situation plus you receiving $1000)-UTILITY(the present situation). Here DUT stands for Difference in UTility. We assume DUT is positive.
You have unbounded utility, so for each nonnegative N there is a universe UN(N) such that UTILITY(UN(N)) is at least DUT * 2**N. Here UN stands for "universe".
The phrase "I am a god" is defined to mean that I am able to change the universe to any state I choose. I may not be a god after I make the change.
The offer is: For every dollar you send me, I will flip a coin. If it comes out Tails, or I am not a god, I will do nothing. If it comes out Heads and I am a god, I will flip the coin repeatedly until I see it come up Heads again. Let T be the number of times it was Tails. I will then change the universe to UN(T).
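
As a minimal sketch of the coin-flipping procedure in the offer, here is a small simulation. It assumes the hypothetical case where I am a god and the offer is real; the names flip_outcome and tally are made up for illustration, and a pseudorandom generator stands in for the coin.

```python
import random

def flip_outcome(rng, is_god=True):
    """Simulate the offer for one dollar.

    Returns None if nothing happens (first flip is Tails, or I am not a god).
    Otherwise returns T, the number of Tails seen before the next Heads.
    """
    # First flip: a value below 0.5 stands for Tails.
    if not is_god or rng.random() < 0.5:
        return None
    tails = 0
    # Keep flipping until Heads comes up again, counting the Tails.
    while rng.random() < 0.5:
        tails += 1
    return tails

def tally(num_dollars, seed=0):
    """Count how often each outcome occurs over num_dollars independent dollars."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(num_dollars):
        t = flip_outcome(rng)
        counts[t] = counts.get(t, 0) + 1
    return counts

if __name__ == "__main__":
    for outcome, count in sorted(tally(100000).items(), key=lambda kv: (kv[0] is not None, kv[0] or 0)):
        print(outcome, count)
```

Roughly half of the dollars should produce no result, and among the rest each additional Tail should be about half as common as the one before.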

If I am lying and the offer is real, and I am a god, what utility will you receive from sending me a dollar? Well, once the first flip has come up Heads, the probability of me seeing N Tails followed by a Head is (1/2)**(N + 1), and your utility for the resulting universe is UTILITY(UN(N)) >= DUT * 2**N, so the case where I see N Tails contributes (1/2)**(N + 1) * UTILITY(UN(N)) >= (1/2)**(N + 1) * DUT * 2**N = DUT/2 to your expected utility. There are infinitely many possible values for N, each contributing at least DUT/2, so your total expected utility is at least positive infinity * DUT/2, which is positive infinity.
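
The divergence can be checked with exact arithmetic. The sketch below sums the lower bound on the first K terms of the series; the function names lower_bound_term and partial_sum are hypothetical, and DUT is set to 1 purely for illustration.

```python
from fractions import Fraction

def lower_bound_term(n, dut):
    """Lower bound on the contribution from seeing n Tails then a Head.

    Probability (1/2)**(n + 1) times the minimum utility DUT * 2**n.
    """
    return Fraction(1, 2) ** (n + 1) * dut * 2 ** n

def partial_sum(k, dut=Fraction(1)):
    """Sum of the lower bounds for N = 0, 1, ..., k - 1."""
    return sum(lower_bound_term(n, dut) for n in range(k))

if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        print(k, partial_sum(k))
```

Every term equals DUT/2 regardless of N, so the partial sum over the first K terms is K * DUT/2, which grows without bound as K increases.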

I hope we agree that it is unlikely that I am a god, but it's consistent with what you have observed so far, so unless you were born with certain knowledge that I am not a god, you have to assign positive probability to it. Similarly, the probability that I'm lying and the above offer is real is also positive. The product of two positive numbers is positive. Combining this with the result from the previous paragraph, your expected utility from sending me a dollar is infinitely positive.

If you send me one dollar, there will probably be no result. Perhaps I am a god, and the above offer is real, but I didn't do anything beyond flipping the first coin because it came out Tails. In that case, nothing happens. Your expected utility for the next dollar is also infinitely positive, so you should send the next dollar too. By induction you should send me all your dollars.

If you don't send money because you have bounded utility, that's my desired outcome. If you do feel motivated to send me money, well, I suppose I lost the argument. Remember to send all of it, and remember that you can always send me more later.

As of 7 June 2011, nobody has sent me any money for this.

ETA: Some interesting issues keep coming up. I'll put them here to decrease the redundancy:

Yes, you can justify not giving me money because I might be a god by claiming that there are lots of other unlikely gods that have a better claim on your resources. My purpose in writing this post is to find a good reason not to be jerked around by unlikely gods in general. Finding a reason to be jerked around by some other unlikely god is missing the point.
I forgot to mention that if I am a god, I can stop time while I flip coins, so we aren't resource-constrained on the number of times I can flip the coin.
Yes, you can say that your prior probability of me being a god is zero. If you want to go that way, can you say what that prior probability distribution looks like in general? I'm actually more worried about making a Friendly AI that gets jerked around by an unlikely god that we did not plan for, so having a special case about me being god doesn't solve an interesting portion of the problem. For what it's worth, I believe the Universal Prior would give positive small probability to many scenarios that have a god, since universes with a god are not incredibly much more complex than universes that don't have a god.

When I thought about it, I realized this seemed very similar to a standard hack used on people that we already rely on computers to defend us against. To be specific, it follows an incredibly similar framework to one of those Lottery/Nigerian 419 Scam emails.

Opening Narrative: Attempt to establish some level of trust and believability. Things with details tend to be more believable than things without details, although the conjunction fallacy can be tricky here.

Present the target with two choices: (Hope they don't realize it's a false dichotomy)

Choice A: Send in a small amount of utility. (If Choice A is selected, repeat the false dichotomy.)

Choice B: Allow a (fake) opportunity to acquire a large amount of (fake) utility to slip away.

Background: Make a MASSIVE number of attempts. Even if the first 10,000 attempts fail, the cost of trying to hack is minimal compared to the rewards.

So to reduce it to a simpler problem, the first question seems to be, how do we create the best known spam filter we have right now?

And then the second question seems to be "How can we make a spam filter MUCH better than that?" It has to protect our Lovable Senile Billionaire Grandpa who has Nuclear Weapons and a tendency to believe everything, and who relies on emails for critical world-altering decisions, so a single false positive or false negative means terrible costs. That calls for epic levels of spam filtering.

So to attempt to help that, I'll try to list all of the Antispam tactics I can find, to at least help with the first part.

Bayesian Spam filtering: I was going to try to summarize this, but honestly, the Wikipedia article does a better job than I can: http://en.wikipedia.org/wiki/Bayesian_spam_filtering
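
For a rough idea of what such a filter does, here is a minimal naive Bayes sketch. It is only an illustration under simplifying assumptions (whitespace tokenization, add-one smoothing, a tiny made-up training set); the class name NaiveBayesSpamFilter and its methods are hypothetical, not taken from any real library.

```python
import math
from collections import Counter

class NaiveBayesSpamFilter:
    """Toy word-count naive Bayes classifier with two labels: spam and ham."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Record one labeled message and the words it contains.
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed log prior, so a class with no examples is never impossible.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing: unseen words get a small nonzero probability.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Normalize the two log scores into a probability of spam.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])

if __name__ == "__main__":
    f = NaiveBayesSpamFilter()
    f.train("send me all your money now", "spam")
    f.train("send money to claim your prize", "spam")
    f.train("meeting notes attached for tomorrow", "ham")
    f.train("lunch tomorrow with the team", "ham")
    print(f.spam_probability("send money now"))
    print(f.spam_probability("meeting notes attached"))
```

A filter meant for a hostile environment would need far more training data and a much more careful treatment of tokens than this.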

Training Phase: The Training Phase for a Bayesian Spam Filter which needs to go live in a super hostile environment should be as long and thorough as it possibly can. You know how some places use validation and some use bounty testing? We should use both.

Sysadmin: Multiple someones need to begin by reviewing everything. Then, they need to continue by reviewing everything. There's a built-in human tendency to ignore risks after you've been dealing with them for a while and nothing has happened. I don't know what it's called exactly, but presumably there are countermeasures in place at top-secret-type facilities that need extremely vigilant security guards at all times. We need to begin by adopting those, and then again, validate and offer bounties to hackers while the system isn't live.

Secrecy: Any explicit, open list of countermeasures can generally be planned around by a determined hacker. Hackers can't plan for security measures they aren't aware of. The secrecy ALSO needs to be validated and bounty tested.

At first, this does sound a bit contradictory. (How do you do an open source test of "Secrecy"?) But you'd want to do that first, before, say, having the FAI develop its own spam filtering that the public shouldn't know about. Google has this problem sometimes where they battle with Search Engine Optimizers who are trying to fake having genuine good content when they are, in fact, irrelevant and trying to sell you on lies to make money (much like our Muggers, really). We need to find additional ways to fight spamdexing as well. This has another good Wikipedia page: http://en.wikipedia.org/wiki/Spamdexing

The Bounty system is important because we want to take advantage of temporal discounting. People will frequently take small amounts of utility now over large amounts of utility later, even to irrational levels, so we need to offer bribes so that the kinds of people who might try to trick the FAI later come to trick it during development, while we are still actively fixing problems and it doesn't yet have massive responsibilities.

From my personal experience coding, another good way to make sure your code is developed well enough to withstand all sorts of attacks and problems is to have an incredibly robust set of test data. Problems that can't be seen with a single version and 50 records and 2 users often pop up across multiple versions and 1 million records and 70 users. So we need that as well, only more so.

A lot of this may be common knowledge already, but I thought listing everything I knew would be a good starting point for additional security measures.
