St. Petersburg Mugging Implies You Have Bounded Utility

10 TimFreeman 07 June 2011 03:06PM

This post describes an infinite gamble that, under some reasonable assumptions, will motivate people who act to maximize an unbounded utility function to send me all their money. In other words, if you understand this post and it doesn't motivate you to send me all your money, then you have a bounded utility function, or perhaps even upon reflection you are not choosing your actions to maximize expected utility, or perhaps you found a flaw in this post.


A summary of Savage's foundations for probability and utility.

34 Sniffnoy 22 May 2011 07:56PM

Edit: I think the P2c I wrote originally may have been a bit too weak; fixed that. Nevermind, rechecking, that wasn't needed.

More edits (now consolidated): Edited nontriviality note. Edited totality note. Added in the definition of numerical probability in terms of qualitative probability (though not the proof that it works). Also slight clarifications on implications of P6' and P6''' on partitions into equivalent and almost-equivalent parts, respectively.

One very late edit, June 2: Even though we don't get countable additivity, we still want a σ-algebra rather than just an algebra (this is needed for some of the proofs in the "partition conditions" section that I don't go into here). Also noted nonemptiness of gambles.

The idea that rational agents act in a manner isomorphic to expected-utility maximizers is often used here, typically justified with the Von Neumann-Morgenstern theorem.  (The last of Von Neumann and Morgenstern's axioms, the independence axiom, can be grounded in a Dutch book argument.)  But the Von Neumann-Morgenstern theorem assumes that the agent already measures its beliefs with (finitely additive) probabilities.  This in turn is often justified with Cox's theorem (valid so long as we assume a "large world", which is implied by e.g. the existence of a fair coin).  But Cox's theorem assumes as an axiom that the plausibility of a statement is taken to be a real number, a very large assumption!  I have also seen this justified here with Dutch book arguments, but these all seem to assume that we are already using some notion of expected utility maximization (which is not only somewhat circular, but also a considerably stronger assumption than that plausibilities are measured with real numbers).

There is a way of grounding both (finitely additive) probability and utility simultaneously, however, as detailed by Leonard Savage in his Foundations of Statistics (1954).  In this article I will state the axioms and definitions he gives, give a summary of their logical structure, and suggest a slight modification (which is equivalent mathematically but slightly more philosophically satisfying).  I would also like to ask the question: To what extent can these axioms be grounded in Dutch book arguments or other more basic principles?  I warn the reader that I have not worked through all the proofs myself and I suggest simply finding a copy of the book if you want more detail.

Peter Fishburn later showed in Utility Theory for Decision Making (1970) that the axioms set forth here actually imply that utility is bounded.

(Note: The versions of the axioms and definitions in the end papers are formulated slightly differently from the ones in the text of the book, and in the 1954 version have an error. I'll be using the ones from the text, though in some cases I'll reformulate them slightly.)


How I Lost 100 Pounds Using TDT

70 Zvi 14 March 2011 03:50PM

Background Information: Ingredients of Timeless Decision Theory

Alternate Approaches Include: Self-empathy as a source of “willpower”, Applied Picoeconomics, Akrasia, hyperbolic discounting, and picoeconomics, Akrasia Tactics Review

Standard Disclaimer: Beware of Other-Optimizing

Timeless Decision Theory (or TDT) allowed me to succeed in gaining control over when and how much I ate in a way that previous attempts at precommitment had repeatedly failed to do. I did so well before I was formally exposed to the concept of TDT, but once I clicked on TDT I understood that I had effectively been using it. That click came from reading Eliezer’s shortest summary of TDT, which was:

The one-sentence version is:  Choose as though controlling the logical output of the abstract computation you implement, including the output of all other instantiations and simulations of that computation.

You can find more here, but my recommendation, at least at first, is to stick with the one-sentence version. It is as simple as it can be, but no simpler.

Utilizing TDT gave me several key abilities that I previously lacked. The most important was realizing that what I chose now would be the same choice I would make at other times under the same circumstances. This allowed me to compare having the benefits now to paying the costs now, as opposed to paying costs now for future benefits later. This ability allowed me to overcome hyperbolic discounting. The other key ability was that it freed me from the need to explicitly stop in advance to make precommitments each time I wanted to alter my instinctive behavior. Instead, it became automatic to make decisions in terms of which rules would be best to follow.


Updateless anthropics

8 Stuart_Armstrong 20 February 2011 07:23PM

Three weeks ago, I set out to find a new theory of anthropics, to try and set decision theory on a firm footing with respect to copying, deleting copies, merging them, correlated decisions, and the presence or absence of extra observers. I've since come full circle, and realised that UDT already has a built-in anthropic theory, that resolves a lot of the problems that had been confusing me.

The theory is simple, and is essentially a rephrasing of UDT: if you are facing a decision X, and trying to figure out the utility of X=a for some action a, then calculate the full expected utility of X being a, given the objective probabilities of each world (including those in which you don't exist).

As usual, you have to consider the consequences of X=a for all agents who will make the same decision as you, whether they be exact copies, enemies, simulations or similar-minded people. However, your utility will have to do more work than is usually realised: notions such as selfishness or altruism with respect to your copies have to be encoded in the utility function, and will result in substantially different behaviour.

The rest of the post is a series of cases-studies illustrating this theory. Utility is assumed to be linear in cash for convenience.

Sleeping with the Presumptuous Philosopher 

The first test case is the Sleeping Beauty problem.


In its simplest form, this involves a coin toss; if it comes out heads, one copy of Sleeping Beauty is created. If it comes out tails, two copies are created. Then the copies are asked at what odds they would be prepared to bet that the coin came out tails. You can assume either that the different copies care for each other in the manner I detailed here, or more simply that all winnings will be kept by a future merged copy (or an approved charity).

Then the algorithm is simple: the two worlds have equal probability. Let X be the decision where Sleeping Beauty decides between a contract that pays out $1 if the coin is heads, versus one that pays out $1 if the coin is tails. If X="heads" (to use an obvious shorthand), then Sleeping Beauty will expect to make $1*0.5, as she is offered the contract once. If X="tails", then the total return of that decision is $1*2*0.5, as copies of her will be offered the contract twice, and they will all make the same decision. So Sleeping Beauty will follow the SIA 2:1 betting odds of tails over heads.
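This arithmetic can be sketched in a few lines (a minimal illustration, assuming utility linear in cash and that all copies' winnings pool together, as in the merged-copy or charity case):

```python
# Updateless evaluation of the simple Sleeping Beauty bet.
# Worlds: heads (one copy asked), tails (two copies asked), equal probability.
# All winnings pool together, so utility is the total cash across copies.

def expected_utility(bet):
    """Expected pooled winnings of the strategy 'always bet on `bet`'."""
    p_heads = p_tails = 0.5
    u_heads = 1.0 if bet == "heads" else 0.0   # one $1 contract offered
    u_tails = 2.0 if bet == "tails" else 0.0   # two copies, two contracts
    return p_heads * u_heads + p_tails * u_tails

print(expected_utility("heads"))  # 0.5
print(expected_utility("tails"))  # 1.0, hence the 2:1 betting odds on tails
```

Betting tails is worth twice as much as betting heads, which is exactly where the 2:1 odds come from.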

Variants such as "extreme Sleeping Beauty" (where thousands of copies are created on tails) will behave in the same way; if it feels counter-intuitive to bet at thousands-to-one odds that a fair coin landed tails, it's the fault of expected utility itself, as the rewards of being right dwarf the costs of being wrong.

But now let's turn to the Presumptuous Philosopher, a thought experiment that is often confused with Sleeping Beauty. Here we have exactly the same setup as "extreme Sleeping Beauty", but the agents (the Presumptuous Philosophers) are mutually selfish. Here the return to X="heads" remains $1*0.5. However the return to X="tails" is also $1*0.5, since even if all the Presumptuous Philosophers in the "tails" universe bet on "tails", each one will still only get $1 in utility. So the Presumptuous Philosopher should only accept even 1:1 SSA betting odds on the result of the coin flip.
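The contrast between the two cases comes down to a single line in the utility calculation (a sketch; the figure of 1000 copies is an illustrative stand-in for "thousands"):

```python
# "Extreme Sleeping Beauty" vs the Presumptuous Philosopher: identical
# setup, different utility functions.

def updateless_eu(bet, selfish, n_tails_copies=1000):
    """Updateless expected utility of the strategy 'always bet on `bet`'.

    selfish=False: copies pool their winnings (Sleeping Beauty).
    selfish=True: each agent only values its own $1 (Presumptuous
    Philosopher), so the tails world is still worth just $1 to him.
    """
    p = 0.5
    if bet == "heads":
        return p * 1.0  # a single $1 payout in the heads world
    return p * (1.0 if selfish else n_tails_copies * 1.0)

print(updateless_eu("tails", selfish=False))  # 500.0: bet tails at long odds
print(updateless_eu("tails", selfish=True))   # 0.5: same as heads, 1:1 odds
```

With pooled utility the tails bet dominates at 1000:1; with selfish utility both bets are worth $0.5, giving the SSA-style 1:1 odds.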

So SB acts as if she follows the self-indication assumption (SIA), while the PP follows the self-sampling assumption (SSA). This remains true if we change the setup so that only one agent is given a betting opportunity in the tails universe. Then the objective probability of any particular agent being asked is low, so both SB and PP model the "objective probability" of the tails world, given that they have been asked to bet, as being low. However, SB gains utility if any of her copies is asked to bet and receives a profit, so the strategy "if I'm offered $1 if I guess correctly whether the coin is heads or tails, I will say tails" gets her $1*0.5 utility whether or not she is the specific one who is asked. Betting heads nets her the same result, so SB will give SIA 1:1 odds in this case.

On the other hand, the PP will only gain utility in the very specific world where he himself is asked to bet. So his gain from the updateless "if I'm offered $1 if I guess correctly whether the coin is heads or tails, I will say tails" is tiny, as he's unlikely to be asked to bet. Hence he will offer the SSA odds that make heads a much more "likely" proposition.

The Doomsday argument

Now, using SSA odds brings us back into the realm of the classical Doomsday argument. How is it that Sleeping Beauty is immune to the Doomsday argument while the Presumptuous Philosopher is not? Which one is right; is the world really about to end?

Asking about probabilities independently of decisions is meaningless here; instead, we can ask what would agents decide in particular cases. It's not surprising that agents will reach different decisions on such questions as, for instance, existential risk mitigation, if they have different preferences.

Let's do a very simplified model, where there are two agents in the world, and one of them is approached at random to see if they would pay $Y to add a third agent. Each agent derives a (non-indexical) utility of $1 for the presence of this third agent, and nothing else happens in the world to increase or decrease anyone's utility.

First, let's assume that each agent is selfish about their indexical utility (their cash in the hand). If the decision is to not add a third agent, all will get $0 utility. If the decision is to add a third agent, then there are three agents in the world, and one of them will be approached to lose $Y. Hence the expected utility is $(1-Y/3).

Now let us assume the agents are altruistic towards each other's indexical utilities. Then the expected utility of not adding a third agent is still $0. If the decision is to add a third agent, then there are three agents in the world, and one of them will be approached to lose $Y - but all will value that loss at the same amount. Hence the expected utility is $(1-Y).

So if $Y=$2, for instance, the "selfish" agents will add the third agent, and the "altruistic" ones will not. So generalising this to more complicated models describing existential risk mitigation schemes, we would expect SB-type agents to behave differently to PP-types in most models. There is no sense in asking which one is "right" and which one gives the more accurate "probability of doom"; instead ask yourself which better corresponds to your own utility model, hence what your decision will be.
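The two calculations can be sketched directly (a minimal illustration of the (1-Y/3) versus (1-Y) expected utilities):

```python
# Simplified Doomsday-style model: should the agents pay $Y to add a
# third agent, who is worth $1 of non-indexical utility to everyone?

def eu_add_third_agent(Y, altruistic):
    """Expected utility of deciding to add the third agent.

    After the addition there are three agents, and one of them is
    approached at random to pay $Y. Altruists count the payer's loss in
    full; selfish agents only count their own 1/3 chance of paying.
    """
    expected_loss = Y if altruistic else Y / 3.0
    return 1.0 - expected_loss

print(round(eu_add_third_agent(2.0, altruistic=False), 2))  # 0.33: add the agent
print(eu_add_third_agent(2.0, altruistic=True))             # -1.0: don't
```

At Y=2 the selfish agents come out ahead by adding the agent while the altruistic ones lose, reproducing the split described above.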

Psy-Kosh's non-anthropic problem

Cousin_it has a rephrasing of Psy-Kosh's non-anthropic problem to which updateless anthropics can be illustratively applied:

You are one of a group of 10 people who care about saving African kids. You will all be put in separate rooms, then I will flip a coin. If the coin comes up heads, a random one of you will be designated as the "decider". If it comes up tails, nine of you will be designated as "deciders". Next, I will tell everyone their status, without telling the status of others. Each decider will be asked to say "yea" or "nay". If the coin came up tails and all nine deciders say "yea", I donate $1000 to VillageReach. If the coin came up heads and the sole decider says "yea", I donate only $100. If all deciders say "nay", I donate $700 regardless of the result of the coin toss. If the deciders disagree, I don't donate anything.

We'll set aside the "deciders disagree" case and assume that you will all reach the same decision. The point of the problem was to illustrate a supposed preference inversion: if you coordinate ahead of time, you should all agree to say "nay", but after you have been told you're a decider, you should update in the direction of the coin coming up tails, and say "yea".

From the updateless perspective, however, there is no mystery here: the strategy "if I were a decider, I would say nay" maximises utility both for the deciders and the non-deciders.

But what if the problem were rephrased in a more selfish way, with the non-deciders not getting any utility from the setup (maybe they don't get to see the photos of the grateful saved African kids), while the deciders got the same utility as before? Then the strategy "if I were a decider, I would say yea" maximises your expected utility, because non-deciders get nothing, thus reducing the expected utility gains and losses in the world where the coin came out heads. This is similar to SIA odds, again.
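Both versions can be checked numerically (a sketch; weighting each world by your chance of being a decider is my own encoding of the selfish variant):

```python
# Updateless evaluation of Psy-Kosh's problem and its selfish variant.

def eu(strategy, shared):
    """Expected utility of the strategy 'all deciders say `strategy`'.

    shared=True: all ten agents value the donation (original problem).
    shared=False: only deciders derive utility, so each world is weighted
    by the chance that you, one of the ten, are a decider in it.
    """
    donation = {("yea", "heads"): 100, ("yea", "tails"): 1000,
                ("nay", "heads"): 700, ("nay", "tails"): 700}
    total = 0.0
    for coin, n_deciders in (("heads", 1), ("tails", 9)):
        # shared: all 10 agents count; selfish: only the n_deciders/10
        # chance that you personally are a decider counts.
        weight = 10 if shared else n_deciders
        total += 0.5 * weight * donation[(strategy, coin)]
    return total / 10.0

print(eu("yea", shared=True), eu("nay", shared=True))    # 550.0 700.0 -> nay
print(eu("yea", shared=False), eu("nay", shared=False))  # 455.0 350.0 -> yea
```

With shared utility "nay" wins (700 vs 550); with selfish utility the heads world is discounted tenfold and "yea" wins (455 vs 350), matching the SIA-flavoured answer.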

That second model is similar to the way I argued for SIA with agents getting created and destroyed. That post has been superseded by this one, which pointed out the flaw in that argument: roughly speaking, it failed to consider setups like Psy-Kosh's original model. So once again, whether utility is broadly shared or not affects the outcome of the decision.

The Anthropic Trilemma

Eliezer's anthropic trilemma was an interesting puzzle involving probabilities, copying, and subjective anticipation. It inspired me to come up with a way of spreading utility across multiple copies which was essentially a Sleeping Beauty copy-altruistic model. The decision process going with it is then the same as the updateless decision process outlined here. Though initially it was phrased in terms of SIA probabilities and individual impact, the isomorphism between the two can be seen here.

You're in Newcomb's Box

40 HonoreDB 05 February 2011 08:46PM

Part 1:  Transparent Newcomb with your existence at stake

Related: Newcomb's Problem and Regret of Rationality

 

Omega, a wise and trustworthy being, presents you with a one-time-only game and a surprising revelation.  

 

"I have here two boxes, each containing $100," he says.  "You may choose to take both Box A and Box B, or just Box B.  You get all the money in the box or boxes you take, and there will be no other consequences of any kind.  But before you choose, there is something I must tell you."

 

Omega pauses portentously.

 

"You were created by a god: a being called Prometheus.  Prometheus was neither omniscient nor particularly benevolent.  He was given a large set of blueprints for possible human embryos, and for each blueprint that pleased him he created that embryo and implanted it in a human woman.  Here was how he judged the blueprints: any that he guessed would grow into a person who would choose only Box B in this situation, he created.  If he judged that the embryo would grow into a person who chose both boxes, he filed that blueprint away unused.  Prometheus's predictive ability was not perfect, but it was very strong; he was the god, after all, of Foresight."

 

Do you take both boxes, or only Box B?


Counterfactual Calculation and Observational Knowledge

11 Vladimir_Nesov 31 January 2011 04:28PM

Consider the following thought experiment ("Counterfactual Calculation"):

You are taking a test, which includes a question: "Is Q an even number?", where Q is a complicated formula that resolves to some natural number. There is no a priori reason for you to expect that Q is more likely even or odd, and the formula is too complicated to compute the number (or its parity) on your own. Fortunately, you have an old calculator, which you can use to type in the formula and observe the parity of the result on display. This calculator is not very reliable, and is only correct 99% of the time, furthermore its errors are stochastic (or even involve quantum randomness), so for any given problem statement, it's probably correct but has a chance of making an error. You type in the formula and observe the result (it's "even"). You're now 99% sure that the answer is "even", so naturally you write that down on the test sheet.

Then, unsurprisingly, Omega (a trustworthy all-powerful device) appears and presents you with the following decision. Consider the counterfactual where the calculator displayed "odd" instead of "even", after you've just typed in the (same) formula Q, on the same occasion (i.e. all possible worlds that fit this description). The counterfactual diverges only in the calculator showing a different result (and what follows). You are to determine what is to be written (by Omega, at your command) as the final answer to the same question on the test sheet in that counterfactual (the actions of your counterfactual self who takes the test in the counterfactual are ignored).


In the Pareto-optimised crowd, be sure to know your place

4 Stuart_Armstrong 07 January 2011 06:12PM

tldr: In a population playing independent two-player games, Pareto-optimal outcomes are only possible if there is an agreed universal scale of value relating each player's utility, and the players then act to maximise the scaled sum of all utilities.

In a previous post, I showed that if you are about to play a bargaining game with someone when the game's rules are initially unknown, then the best plan is not to settle on a standard result like the Nash Bargaining Solution or the Kalai-Smorodinsky Bargaining Solution (see this post). Rather, it is to decide in advance how much your respective utilities are worth relative to each other, and then maximise their sum. Specifically, if you both have (representatives of) utility functions u1 and u2, then you must pick a θ>0 and maximise u1+θu2 (with certain extra measures to break ties). This result also applies if the players are to play a series of known independent games in sequence. But how does this extend to more than two players?

Consider the case where there are three players (named imaginatively 1, 2 and 3), who are going to pair off in each of the possible pairs (12, 23 and 31) and each play a game. The utility gains from each game are presumed to be independent. Then each of the pairs will choose factors θ12, θ23 and θ31, and seek to maximise u1+θ12u2, u2+θ23u3 and u3+θ31u1 respectively. Note here that I am neglecting tie-breaking and such; the formal definitions needed will be given in the proof section.

A very interesting situation comes up when θ12θ23θ31=1. In that case, there is a universal scale of "worth" for each of the utilities: it's as if the three utilities are pounds, dollars and euros. Once you know the exchange rate from pounds to dollars (θ12), and from dollars to euros (θ23), then you know the exchange rate from euros to pounds (θ31=1/(θ12θ23)). We'll call these situations transitive.
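The currency analogy can be made concrete (a sketch with illustrative rates; the specific numbers are mine):

```python
# Currency-style transitivity: fix player 1's utility as the unit of
# account, then player 2's is worth theta_12 units and player 3's is
# worth theta_12 * theta_23 units.
theta_12, theta_23 = 1.5, 2.0             # illustrative exchange rates
theta_31 = 1.0 / (theta_12 * theta_23)    # forced if the triple is transitive

w = (1.0, theta_12, theta_12 * theta_23)  # a universal scale of "worth"

# Each pairwise objective u_i + theta_ij*u_j is proportional to
# w_i*u_i + w_j*u_j, so all three pairs maximise the same scaled sum:
assert abs(w[1] / w[0] - theta_12) < 1e-12
assert abs(w[2] / w[1] - theta_23) < 1e-12
assert abs(w[0] / w[2] - theta_31) < 1e-12

print(round(theta_12 * theta_23 * theta_31, 12))  # 1.0
```

Once any two rates are fixed, the third has no freedom left: the product must come back to 1, which is exactly the transitivity condition.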

Ideally we'd want the outcomes to be Pareto-optimal for the three utilities. Then the major result is:

The outcome utilities are Pareto-optimal if and only if the θ are transitive.


Unpacking the Concept of "Blackmail"

25 Vladimir_Nesov 10 December 2010 12:53AM

Keep in mind: Controlling Constant Programs, Notion of Preference in Ambient Control.

There is a reasonable game-theoretic heuristic, "don't respond to blackmail" or "don't negotiate with terrorists". But what is actually meant by the word "blackmail" here? Does it have a place as a fundamental decision-theoretic concept, or is it merely an affective category, a class of situations activating a certain psychological adaptation that expresses disapproval of certain decisions and on the net protects (benefits) you, like those adaptations that respond to "being rude" or "offense"?

We, as humans, have a concept of a "default", a "do nothing" strategy. Other plans can be compared to the moral value of the default: doing harm would be something worse than the default, doing good something better than the default.

Blackmail is then a situation where by decision of another agent ("blackmailer"), you are presented with two options, both of which are harmful to you (worse than the default), and one of which is better for the blackmailer. The alternative (if the blackmailer decides not to blackmail) is the default.

Compare this with the same scenario, but with the "default" action of the other agent being worse for you than the given options. This would be called normal bargaining, as in trade, where both parties benefit from the exchange of goods, but to different extents depending on the price that is set.

Why is the "default" special here?


If you don't know the name of the game, just tell me what I mean to you

9 Stuart_Armstrong 26 October 2010 01:43PM

Following: Let's split the Cake

tl;dr: Both the Nash Bargaining solution (NBS), and the Kalai-Smorodinsky Bargaining Solution (KSBS), though acceptable for one-off games that are fully known in advance, are strictly inferior for independent repeated games, or when there exists uncertainty as to which game will be played.

Let's play a bargaining game, you and I. We can end up with you getting €1 and me getting €3, both of us getting €2, or you getting €3 and me getting €1. If we fail to agree, neither of us gets anything.

Oh, and did I forget to mention that another option was for you to get an aircraft carrier and me to get nothing?

Think of that shiny new aircraft carrier, loaded full with jets, pilots, weapons and sailors; think of all the things you could do with it, all the fun you could have. Places to bomb or city harbours to cruise majestically into, with the locals gaping in awe at the sleek powerful lines of your very own ship.

Then forget all about it, because Kalai-Smorodinsky says you can't have it. The Kalai-Smorodinsky bargaining solution to this game is 1/2 of a chance of getting that ship for you, and 1/2 of a chance of getting €3 for me (the Nash Bargaining Solution is better, but still not the best, as we'll see later). This might be fair; after all, unless you have some way of remunerating me for letting you have it, why should I take a dive for you?
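A rough sketch of where that 1/2 comes from, under my own illustrative assumptions (the carrier is worth some large utility U to you, euros are worth face value, and disagreement yields (0,0) to both):

```python
# Kalai-Smorodinsky sketch for the aircraft-carrier game.
U = 1000.0  # assumed utility of the carrier to you; large

# The relevant Pareto frontier is the set of mixtures s*(U, 0) + (1-s)*(1, 3):
# with probability s you get the carrier, otherwise you get 1 and I get 3.
# KSBS picks the frontier point on the ray from (0, 0) to the ideal point
# (U, 3), i.e. where your_utility / U == my_utility / 3.
# Solving s*U + (1-s) == U*(1-s) for s gives:
s = (U - 1) / (2 * U - 1)
you = s * U + (1 - s) * 1
me = (1 - s) * 3

print(round(s, 3))  # 0.5: essentially a coin flip, as claimed above
```

As U grows, s tends to exactly 1/2: half a chance of the carrier for you, half a chance of €3 for me.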

But now imagine we are about to start the game, and we don't know the full rules yet. We know about the €'s involved, that's all fine, we know there will be an offer of an aircraft carrier; but we don't know who is going to get the offer. If we wanted to decide on our bargaining theory in advance, what would we do?


Let's split the cake, lengthwise, upwise and slantwise

43 Stuart_Armstrong 25 October 2010 01:15PM

This post looks at some of the current models for how two agents can split the gain in non-zero sum interactions. For instance, if you and a buddy have to split £10 between the two of you, where the money is discarded if you can't reach a deal. Or there is an opportunity to trade your Elvis memorabilia for someone's collection of North Korean propaganda posters: unless you can agree to a way of splitting the gain from trade, the trade won't happen. Or there is the stereotypical battle of the sexes: either a romantic dinner (RD) or a night of Battlestar Galactica (BG) is on offer, and both members of the couple prefer doing something together to doing it separately - but, of course, each one has their preference on which activity to do together.

Unlike standard games such as Prisoner's Dilemma, this is a coordinated game: the two agents will negotiate a joint solution, which is presumed to be binding. This allows for such solutions as 50% (BG,BG) + 50% (RD,RD), which cannot happen with each agent choosing their moves independently. The two agents will be assumed to be expected utility maximisers. What would your feelings be on a good bargaining outcome in this situation?

Enough about your feelings; let's see what the experts are saying. In general, if A and C are outcomes with utilities (a,b) and (c,d), then another possible outcome is pA + (1-p)C (where you decide first, with odds p:1-p, whether to do outcome A or C), with utility p(a,b) + (1-p)(c,d). Hence if you plot all possible expected utilities in the plane for a given game, you get a convex set.

For instance, if there is an interaction with possible pure outcomes (-1,2), (2,10), (4,9), (5,7), (6,3), then the set of actual possible utilities is the pentagon presented here:
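That pentagon claim can be checked mechanically (a sketch: the achievable expected utilities form the convex hull of the pure outcomes, and all five points turn out to be vertices):

```python
# Verify that all five pure outcomes are vertices of the feasible set,
# using Andrew's monotone chain convex hull (standard algorithm).

def convex_hull(points):
    """Return the hull vertices of a set of 2D points, in boundary order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def build(seq):
        hull = []
        for p in seq:
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower, upper = build(pts), build(pts[::-1])
    return lower[:-1] + upper[:-1]

outcomes = [(-1, 2), (2, 10), (4, 9), (5, 7), (6, 3)]
hull = convex_hull(outcomes)
print(len(hull))  # 5: the feasible set really is a pentagon
```

Since no pure outcome lies inside the hull of the others, the set of achievable expected utilities is the full pentagon with these five corners.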

