I don't really think Newcomb's problem or any of its variations belongs here. Newcomb's problem is not a decision theory problem; the real difficulty is translating the underspecified English into a payoff matrix.
The ambiguity comes from the combination of two claims: (a) Omega being a perfect predictor, and (b) the subject being allowed to choose after Omega has made its prediction. Either these two are inconsistent, or they necessitate further unstated assumptions such as backwards causality.
First, let us assume (a) but not (b), which can be formulated as follows: Omega, a computer engineer, can read your code and test run it as many times as he would like in advance. You must submit (simple, unobfuscated) code which either chooses to one- or two-box. The contents of the boxes will depend on Omega's prediction of your code's choice. Do you submit one- or two-boxing code?
Second, let us assume (b) but not (a), which can be formulated as follows: Omega has subjected you to the Newcomb's setup, but because of a bug in its code, its prediction is based on someone else's choice rather than yours, one which has no correlation with your choice whatsoever. Do you one- or two-box?
Both of the...
(Thanks for discussing!)
I will address your last paragraph first. The only significant difference between my original example and the proper Newcomb's paradox is that, in Newcomb's paradox, Omega is made a predictor by fiat and without explanation. This allows perfect prediction and choice to sneak into the same paragraph without obvious contradiction. It seems, if I try to make the mode of prediction transparent, you protest there is no choice being made.
From Omega's point of view, its Newcomb subjects are not making choices in any substantial sense; they are just predictably acting out their own personality. That is what allows Omega its predictive power. Choice is not something inherent to a system, but a feature of an outsider's model of a system, in much the same sense that randomness is not something inherent to Eeny, meeny, miny, moe, however much it might seem that way to children.
As for the rest of our disagreement, I am not sure why you insist that CDT must work with a misleading model. The standard formulation of Newcomb's paradox is inconsistent or underspecified. Here are some messy explanations for why, in list form:
Thanks for your post, it was a good summary of decision theory basics. Some corrections:
In the Allais paradox, choice (2A) should be "A 34% chance of 24,000$ and a 66% chance of nothing" (it currently says 27,000$).
A typo in the title of 10.3.1: it should probably read "Why should degrees of belief follow the laws of probability?"
In 11.1.10. Prisoner's dilemma, the Resnik quotation mentions a twenty-five year term, yet the decision matrix has "20 years in jail" as an outcome.
Easy explanation for the Ellsberg Paradox: we humans treat the urn as if it were subject to two kinds of uncertainty.
Somehow, we prefer to choose the "truly random" option. I think I can sense why: when it's "truly random", I know no potentially hostile agent messed with me. I mean, I could choose "red" in situation A, but then the organizers could have put in 60 blue balls just to mess with me!
Put simply, choosing "red" opens me up to external sentient influence, and therefore to the risk of being outsmarted. This particular risk aversion sounds like a pretty sound heuristic.
What about mentioning the St. Petersburg paradox? This is a pretty striking issue for EUM, IMHO.
I'm finding the "counterfactual mugging" challenging. At this point, the rules of the game seem to be "design a thoughtless, inert, unthinking algorithm, such as CDT or EDT or BT or TDT, which will always give the winning answer." Fine. But for the entire range of Newcomb's problems, we are pitting this dumb-as-a-rock algo against a super-intelligence. By the time we get to the counterfactual mugging, we seem to have a scenario where Omega is saying "I will reward you only if you are a trusting rube who can be fleeced." N...
VNM utility isn't any of the types you listed. Ratios (a-b)/|c-d| of VNM utilities aren't meaningful, only ratios (a-b)/|c-b|.
I would *really* appreciate any help from lesswrong readers in helping me understand something really basic about the standard money pump argument for transitivity of preferences.
So clearly there can be situations, like in a game of Rock Scissors Paper (or games featuring non-transitive dice, like 'Efron's dice') where faced with pairwise choices it seems rational to have non-transitive preferences. And it could be that these non-transitive games/situations pay out money (or utility or whatever) if you make the right choice.
But so then if ...
Presentation of Newcomb's problem in section 11.1.1. seems faulty. What if the human flips a coin to determine whether to one-box or two-box? (or any suitable source of entropy that is beyond the predictive powers of the super-intelligence.) What happens then?
This point is danced around in the next section, but never stated outright: EDT provides exactly the right answer if humans are fully deterministic and predictable by the superintelligence. CDT gives the right answer if the human employs an unpredictable entropy source in their decision-making. It is the entropy source that makes the decision acausal from the acts of the super-intelligence.
Small correction: Arntzenius's name has a z (that paper is great, by the way; I sent it to Yudkowsky a while ago).
There is a compliment true of both this post and that paper: they are both very well condensed. Congratulations Luke and crazy88!
In the VNM system, utility is defined via preferences over acts rather than preferences over outcomes. To many, it seems odd to define utility with respect to preferences over risky acts. After all, even an agent who thinks she lives in a world where every act is certain to result in a known outcome could have preferences for some outcomes over others. Many would argue that utility should be defined in relation to preferences over outcomes or world-states, and that's not what the VNM system does. (Also see section 9.)
It's misleading to associate acts wi...
In this case, even if an extremely low value is set for L, it seems that paying this amount to play the game is unreasonable. After all, as Peterson notes, about nine times out of ten an agent that plays this game will win no more than 8 · 10^-100 utility.
It seems there's an error here. Should it be "In this case, even if an extremely high value is set for L, it seems that paying a lot to play the game is unreasonable."?
Typo:
Usually, it is argued that each of the axioms are pragmatically justified because an agent which violates the axioms can face situations in which they are guaranteed end up worse off (from their own perspective).
Should read:
guaranteed to end up worse off
Does the horizontal axis of the decision tree in section 3 represent time? If so, I'd advocate smearing those red triangles out over the whole history of actions and events. Even though, in the particular example, it's unlikely that the agent cares about having been insured as such, apart from the monetary payoffs, in the general case agents care about the whole history. I think that forgetting this point sometimes leads to misapplications of decision theory.
When reading about Transparent Newcomb's problem: Isn't this perfectly general? Suppose Omega says: I give everyone who subscribes to decision theory A $1000, and give those who subscribe to other decision theories nothing. Clearly everyone who subscribes to decision theory A "wins".
It seems that if one lives in the world with many such Omegas, and subscribing to decision theory A (vs subscribing to decision theory B) would otherwise lead to losing at most, say, $100 per day between two successive encounters with such Omegas, then one would wi...
Maybe worth noting that there's recommended reading on decision theory on the "Best textbooks on every subject" post.
On decision theory, lukeprog recommends Peterson's An Introduction to Decision Theory over Resnik's Choices and Luce & Raiffa's Games and Decisions.
In this equation, V(A & O) represents the value to the agent of the combination of an act and an outcome. So this is the utility that the agent will receive if they carry out a certain act and a certain outcome occurs. Further, Pr_A(O) represents the probability of each outcome occurring on the supposition that the agent carries out a certain act. It is in terms of this probability that CDT and EDT differ. EDT uses the conditional probability, Pr(O|A), while CDT uses the probability of subjunctive conditionals, Pr(A □→ O).
Please, don't use the same letters ...
...The second problem with using the law of large numbers to justify EUM has to do with a mathematical theorem known as gambler's ruin. Imagine that you and I flip a fair coin, and I pay you $1 every time it comes up heads and you pay me $1 every time it comes up tails. We both start with $100. If we flip the coin enough times, one of us will face a situation in which the sequence of heads or tails is longer than we can afford. If a long-enough sequence of heads comes up, I'll run out of $1 bills with which to pay you. If a long-enough sequence of tails comes
I think the example given to show the irrationality of leximin in certain situations doesn't do a good job of distinguishing its failings from maximin. To usefully illustrate the difference between the two I believe another state is required with even worse outcomes for both acts (e.g. $0). This way the worst outcomes for both acts would be equal, and so the second worst outcomes (a1: $1, a2: $1.01) would then be compared under the leximin strategy, leading to the choice of a2 as the best act again, with the acknowledgment that you miss out on the opportunity to get $10,001.01.
In the last chapter of his book "Utility Theory for Decision Making," Peter Fishburn published a concise rendering of Leonard Savage's proof that "rational" preferences over events implied that one behaved "as if" he (or she) was obeying Expected Utility Theory. He furthermore proved that following Savage's axioms implied that your utility function is bounded (he attributes this extension of the proof, in its essence, to Savage). So Subjective Expected Utility Theory has an answer to the St. Petersburg Paradox "built in" to its axioms. That seems like a point well worth mentioning in this article.
The image of Ellsberg's Paradox has the picture of the Yellow/Blue bet replaced with a picture of a Yellow/Red bet. Having looked at the picture I was about to claim that it was always rational to take the R/B bet over Y/R before I read the actual description.
Isn't there a typo in "Experiments have shown that many people prefer (1A) to (1B) and (2B) to (2A)." ? Shouldn't it be "(2A) to (2B)" ?
Edit : hrm, no, in fact it's like http://lesswrong.com/lw/gu1/decision_theory_faq/8jav said : it should be 24 000$ instead of 27 000$ in option A, or else it makes no sense.
Thus, the expected utility (EU) of choice A is, for this decision maker, (1)(1000) = 1000. Meanwhile, the EU of choice B is (0.5)(1500) + (0.5)(0) = 750. In this case, the expected utility of choice B is greater than that of choice A, even though choice B has a greater expected monetary value.
Choice A at 1000 is still greater than Choice B at 750
Minor error: In the prisoner's dilemma example, the decision matrix has twenty years for if you cooperate and your partner defects, while the text quoted right above the matrix claims that that amount is twenty five years.
I find it helpful to use the term "security level" to understand maximin/leximin and "hope level" to understand maximax. "Security level" is the worst case scenario, and under maximin/leximin we want to maximize it. "Hope level" is the best case scenario, and under maximax, we want to maximize it.
Concerning the transitivity axiom, what about rational choices in situations of intransitivity cycles?
(Well, sort of. The minimax and maximax principles require only that we measure value on an ordinal scale, whereas the optimism-pessimism rule requires that we measure value on an interval scale.)
I'm using this as an introduction to decision theory so I might be wrong, and I've read that 'maximin' and 'minimax' do have different meanings in game theory, but you exclusively use the term 'maximin' up to a certain point and then mention a 'minimax principle' once, so I can only imagine that you meant to write 'maximin principle.' It confused me. It's proba...
...Another objection to the VNM approach (and to expected utility approaches generally), the St. Petersburg paradox, draws on the possibility of infinite utilities. The St. Petersburg paradox is based around a game where a fair coin is tossed until it lands heads up. At this point, the agent receives a prize worth 2^n utility, where n is equal to the number of times the coin was tossed during the game. The so-called paradox occurs because the expected utility of choosing to play this game is infinite and so, according to a standard expected utility approach,
In section 8.1, your example of the gambler's ruin postulates that both agents have the same starting resources, but this is exactly the case in which the gambler's ruin doesn't apply. That might be worth changing.
- Can decisions under ignorance be transformed into decisions under uncertainty?
I'd add a comment on Jaynes' solution for determining ignorance priors in terms of transformation groups.
I'd say that there's no such thing as an "ignorance" prior - priors are set by information. Setting a prior by symmetry or the more general transformation group is an assertion of information.
There are numerous typos throughout the thing. Someone needs to re-read it. The math in "8.6.3. The Allais paradox" is all wrong, option 2A is not actually 34% of 1A and 66% of nothing, etc.
This may not be the best place for this question, but it's something I've been wondering for a while: how does causal decision theory fail us humans in the real world, here and now?
There is one rather annoying subtext that recurs throughout the FAQ: the very casual and carefree use of the words "rational" and "irrational", with the rather flawed idea that following some axiomatic system (e.g. VNM) and Bayes is "rational" and not doing so is "irrational". I think this is a disservice, and, what's more, fails to look into the effects of intelligence, experience, training and emotion. The Allais paradox scratches the surface, as do various psych experiments. But ...
The real question is "wh...
The conclusion to section "11.1.3. Medical Newcomb problems" begs a question which remains unanswered: "So just as CDT 'loses' on Newcomb's problem, EDT will 'lose' on Medical Newcomb problems (if the tickle defense fails) or will join CDT and 'lose' on Newcomb's Problem itself (if the tickle defense succeeds)."
If I was designing a self-driving car and had to provide an algorithm for what to do during an emergency, I may choose to hard-code CDT or EDT into the system, as seems appropriate. However, as an intelligen...
But note burger-choosing Jane (6.1) is still irrational - for she has discounted the much stronger preference of a cow not to be harmed. Rationality entails overcoming egocentric bias - and ethnocentric and anthropocentric bias - and adopting a God's eye point-of-view that impartially gives weight to all possible first-person perspectives.
When we say 'rationality', we mean instrumental rationality: getting what you want. Elsewhere, we also refer to epistemic rationality, which is believing true things. In neither case do we say anything about what you should want.
It might be a good thing to care about cows, but it's not rationality as we understand the word. Good you bring this up though, as I can easily imagine others being confused.
See also What Do We Mean by Rationality
Isn't the giant elephant in this room the whole issue of moral realism? I'm a moral cognitivist but not a moral realist. I have laid out what it means for my moral beliefs to be true - the combination of physical fact and logical function against which my moral judgments are being compared. This gives my moral beliefs truth value. And having laid this out, it becomes perfectly obvious that it's possible to build powerful optimizers who are not motivated by what I call moral truths; they are maximizing something other than morality, like paperclips. They will also meta-maximize something other than morality if you ask them to choose between possible utility functions, and will quite predictably go on picking the utility function "maximize paperclips". Just as I correctly know it is better to be moral than to be paperclippy, they accurately evaluate that it is more paperclippy to maximize paperclips than morality. They know damn well that they're making you unhappy and violating your strong preferences by doing so. It's just that all this talk about the preferences that feel so intrinsically motivating to you, is itself of no interest to them because you haven't got...
I'm not sure this taxonomy is helpful from David Pearce's perspective. David Pearce's position is that there are universally motivating facts - facts whose truth, once known, is compelling for every possible sort of mind. This reifies his observation that the desire for happiness feels really, actually compelling to him and this compellingness seems innate to qualia, so anyone who truly knew the facts about the quale would also know that compelling sense and act accordingly. This may not correspond exactly to what SEP says under moral realism and let me know if there's a standard term, but realism seems to describe the Pearcean (or Eliezer circa 1996) feeling about the subject - that happiness is really intrinsically preferable, that this is truth and not opinion.
From my perspective this is a confusion which I claim to fully and exactly understand, which licenses my definite rejection of the hypothesis. (The dawning of this understanding did in fact cause my definite rejection of the hypothesis in 2003.) The inherent-desirableness of happiness is your mind reifying the internal data describing its motivation to do something, so if you try to use your empathy to imagine another...
Eliezer, in my view, we don't need to assume meta-ethical realism to recognise that it's irrational - both epistemically irrational and instrumentally irrational - arbitrarily to privilege a weak preference over a strong preference.
You need some stage at which a fact grabs control of a mind, regardless of any other properties of its construction, and causes its motor output to have a certain value.
Paperclippers? Perhaps let us consider the mechanism by which paperclips can take on supreme value. We understand, in principle at least, how to make paperclips seem intrinsically supremely valuable to biological minds - more valuable than the prospect of happiness in the abstract. [“Happiness is a very pretty thing to feel, but very dry to talk about.” - Jeremy Bentham]. Experimentally, perhaps we might use imprinting (recall Lorenz and his goslings), microelectrodes implanted in the reward and punishment centres, behavioural conditioning and ideological indoctrination - and perhaps the promise of 72 virgins in the afterlife for the faithful paperclipper. The result: a fanatical paperclip fetishist!
As Sarokrae observes, this isn't the idea at all. We construct a paperclip maxim...
Anyone who isn't profoundly disturbed by torture, for instance, or by agony so bad one would end the world to stop the horror, simply hasn't understood it.
Similarly, anyone who doesn't want to maximize paperclips simply hasn't understood the ineffable appeal of paperclipping.
"Aargh!" he said out loud in real life. David, are you disagreeing with me here or do you honestly not understand what I'm getting at?
The whole idea is that an agent can fully understand, model, predict, manipulate, and derive all relevant facts that could affect which actions lead to how many paperclips, regarding happiness, without having a pleasure-pain architecture. I don't have a paperclipping architecture but this doesn't stop me from modeling and understanding paperclipping architectures.
The paperclipper can model and predict an agent (you) that (a) operates on a pleasure-pain architecture and (b) has a self-model consisting of introspectively opaque elements which actually contain internally coded instructions for your brain to experience or want certain things (e.g. happiness). The paperclipper can fully understand how your workspace is modeling happiness and know exactly how much you would want happiness and why you write papers about the apparent ineffability of happiness, without being happy itself or at all sympathetic toward you. It will experience no future surprise on comprehending these things, because it already knows them. It doesn't have any objec...
As Kawoomba colorfully pointed out, clippy's subroutines simulating humans suffering may be fully sentient. However, unless those subroutines have privileged access to clippy's motor outputs or planning algorithms, clippy will go on acting as if he didn't care about suffering. He may even understand that inflicting suffering is morally wrong--but this will not make him avoid suffering, any more than a thrown rock with "suffering is wrong" painted on it will change direction to avoid someone's head. Moral wrongness is simply not a consideration that has the power to move a paperclip maximizer.
To slightly expand, if an intelligence is not prohibited from the following epistemic feats:
1) Be good at predicting which hypothetical actions would lead to how many paperclips, as a question of pure fact.
2) Be good at searching out possible plans which would lead to unusually high numbers of paperclips - answering the purely epistemic search question, "What sort of plan would lead to many paperclips existing, if someone followed it?"
3) Be good at predicting and searching out which possible minds would, if constructed, be good at (1), (2), and (3) as purely epistemic feats.
Then we can hook up this epistemic capability to a motor output and away it goes. You cannot defeat the Orthogonality Thesis without prohibiting superintelligences from accomplishing 1-3 as purely epistemic feats. They must be unable to know the answers to these questions of fact.
...microelectrodes implanted in the reward and punishment centres, behavioural conditioning and ideological indoctrination - and perhaps the promise of 72 virgins in the afterlife for the faithful paperclipper. The result: a fanatical paperclip fetishist!
Have to point out here that the above is emphatically not what Eliezer talks about when he says "maximise paperclips". Your examples above contain in themselves the actual, more intrisics values to which paperclips would be merely instrumental: feelings in your reward and punishment centres, virgins in the afterlife, and so on. You can re-wire the electrodes, or change the promise of what happens in the afterlife, and watch as the paperclip preference fades away.
What Eliezer is talking about is a being for whom "pleasure" and "pain" are not concepts. Paperclips ARE the reward. Lack of paperclips IS the punishment. Even if pleasure and pain are concepts, they are merely instrumental to obtaining more paperclips. Pleasure would be good because it results in paperclips, not vice versa. If you reverse the electrodes so that they stimulate the pain centre when they find paperclips, and the pleasure centr...
David, we're not defining rationality to exclude other-oriented desires. We're just not including that exact morality into the word "rational". Instrumental rationality links up a utility function to a set of actions. You hand over a utility function over outcomes, epistemic rationality maps the world and then instrumental rationality hands back a set of actions whose expected score is highest. So long as it can build a well-calibrated, highly discriminative model of the world and then navigate to a compactly specified set of outcomes, we call it rational, even if the optimization target is "produce as many paperclips as possible". Adding a further constraint to the utility function that it be perfectly altruistic will greatly reduce the set of hypothetical agents we're talking about, but it doesn't change reality (obviously) nor yield any interesting changes in terms of how the agent investigates hypotheses, the fact that the agent will not fall prey to the sunk cost fallacy if it is rational, and so on. Perfectly altruistic rational agents will use mostly the same cognitive strategies as any other sort of rational agent, they'll just be optimizing for on...
Accurately modeling the world entails making accurate predictions about it. An expected paperclip maximizer fully grasps the functioning of your brain and mind to the extent that this is relevant to producing paperclips; if it needs to know the secrets of your heart in order to persuade you, it knows them. If it needs to know why you write papers about the hard problem of conscious experience, it knows that too. The paperclip maximizer is not moved by grasping your first-person perspective, because although it has accurate knowledge of this fact, that is not the sort of fact that figures in its terminal values. The fact that it perfectly grasps the compellingness-to-Jane, even the reason why Jane finds certain facts to be inherently and mysteriously compelling, doesn't compel it. It's not a future paperclip.
I know exactly why the villain in Methods of Rationality wants to kill people. I could even write the villain writing about the ineffable compellingness of the urge to rid the world of certain people if I put that villain in a situation where he or she would actually read about the hard problem of conscious experience, and yet I am not likewise compelled. I don't have the...
Co-authored with crazy88. Please let us know when you find mistakes, and we'll fix them. Last updated 03-27-2013.
1. What is decision theory?
Decision theory, also known as rational choice theory, concerns the study of preferences, uncertainties, and other issues related to making "optimal" or "rational" choices. It has been discussed by economists, psychologists, philosophers, mathematicians, statisticians, and computer scientists.
We can divide decision theory into three parts (Grant & Zandt 2009; Baron 2008). Normative decision theory studies what an ideal agent (a perfectly rational agent, with infinite computing power, etc.) would choose. Descriptive decision theory studies how non-ideal agents (e.g. humans) actually choose. Prescriptive decision theory studies how non-ideal agents can improve their decision-making (relative to the normative model) despite their imperfections.
For example, one's normative model might be expected utility theory, which says that a rational agent chooses the action with the highest expected utility. Replicated results in psychology describe humans repeatedly failing to maximize expected utility in particular, predictable ways: for example, they make some choices based not on potential future benefits but on irrelevant past efforts (the "sunk cost fallacy"). To help people avoid this error, some theorists prescribe some basic training in microeconomics, which has been shown to reduce the likelihood that humans will commit the sunk cost fallacy (Larrick et al. 1990). Thus, through a coordination of normative, descriptive, and prescriptive research we can help agents to succeed in life by acting more in accordance with the normative model than they otherwise would.
This FAQ focuses on normative decision theory. Good sources on descriptive and prescriptive decision theory include Stanovich (2010) and Hastie & Dawes (2009).
Two related fields beyond the scope of this FAQ are game theory and social choice theory. Game theory is the study of conflict and cooperation among multiple decision makers, and is thus sometimes called "interactive decision theory." Social choice theory is the study of making a collective decision by combining the preferences of multiple decision makers in various ways.
This FAQ draws heavily from two textbooks on decision theory: Resnik (1987) and Peterson (2009). It also draws from more recent results in decision theory, published in journals such as Synthese and Theory and Decision.
2. Is the rational decision always the right decision?
No. Peterson (2009, ch. 1) explains:
Unfortunately, we cannot know with certainty what the right decision is. Thus, the best we can do is to try to make "rational" or "optimal" decisions based on our preferences and incomplete information.
3. How can I better understand a decision problem?
First, we must formalize a decision problem. It usually helps to visualize the decision problem, too.
In decision theory, decision rules are only defined relative to a formalization of a given decision problem, and a formalization of a decision problem can be visualized in multiple ways. Here is an example from Peterson (2009, ch. 2):
The most common way to formalize a decision problem is to break it into states, acts, and outcomes. When facing a decision problem, the decision maker aims to choose the act that will have the best outcome. But the outcome of each act depends on the state of the world, which is unknown to the decision maker.
In this framework, speaking loosely, a state is a part of the world that is neither an act (something the decision maker can perform now) nor an outcome (what, more precisely, states are is a complex question beyond the scope of this document). Luckily, not all states are relevant to a particular decision problem. We only need to take into account states that affect the agent's preference among acts. A simple formalization of the fire insurance problem might include only two states: the state in which your house doesn't (later) catch on fire, and the state in which your house does (later) catch on fire.
Presumably, the agent prefers some outcomes to others. Suppose the four conceivable outcomes in the above decision problem are: (1) House and $0, (2) House and -$100, (3) No house and $99,900, and (4) No house and $0. In this case, the decision maker might prefer outcome 1 over outcome 2, outcome 2 over outcome 3, and outcome 3 over outcome 4. (We'll discuss measures of value for outcomes in the next section.)
An act is commonly taken to be a function that takes a possible state of the world as input and gives a particular outcome as output. For the above decision problem we could say that if the act "Take out insurance" has the world-state "Fire" as its input, then it will give the outcome "No house and $99,900" as its output.
An outline of the states, acts and outcomes in the insurance case
Note that decision theory is concerned with particular acts rather than generic acts, e.g. "sailing west in 1492" rather than "sailing." Moreover, the acts of a decision problem must be alternative acts, so that the decision maker has to choose exactly one act.
Once a decision problem has been formalized, it can then be visualized in any of several ways.
One way to visualize this decision problem is to use a decision matrix:
Another way to visualize this problem is to use a decision tree:
The square is a choice node, the circles are chance nodes, and the triangles are terminal nodes. At the choice node, the decision maker chooses which branch of the decision tree to take. At the chance nodes, nature decides which branch to follow. The triangles represent outcomes.
Of course, we could add more branches to each choice node and each chance node. We could also add more choice nodes, in which case we are representing a sequential decision problem. Finally, we could add probabilities to each branch, as long as the probabilities of all the branches extending from each single node sum to 1. And because a decision tree obeys the laws of probability theory, we can calculate the probability of any given node by multiplying the probabilities of all the branches preceding it.
Our decision problem could also be represented as a vector — an ordered list of mathematical objects that is perhaps most suitable for computers:
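The vector representation itself appeared as an image in the original post and isn't reproduced here. As a rough stand-in, here is a minimal sketch of how the formalization above might be encoded for a computer; the Python representation below is my own illustration, not the FAQ's notation.

```python
# A minimal, illustrative encoding of the fire-insurance decision problem
# from section 3. The structure (states, acts, outcome function) follows the
# text above; the specific representation is an arbitrary choice.

STATES = ["fire", "no fire"]
ACTS = ["take out insurance", "no insurance"]

# Each act behaves like a function from states to outcomes, as described above.
OUTCOMES = {
    ("take out insurance", "fire"):    "No house and $99,900",
    ("take out insurance", "no fire"): "House and -$100",
    ("no insurance", "fire"):          "No house and $0",
    ("no insurance", "no fire"):       "House and $0",
}

for act in ACTS:
    for state in STATES:
        print(f"{act!r} in state {state!r} -> {OUTCOMES[(act, state)]}")
```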
For more details on formalizing and visualizing decision problems, see Skinner (1993).
4. How can I measure an agent's preferences?
4.1. The concept of utility
It is important not to measure an agent's preferences in terms of objective value, e.g. monetary value. To see why, consider the absurdities that can result when we try to measure an agent's preference with money alone.
Suppose you may choose between (A) receiving a million dollars for sure, and (B) a 50% chance of winning either $3 million or nothing. The expected monetary value (EMV) of your act is computed by multiplying the monetary value of each possible outcome by its probability. So, the EMV of choice A is (1)($1 million) = $1 million. The EMV of choice B is (0.5)($3 million) + (0.5)($0) = $1.5 million. Choice B has a higher expected monetary value, and yet many people would prefer the guaranteed million.
Why? For many people, the difference between having $0 and $1 million is subjectively much larger than the difference between having $1 million and $3 million, even if the latter difference is larger in dollars.
To capture an agent's subjective preferences, we use the concept of utility. A utility function assigns numbers to outcomes such that outcomes with higher numbers are preferred to outcomes with lower numbers. For example, for a particular decision maker — say, one who has no money — the utility of $0 might be 0, the utility of $1 million might be 1000, and the utility of $3 million might be 1500. Thus, the expected utility (EU) of choice A is, for this decision maker, (1)(1000) = 1000. Meanwhile, the EU of choice B is (0.5)(1500) + (0.5)(0) = 750. In this case, the expected utility of choice A is greater than that of choice B, even though choice B has a greater expected monetary value.
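To make the arithmetic explicit, here is a small sketch reproducing both calculations, using the hypothetical utility numbers given above.

```python
# Expected monetary value vs. expected utility for the example in section 4.1:
# choice A = $1 million for sure; choice B = 50% chance of $3 million, 50% of $0.

def expected_value(prospects):
    """Sum of probability * value over (probability, value) pairs."""
    return sum(p * v for p, v in prospects)

# Expected monetary value, in dollars
emv_A = expected_value([(1.0, 1_000_000)])
emv_B = expected_value([(0.5, 3_000_000), (0.5, 0)])

# Expected utility, using the example utilities u($0)=0, u($1M)=1000, u($3M)=1500
eu_A = expected_value([(1.0, 1000)])
eu_B = expected_value([(0.5, 1500), (0.5, 0)])

print(emv_A, emv_B)  # 1000000.0 1500000.0 -> B has the higher expected money
print(eu_A, eu_B)    # 1000.0 750.0        -> A has the higher expected utility
```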
Note that those from the field of statistics who work on decision theory tend to talk about a "loss function," which is simply an inverse utility function. For an overview of decision theory from this perspective, see Berger (1985) and Robert (2001). For a critique of some standard results in statistical decision theory, see Jaynes (2003, ch. 13).
4.2. Types of utility
An agent's utility function can't be directly observed, so it must be constructed — e.g. by asking them which options they prefer for a large set of pairs of alternatives (as on WhoIsHotter.com). The number that corresponds to an outcome's utility can convey different information depending on the utility scale in use, and the utility scale in use depends on how the utility function is constructed.
Decision theorists distinguish three kinds of utility scales:
Ordinal scales ("12 is better than 6"). In an ordinal scale, preferred outcomes are assigned higher numbers, but the numbers don't tell us anything about the differences or ratios between the utility of different outcomes.
Interval scales ("the difference between 12 and 6 equals that between 6 and 0"). An interval scale gives us more information than an ordinal scale. Not only are preferred outcomes assigned higher numbers, but also the numbers accurately reflect the difference between the utility of different outcomes. They do not, however, necessarily reflect the ratios of utility between different outcomes. If outcome A has utility 0, outcome B has utility 6, and outcome C has utility 12 on an interval scale, then we know that the difference in utility between outcomes A and B and between outcomes B and C is the same, but we can't know whether outcome B is "twice as good" as outcome A.
Ratio scales ("12 is exactly twice as valuable as 6"). Numerical utility assignments on a ratio scale give us the most information of all. They accurately reflect preference rankings, differences, and ratios. Thus, we can say that an outcome with utility 12 is exactly twice as valuable to the agent in question as an outcome with utility 6.
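A useful way to see the interval/ratio distinction: interval-scale utilities are only unique up to a positive linear transformation u' = a·u + b (with a > 0), so comparisons of differences survive such a rescaling while ratios of the utility values themselves do not. A small check of this, using the 0/6/12 example above (the rescaling constants are arbitrary):

```python
# Interval scales: rescaling u' = a*u + b (a > 0) preserves comparisons of
# differences, but not ratios of the utility values themselves.

u = {"A": 0.0, "B": 6.0, "C": 12.0}

def rescale(util, a=3.0, b=7.0):
    return {k: a * v + b for k, v in util.items()}

v = rescale(u)

# The difference comparison (C - B) == (B - A) is preserved by the rescaling.
print(u["C"] - u["B"] == u["B"] - u["A"])   # True
print(v["C"] - v["B"] == v["B"] - v["A"])   # True

# The value ratio C/B is not preserved, so "twice as good" isn't meaningful here.
print(u["C"] / u["B"])   # 2.0
print(v["C"] / v["B"])   # 1.72  (= 43/25 after rescaling)
```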
Note that neither experienced utility (happiness) nor the notions of "average utility" or "total utility" discussed by utilitarian moral philosophers are the same thing as the decision utility that we are discussing now to describe decision preferences. As the situation merits, we can be even more specific. For example, when discussing the type of decision utility used in an interval scale utility function constructed using Von Neumann & Morgenstern's axiomatic approach (see section 8), some people use the term VNM-utility.
Now that you know that an agent's preferences can be represented as a "utility function," and that assignments of utility to outcomes can mean different things depending on the utility scale of the utility function, we are ready to think more formally about the challenge of making "optimal" or "rational" choices. (We will return to the problem of constructing an agent's utility function later, in section 8.3.)
5. What do decision theorists mean by "risk," "ignorance," and "uncertainty"?
Peterson (2009, ch. 1) explains:
In this FAQ, a "decision under ignorance" is one in which probabilities are not assigned to all outcomes, and a "decision under uncertainty" is one in which probabilities are assigned to all outcomes. The term "risk" will be reserved for discussions related to utility.
6. How should I make decisions under ignorance?
A decision maker faces a "decision under ignorance" when she (1) knows which acts she could choose and which outcomes they may result in, but (2) is unable to assign probabilities to the outcomes.
(Note that many theorists think that all decisions under ignorance can be transformed into decisions under uncertainty, in which case this section will be irrelevant except for subsection 6.1. For details, see section 7.)
6.1. The dominance principle
To borrow an example from Peterson (2009, ch. 3), suppose that Jane isn't sure whether to order hamburger or monkfish at a new restaurant. Just about any chef can make an edible hamburger, and she knows that monkfish is fantastic if prepared by a world-class chef, but she also recalls that monkfish is difficult to cook. Unfortunately, she knows too little about this restaurant to assign any probability to the prospect of getting good monkfish. Her decision matrix might look like this:
Here, decision theorists would say that the "hamburger" choice dominates the "no main course" choice. This is because choosing the hamburger leads to a better outcome for Jane no matter which possible state of the world (good chef or bad chef) turns out to be true.
This dominance principle comes in two forms:
A comparison of strong and weak dominance
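The figure isn't reproduced here, so the sketch below uses one common convention as an assumption on my part: an act weakly dominates another if it does at least as well in every state and strictly better in at least one, and strongly dominates it if it does strictly better in every state. The utilities are illustrative ordinal stand-ins for Jane's outcomes, not values from the FAQ.

```python
# Illustrative ordinal utilities for Jane's problem (higher = better).
# States: "good chef", "bad chef". The numbers are stand-ins consistent
# with the text above, not taken from Peterson's table.
payoffs = {
    "hamburger":      {"good chef": 2, "bad chef": 2},   # edible hamburger either way
    "monkfish":       {"good chef": 3, "bad chef": 0},   # fantastic vs terrible
    "no main course": {"good chef": 1, "bad chef": 1},   # hungry either way
}

def weakly_dominates(a, b, payoffs):
    """a does at least as well as b in every state, and strictly better in at least one."""
    states = payoffs[a].keys()
    return (all(payoffs[a][s] >= payoffs[b][s] for s in states)
            and any(payoffs[a][s] > payoffs[b][s] for s in states))

def strongly_dominates(a, b, payoffs):
    """a does strictly better than b in every state."""
    return all(payoffs[a][s] > payoffs[b][s] for s in payoffs[a])

print(weakly_dominates("hamburger", "no main course", payoffs))    # True
print(strongly_dominates("hamburger", "no main course", payoffs))  # True
print(weakly_dominates("hamburger", "monkfish", payoffs))          # False: neither dominates the other
```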
The dominance principle can also be applied to decisions under uncertainty (in which probabilities are assigned to all the outcomes). If we assign probabilities to outcomes, it is still rational to choose one act over another act if all its outcomes are at least as good as the outcomes of the other act.
However, the dominance principle only applies (non-controversially) when the agent’s acts are independent of the state of the world. So consider the decision of whether to steal a coat:
In this case, stealing the coat dominates not doing so but isn't necessarily the rational decision. After all, stealing increases your chance of getting charged with theft and might be irrational for this reason. So dominance doesn't apply in cases like this, where the state of the world is not independent of the agent's act.
On top of this, not all decision problems include an act that dominates all the others. Consequently, additional principles are often required to reach a decision.
6.2. Maximin and leximin
Some decision theorists have suggested the maximin principle: if the worst possible outcome of one act is better than the worst possible outcome of another act, then the former act should be chosen. In Jane's decision problem above, the maximin principle would prescribe choosing the hamburger, because the worst possible outcome of choosing the hamburger ("edible hamburger") is better than the worst possible outcome of choosing the monkfish ("terrible monkfish") and is also better than the worst possible outcome of eating no main course ("hungry").
If the worst outcomes of two or more acts are equally good, the maximin principle tells you to be indifferent between them. But that doesn't seem right. For this reason, fans of the maximin principle often invoke the lexical maximin principle ("leximin"), which says that if the worst outcomes of two or more acts are equally good, one should choose the act for which the second worst outcome is best. (If that doesn't single out a single act, then the third worst outcome should be considered, and so on.)
Why adopt the leximin principle? Advocates point out that the leximin principle transforms a decision problem under ignorance into a decision problem under partial certainty. The decision maker doesn't know what the outcome will be, but they know what the worst possible outcome will be.
But in some cases, the leximin rule seems clearly irrational. Imagine this decision problem, with two possible acts and two possible states of the world:
In this situation, the leximin principle prescribes choosing a2. But most people would agree it is rational to risk losing out on a single cent for the chance to get an extra $10,000.
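For concreteness, here is a minimal sketch of maximin and leximin applied to a hypothetical payoff matrix consistent with the description above; the exact dollar figures are stand-ins of mine, not the FAQ's table.

```python
# Maximin and leximin on a hypothetical matrix in the spirit of the example:
# a1 pays $1.00 in the worst state and $10,001.01 in the other;
# a2 pays $1.01 in either state.

payoffs = {
    "a1": [1.00, 10001.01],
    "a2": [1.01, 1.01],
}

def maximin_choice(payoffs):
    """Pick the act whose worst outcome is best."""
    return max(payoffs, key=lambda act: min(payoffs[act]))

def leximin_choice(payoffs):
    """Compare worst outcomes first, then second-worst, and so on (lexicographically)."""
    return max(payoffs, key=lambda act: sorted(payoffs[act]))

print(maximin_choice(payoffs))  # a2 (a worst case of $1.01 beats a worst case of $1.00)
print(leximin_choice(payoffs))  # a2 (already decided at the worst outcome)
```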
6.3. Maximax and optimism-pessimism
The maximin and leximin rules focus their attention on the worst possible outcomes of a decision, but why not focus on the best possible outcome? The maximax principle prescribes that if the best possible outcome of one act is better than the best possible outcome of another act, then the former act should be chosen.
More popular among decision theorists is the optimism-pessimism rule (aka the alpha-index rule). The optimism-pessimism rule prescribes that one consider both the best and worst possible outcome of each possible act, and then choose according to one's degree of optimism or pessimism.
Here's an example from Peterson (2009, ch. 3):
We represent the decision maker's level of optimism on a scale of 0 to 1, where 0 is maximal pessimism and 1 is maximal optimism. For a1, the worst possible outcome is 10 and the best possible outcome is 100. That is, min(a1) = 10 and max(a1) = 100. So if the decision maker is 0.85 optimistic, then the total value of a1 is (0.85)(100) + (1 - 0.85)(10) = 86.5, and the total value of a2 is (0.85)(90) + (1 - 0.85)(50) = 84. In this situation, the optimism-pessimism rule prescribes action a1.
If the decision maker's optimism is 0, then the optimism-pessimism rule collapses into the maximin rule because (0)(max(a_i)) + (1 - 0)(min(a_i)) = min(a_i). And if the decision maker's optimism is 1, then the optimism-pessimism rule collapses into the maximax rule. Thus, the optimism-pessimism rule turns out to be a generalization of the maximin and maximax rules. (Well, sort of. The maximin and maximax principles require only that we measure value on an ordinal scale, whereas the optimism-pessimism rule requires that we measure value on an interval scale.)
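The rule is easy to state in code. The sketch below reproduces the worked example above and the collapse to maximin and maximax at the two extreme degrees of optimism.

```python
# The optimism-pessimism (alpha-index) rule from section 6.3, with the values
# used in the worked example: min(a1)=10, max(a1)=100, min(a2)=50, max(a2)=90.

def alpha_index(act_payoffs, optimism):
    """optimism * best outcome + (1 - optimism) * worst outcome."""
    return optimism * max(act_payoffs) + (1 - optimism) * min(act_payoffs)

a1 = [10, 100]   # only the extremes matter to this rule
a2 = [50, 90]

print(alpha_index(a1, 0.85), alpha_index(a2, 0.85))  # 86.5 84.0 -> choose a1

# optimism = 0 reduces to maximin; optimism = 1 reduces to maximax
print(alpha_index(a1, 0.0), alpha_index(a1, 1.0))    # 10.0 100.0
```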
The optimism-pessimism rule pays attention to both the best-case and worst-case scenarios, but is it rational to ignore all the outcomes in between? Consider this example:
The maximum and minimum values for a1 and a2 are the same, so for every degree of optimism both acts are equally good. But it seems obvious that one should choose a2.
6.4. Other decision principles
Many other decision principles for dealing with decisions under ignorance have been proposed, including minimax regret, info-gap, and maxipok. For more details on making decisions under ignorance, see Peterson (2009) and Bossert et al. (2000).
One queer feature of the decision principles discussed in this section is that they willfully disregard some information relevant to making a decision. Such a move could make sense when trying to find a decision algorithm that performs well under tight limits on available computation (Brafman & Tennenholtz (2000)), but it's unclear why an ideal agent with infinite computing power (fit for a normative rather than a prescriptive theory) should willfully disregard information.
7. Can decisions under ignorance be transformed into decisions under uncertainty?
Can decisions under ignorance be transformed into decisions under uncertainty? This would simplify things greatly, because there is near-universal agreement that decisions under uncertainty should be handled by "maximizing expected utility" (see section 11 for clarifications), whereas decision theorists still debate what should be done about decisions under ignorance.
For Bayesians (see section 10), all decisions under ignorance are transformed into decisions under uncertainty (Winkler 2003, ch. 5) when the decision maker assigns an "ignorance prior" to each outcome for which they don't know how to assign a probability. (Another way of saying this is to say that a Bayesian decision maker never faces a decision under ignorance, because a Bayesian must always assign a prior probability to events.) One must then consider how to assign priors, an important debate among Bayesians (see section 10).
Many non-Bayesian decision theorists also think that decisions under ignorance can be transformed into decisions under uncertainty due to something called the principle of insufficient reason. The principle of insufficient reason prescribes that if you have literally no reason to think that one state is more probable than another, then one should assign equal probability to both states.
One objection to the principle of insufficient reason is that it is very sensitive to how states are individuated. Peterson (2009, ch. 3) explains:
An objection to the principle of insufficient reason
Advocates of the principle of insufficient reason might respond that one must consider symmetric states. For example, if someone gives you a die with n sides and you have no reason to think the die is biased, then you should assign a probability of 1/n to each side. But, Peterson notes:
8. How should I make decisions under uncertainty?
A decision maker faces a "decision under uncertainty" when she (1) knows which acts she could choose and which outcomes they may result in, and she (2) assigns probabilities to the outcomes.
Decision theorists generally agree that when facing a decision under uncertainty, it is rational to choose the act with the highest expected utility. This is the principle of expected utility maximization (EUM).
Decision theorists offer two kinds of justifications for EUM. The first has to do with the law of large numbers (see section 8.1). The second has to do with the axiomatic approach (see sections 8.2 through 8.6).
8.1. The law of large numbers
The "law of large numbers," which states that in the long run, if you face the same decision problem again and again and again, and you always choose the act with the highest expected utility, then you will almost certainly be better off than if you choose any other acts.
There are two problems with using the law of large numbers to justify EUM. The first problem is that the world is ever-changing, so we rarely if ever face the same decision problem "again and again and again." The law of large numbers says that if you face the same decision problem infinitely many times, then the probability that you could do better by not maximizing expected utility approaches zero. But you won't ever face the same decision problem infinitely many times! Why should you care what would happen if a certain condition held, if you know that condition will never hold?
The second problem with using the law of large numbers to justify EUM has to do with a mathematical theorem known as gambler's ruin. Imagine that you and I flip a fair coin, and I pay you $1 every time it comes up heads and you pay me $1 every time it comes up tails. We both start with $100. If we flip the coin enough times, one of us will face a situation in which the sequence of heads or tails is longer than we can afford. If a long-enough sequence of heads comes up, I'll run out of $1 bills with which to pay you. If a long-enough sequence of tails comes up, you won't be able to pay me. So in this situation, the law of large numbers guarantees that you will be better off in the long run by maximizing expected utility only if you start the game with an infinite amount of money (so that you never go broke), which is an unrealistic assumption. (For technical convenience, assume utility increases linearly with money. But the basic point holds without this assumption.)
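A quick simulation makes the gambler's ruin point vivid: with finite bankrolls and a fair coin, someone always goes broke eventually, even though each individual bet has an expected value of zero. This is only an illustrative Monte Carlo sketch, not part of the formal argument.

```python
# Gambler's ruin: two players with finite bankrolls flip a fair coin for $1 a
# flip. Every run of this simulation terminates with one player broke.
import random

def flips_until_ruin(bankroll_you=100, bankroll_me=100):
    """Return how many fair $1 coin flips it takes until someone is broke."""
    flips = 0
    while bankroll_you > 0 and bankroll_me > 0:
        flips += 1
        if random.random() < 0.5:   # heads: I pay you $1
            bankroll_you += 1
            bankroll_me -= 1
        else:                       # tails: you pay me $1
            bankroll_you -= 1
            bankroll_me += 1
    return flips

random.seed(0)
samples = [flips_until_ruin() for _ in range(200)]
print(min(samples), max(samples), sum(samples) / len(samples))
```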
8.2. The axiomatic approach
The other method for justifying EUM seeks to show that EUM can be derived from axioms that hold regardless of what happens in the long run.
In this section we will review perhaps the most famous axiomatic approach, from Von Neumann and Morgenstern (1947). Other axiomatic approaches include Savage (1954), Jeffrey (1983), and Anscombe & Aumann (1963).
8.3. The Von Neumann-Morgenstern utility theorem
The first decision theory axiomatization appeared in an appendix to the second edition of Von Neumann & Morgenstern's Theory of Games and Economic Behavior (1947). An important point to note up front is that, in this axiomatization, Von Neumann and Morgenstern take the options that the agent chooses between to not be acts, as we’ve defined them, but lotteries (where a lottery is a set of outcomes, each paired with a probability). As such, while discussing their axiomatization, we will talk of lotteries. (Despite making this distinction, acts and lotteries are closely related. Under the conditions of uncertainty that we are considering here, each act will be associated with some lottery and so preferences over lotteries could be used to determine preferences over acts, if so desired).
The key feature of the Von Neumann and Morgenstern axiomatization is a proof that if a decision maker states her preferences over a set of lotteries, and if her preferences conform to a set of intuitive structural constraints (axioms), then we can construct a utility function (on an interval scale) from her preferences over lotteries and show that she acts as if she maximizes expected utility with respect to that utility function.
What are the axioms to which an agent's preferences over lotteries must conform? There are four of them.
The completeness axiom states that the agent must bother to state a preference for each pair of lotteries. That is, the agent must prefer A to B, or prefer B to A, or be indifferent between the two.
The transitivity axiom states that if the agent prefers A to B and B to C, she must also prefer A to C.
The independence axiom states that, for example, if an agent prefers an apple to an orange, then she must also prefer the lottery [55% chance she gets an apple, otherwise she gets cholera] over the lottery [55% chance she gets an orange, otherwise she gets cholera]. More generally, this axiom holds that a preference must hold independently of the possibility of another outcome (e.g. cholera).
The continuity axiom holds that if the agent prefers A to B to C, then there exists a unique p (probability) such that the agent is indifferent between [p(A) + (1 - p)(C)] and [outcome B with certainty].
The continuity axiom requires more explanation. Suppose that A = $1 million, B = $0, and C = Death. If p = 0.5, then the agent's two lotteries under consideration for the moment are: Lottery 1, a 50% chance of $1 million and a 50% chance of Death; and Lottery 2, $0 with certainty.
Most people would not be indifferent between $0 with certainty and [50% chance of $1M, 50% chance of Death] — the risk of Death is too high! But if you have continuous preferences, there is some probability p for which you'd be indifferent between these two lotteries. Perhaps p is very, very high: Lottery 1, a 99.9999% chance of $1 million and a 0.0001% chance of Death; and Lottery 2, $0 with certainty.
Perhaps now you'd be indifferent between lottery 1 and lottery 2. Or maybe you'd be more willing to risk Death for the chance of winning $1M, in which case the p for which you'd be indifferent between lotteries 1 and 2 is lower than 0.999999. As long as there is some p at which you'd be indifferent between lotteries 1 and 2, your preferences are "continuous."
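If we already had an interval-scale utility function, the indifference point asserted by the continuity axiom could be computed directly from p·u(A) + (1 - p)·u(C) = u(B), which gives p = (u(B) - u(C)) / (u(A) - u(C)). A sketch with hypothetical utilities chosen to match the 0.999999 figure above:

```python
# Solving for the probability p at which the agent is indifferent between
# [p chance of A, (1 - p) chance of C] and [B for certain].
# The utility numbers are hypothetical, chosen only for illustration.

def indifference_probability(u_A, u_B, u_C):
    """Return p such that p*u_A + (1 - p)*u_C == u_B, assuming u_A > u_B > u_C."""
    return (u_B - u_C) / (u_A - u_C)

# Suppose u($1M) = 1.0, u($0) = 0.999999, u(Death) = 0.0 for an agent who
# fears Death far more than she values money.
p = indifference_probability(1.0, 0.999999, 0.0)
print(p)  # 0.999999, matching the indifference point discussed above
```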
Given this setup, Von Neumann and Morgenstern proved their theorem, which states that if the agent's preferences over lotteries obey their axioms, then there exists a utility function over outcomes, unique up to positive linear transformation, such that the agent prefers one lottery to another if and only if the first lottery has higher expected utility than the second. In other words, the agent acts as if she were maximizing expected utility.
8.4. VNM utility theory and rationality
An agent which conforms to the VNM axioms is sometimes said to be "VNM-rational." But why should "VNM-rationality" constitute our notion of rationality in general? How could VNM's result justify the claim that a rational agent maximizes expected utility when facing a decision under uncertainty? The argument goes like this: (premise 1) a rational agent's preferences over lotteries conform to the VNM axioms, and a rational agent chooses a lottery it prefers; (premise 2) if an agent's preferences conform to the VNM axioms, then the agent acts as if she maximizes expected utility; therefore (conclusion) a rational agent facing a decision under uncertainty acts as if she maximizes expected utility.
Von Neumann and Morgenstern proved premise 2, and the conclusion follows from premise 1 and 2. But why accept premise 1?
Few people deny that it would be irrational for an agent to choose a lottery which it does not prefer. But why is it irrational for an agent's preferences to violate the VNM axioms? I will save that discussion for section 8.6.
8.5. Objections to VNM-rationality
Several objections have been raised to Von Neumann and Morgenstern's result:
The VNM axioms are too strong. Some have argued that the VNM axioms are not self-evidently true. See section 8.6.
The VNM system offers no action guidance. A VNM-rational decision maker cannot use VNM utility theory for action guidance, because she must state her preferences over lotteries at the start. But if an agent can state her preferences over lotteries, then she already knows which lottery to choose. (For more on this, see section 9.)
In the VNM system, utility is defined via preferences over lotteries rather than preferences over outcomes. To many, it seems odd to define utility with respect to preferences over lotteries. Many would argue that utility should be defined in relation to preferences over outcomes or world-states, and that's not what the VNM system does. (Also see section 9.)
8.6. Should we accept the VNM axioms?
The VNM preference axioms define what it is for an agent to be VNM-rational. But why should we accept these axioms? Usually, it is argued that each of the axioms is pragmatically justified because an agent who violates the axioms can face situations in which they are guaranteed to end up worse off (from their own perspective).
In sections 8.6.1 and 8.6.2 I go into some detail about pragmatic justifications offered for the transitivity and completeness axioms. For more detail, including arguments about the justification of the other axioms, see Peterson (2009, ch. 8) and Anand (1993).
8.6.1. The transitivity axiom
Consider the money-pump argument in favor of the transitivity axiom ("if the agent prefers A to B and B to C, she must also prefer A to C").
An example of a money-pump argument
Similar arguments (e.g. Gustafsson 2010) aim to show that the other kind of intransitive preferences (acyclic preferences) are irrational, too.
(Of course, pragmatic arguments need not be framed in monetary terms. We could just as well construct an argument showing that an agent with intransitive preferences can be "pumped" of all their happiness, or all their moral virtue, or all their Twinkies.)
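The money-pump argument itself appears in the figure above and isn't reproduced here. The toy simulation below (illustrative numbers only) shows the standard version of the idea: an agent with the cyclic preferences A over B, B over C, and C over A, who will pay a small fee for any trade up to something she prefers, can be traded around the cycle and drained of money while ending up holding exactly what she started with.

```python
# A toy money pump against cyclic preferences A > B > C > A.
# The fee, starting wealth, and number of rounds are arbitrary illustration values.

prefers = {("A", "B"), ("B", "C"), ("C", "A")}   # (x, y) means x is preferred to y

def run_money_pump(rounds=10, fee=1.0):
    wealth = 100.0
    holding = "C"
    offers = ["B", "A", "C"] * rounds            # offers that walk around the preference cycle
    for offer in offers:
        if (offer, holding) in prefers:          # the agent prefers the offered item...
            wealth -= fee                        # ...so she pays the fee to trade up
            holding = offer
    return wealth, holding

print(run_money_pump())   # (70.0, 'C'): back where she started, minus 30 fees
```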
8.6.2. The completeness axiom
The completeness axiom ("the agent must prefer A to B, or prefer B to A, or be indifferent between the two") is often attacked by saying that some goods or outcomes are incommensurable — that is, they cannot be compared. For example, must a rational agent be able to state a preference (or indifference) between money and human welfare?
Perhaps the completeness axiom can be justified with a pragmatic argument. If you think it is rationally permissible to swap between two incommensurable goods, then one can construct a money pump argument in favor of the completeness axiom. But if you think it is not rational to swap between incommensurable goods, then one cannot construct a money pump argument for the completeness axiom. (In fact, even if it is rational to swap between incommensurable goods, Mandler (2005) has demonstrated that an agent who allows their current choices to depend on previous ones can avoid being money pumped.)
And in fact, there is a popular argument against the completeness axiom: the "small improvement argument." For details, see Chang (1997) and Espinoza (2007).
Note that in revealed preference theory, according to which preferences are revealed through choice behavior, there is no room for incommensurable preferences because every choice always reveals a preference relation of "better than," "worse than," or "equally as good as."
Another proposal for dealing with the apparent incommensurability of some goods (such as money and human welfare) is the multi-attribute approach:
For more on the multi-attribute approach, see Keeney & Raiffa (1993).
8.6.3. The Allais paradox
Having considered the transitivity and completeness axioms, we can now turn to independence (a preference holds independently of considerations of other possible outcomes). Do we have any reason to reject this axiom? Here's one reason to think we might: in a case known as the Allais paradox (Allais, 1953), it may seem reasonable to act in a way that contradicts independence.
The Allais paradox asks us to consider two decisions (this version of the paradox is based on Yudkowsky (2008)). The first decision involves the choice between:
(1A) A certain $24,000; and (1B) A 33/34 chance of $27,000 and a 1/34 chance of nothing.
The second involves the choice between:
(2A) A 34% chance of $24,000 and a 66% chance of nothing; and (2B) A 33% chance of $27,000 and a 67% chance of nothing.
Experiments have shown that many people prefer (1A) to (1B) and (2B) to (2A). However, these preferences contradict independence. Option 2A is the same as [a 34% chance of option 1A and a 66% chance of nothing] while 2B is the same as [a 34% chance of option 1B and a 66% chance of nothing]. So independence implies that anyone that prefers (1A) to (1B) must also prefer (2A) to (2B).
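The inconsistency can be checked directly. For any assignment of utilities to $0, $24,000 and $27,000, EU(2A) - EU(2B) = 0.34 x [EU(1A) - EU(1B)], so no expected utility maximizer can prefer (1A) to (1B) while also preferring (2B) to (2A). The Python sketch below (the candidate utility assignments are randomly generated for illustration, with u($0) fixed at 0) verifies this numerically.

```python
import random

def eu(lottery, u):
    """Expected utility of a lottery given as a list of (probability, prize) pairs."""
    return sum(p * u[prize] for p, prize in lottery)

L1A = [(1.0, 24000)]
L1B = [(33/34, 27000), (1/34, 0)]
L2A = [(0.34, 24000), (0.66, 0)]
L2B = [(0.33, 27000), (0.67, 0)]

# Try many random utility assignments with u($0) = 0 and u($24,000) <= u($27,000):
# none of them ranks 1A above 1B while also ranking 2B above 2A.
for _ in range(10_000):
    a, b = sorted(random.uniform(0, 1) for _ in range(2))
    u = {0: 0.0, 24000: a, 27000: b}
    prefers_1A = eu(L1A, u) > eu(L1B, u)
    prefers_2B = eu(L2B, u) > eu(L2A, u)
    assert not (prefers_1A and prefers_2B)

print("No expected-utility maximizer exhibits the common Allais pattern.")
```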
When this result was first uncovered, it was presented as evidence against the independence axiom. However, while the Allais paradox clearly reveals that independence fails as a descriptive account of choice, it’s less clear what it implies about the normative account of rational choice that we are discussing in this document. As noted in Peterson (2009, ch. 4), however:
Peterson goes on to note that many people, like the statistician Leonard Savage, argue that it is people's preferences in the Allais paradox that are in error, rather than the independence axiom. If so, then the paradox seems to reveal the danger of relying too heavily on intuition when determining the form that normative theories of rationality should take.
8.6.4. The Ellsberg paradox
The Allais paradox is far from the only case where people fail to act in accordance with EUM. Another well-known case is the Ellsberg paradox (the following is taken from Resnik, 1987):
If we guess the correct color, we will receive a payout of $100. In the Ellsberg paradox, many people bet yellow in situation A and red or blue in situation B. Further, many people make these decisions not because they are indifferent in both situations, and so happy to choose either way, but rather because they have a strict preference to choose in this manner.
The Ellsberg paradox
However, such behavior cannot be in accordance with EUM. In order for EUM to endorse a strict preference for choosing yellow in situation A, the agent would have to assign a probability of more than 1/3 to the ball selected being blue. On the other hand, in order for EUM to endorse a strict preference for choosing red or blue in situation B the agent would have to assign a probability of less than 1/3 to the selected ball being blue. As such, these decisions can’t be jointly endorsed by an agent following EUM.
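To spell the conflict out (assuming, as in the quoted setup, that the yellow balls are known to make up one third of the urn and that the red and blue balls make up the remaining two thirds in unknown proportion), the two strict preferences translate into incompatible constraints on Pr(blue):

```latex
\begin{align*}
\text{Situation A (yellow over red):} \quad
  & \Pr(\text{yellow}) > \Pr(\text{red})
    \iff \tfrac{1}{3} > \tfrac{2}{3} - \Pr(\text{blue})
    \iff \Pr(\text{blue}) > \tfrac{1}{3} \\
\text{Situation B (red-or-blue over yellow-or-blue):} \quad
  & \Pr(\text{red}) + \Pr(\text{blue}) > \Pr(\text{yellow}) + \Pr(\text{blue})
    \iff \Pr(\text{blue}) < \tfrac{1}{3}
\end{align*}
```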
Those who deny that decision making under ignorance can be transformed into decision making under uncertainty have an easy response to the Ellsberg paradox: since this case involves deciding under ignorance, it is irrelevant whether people's decisions violate EUM here, as EUM is not applicable to such situations.
Those who believe that EUM provides a suitable standard for choice in such situations, however, need to find some other way of responding to the paradox. As with the Allais paradox, there is some disagreement about how best to do so. Once again, however, many people, including Leonard Savage, argue that EUM reaches the right decision in this case. It is our intuitions that are flawed (see again Resnik (1987) for a nice summary of Savage’s argument to this conclusion).
8.6.5. The St Petersburg paradox
Another objection to the VNM approach (and to expected utility approaches generally), the St. Petersburg paradox, draws on the possibility of infinite utilities. The paradox is based on a game in which a fair coin is tossed until it lands heads up. At that point, the agent receives a prize worth 2^n utility, where n is the number of times the coin was tossed during the game. The so-called paradox occurs because the expected utility of choosing to play this game is infinite, and so, according to a standard expected utility approach, the agent should be willing to pay any finite amount to play it. However, this seems unreasonable. Instead, it seems that the agent should only be willing to pay a relatively small amount to do so. As such, it seems that the expected utility approach gets something wrong.
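The infinite expectation is easy to verify: the first head arrives on toss n with probability (1/2)^n, and that outcome pays 2^n utility, so every term of the expected-utility sum contributes exactly 1.

```latex
\mathrm{EU}(\text{play}) \;=\; \sum_{n=1}^{\infty} \left(\tfrac{1}{2}\right)^{\!n} \cdot 2^{n} \;=\; \sum_{n=1}^{\infty} 1 \;=\; \infty
```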
Various responses have been suggested. Most obviously, we could say that the paradox does not apply to VNM agents, since the VNM theorem assigns real numbers to all lotteries, and infinity is not a real number. But it's unclear whether this escapes the problem. After all, at its core, the St. Petersburg paradox is not about infinite utilities but rather about cases where expected utility approaches seem to overvalue some choice, and such cases arise even when all utilities are finite. For example, if we let L be a finite limit on utility, we could consider the following scenario (from Peterson, 2009, p. 85):
In this case, even if an extremely low value is set for L, it seems that paying this amount to play the game is unreasonable. After all, as Peterson notes, about nine times out of ten an agent that plays this game will win no more than 8 × 10^-100 utility. If paying 1 utility is, in fact, unreasonable in this case, then simply limiting an agent's utility to some finite value doesn't provide a defence of expected utility approaches. (Other problems abound. See Yudkowsky, 2007 for an interesting finite problem and Nover & Hajek, 2004 for a particularly perplexing problem with links to the St Petersburg paradox.)
As it stands, there is no agreement about precisely what the St Petersburg paradox reveals. Some people accept one of the various resolutions of the case and so find the paradox unconcerning. Others think the paradox reveals a serious problem for expected utility theories. Still others think the paradox is unresolved but don't think that we should respond by abandoning expected utility theory.
9. Does axiomatic decision theory offer any action guidance?
For the decision theories listed in section 8.2, it's often claimed the answer is "no." To explain this, I must first examine some differences between direct and indirect approaches to axiomatic decision theory.
Peterson (2009, ch. 4) explains:
The axiomatic decision theories listed in section 8.2 all follow the indirect approach. These theories, it might be said, cannot offer any action guidance because they require an agent to state its preferences over acts "up front." But an agent that states its preferences over acts already knows which act it prefers, so the decision theory can't offer any action guidance not already present in the agent's own stated preferences over acts.
Peterson (2009, ch. 10) gives a practical example:
Thus, it seems only the direct approach offers an agent any action guidance. But the direct approach is very recent (Peterson 2008; Cozic 2011), and only time will show whether it can stand up to professional criticism.
Warning: Peterson's (2008) direct approach is confusingly called "non-Bayesian decision theory" despite assuming Bayesian probability theory.
For other attempts to pull action guidance from normative decision theory, see Fallenstein (2012) and Stiennon (2013).
10. How does probability theory play a role in decision theory?
In order to calculate the expected utility of an act (or lottery), it is necessary to determine a probability for each outcome. In this section, I will explore some of the details of probability theory and its relationship to decision theory.
For further introductory material to probability theory, see Howson & Urbach (2005), Grimmett & Stirzaker (2001), and Koller & Friedman (2009). This section draws heavily on Peterson (2009, chs. 6 & 7), which provides a very clear introduction to probability in the context of decision theory.
10.1. The basics of probability theory
Intuitively, a probability is a number between 0 and 1 that describes how likely an event is to occur. On this intuitive picture, an event with probability 0 is impossible and an event with probability 1 is certain to occur; for values in between, the higher the number, the more probable the event.
As with EUM, probability theory can be derived from a small number of simple axioms. In the probability case there are three of these, named the Kolmogorov axioms after the mathematician Andrey Kolmogorov. The first states that probabilities are real numbers between 0 and 1. The second, that if a set of events is mutually exclusive and exhaustive, then the probabilities of these events sum to 1. The third, that if two events are mutually exclusive, then the probability that one or the other of them occurs is equal to the sum of their individual probabilities.
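Stated compactly (this is just a restatement of the three axioms above, where E_1, ..., E_n is any mutually exclusive and exhaustive set of events and A and B are any two mutually exclusive events):

```latex
\begin{align*}
\text{(K1)} \quad & 0 \le \Pr(A) \le 1 \\
\text{(K2)} \quad & \Pr(E_1) + \Pr(E_2) + \dots + \Pr(E_n) = 1 \\
\text{(K3)} \quad & \Pr(A \vee B) = \Pr(A) + \Pr(B)
\end{align*}
```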
From these three axioms, the remainder of probability theory can be derived. In the remainder of this section, I will explore some aspects of this broader theory.
10.2. Bayes theorem for updating probabilities
From the perspective of decision theory, one particularly important aspect of probability theory is the idea of a conditional probability. These represent how probable something is given a piece of information. So, for example, a conditional probability could represent how likely it is that it will be raining, conditioning on the fact that the weather forecaster predicted rain. A powerful technique for calculating conditional probabilities is Bayes theorem (see Yudkowsky, 2003 for a detailed introduction). This formula states that:
P(A|B) = P(B|A) × P(A) / P(B)
Bayes theorem is used to calculate the probability of some event, A, given some evidence, B. As such, the formula can be used to update probabilities based on new evidence. If you are trying to predict the probability that it will rain tomorrow, and someone tells you that the weather forecaster has predicted rain, this formula tells you how to calculate a new probability of rain that takes this information into account. The initial probability (before the information is taken into account) is called the prior probability, and the result of applying Bayes theorem is the new, posterior probability.
Using Bayes theorem to update probabilities based on the evidence provided by a weather forecast
Bayes theorem can be seen as solving the problem of how to update prior probabilities based on new information. However, it leaves open the question of how to determine the prior probability in the first place. In some cases, there will be no obvious way to do so. One solution to this problem suggests that any reasonable prior can be selected. Given enough evidence, repeated applications of Bayes theorem will lead this prior probability to be updated to much the same posterior probability, even for people with widely different initial priors. As such, the initially selected prior is less crucial than it may at first seem.
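As a concrete illustration (all of the numbers here are invented for the example): suppose forecasts of rain occur 80% of the time when it actually rains and 20% of the time when it doesn't. The sketch below applies Bayes theorem once to a prior of 0.3, and then repeats the update over several independent "rain" forecasts to show two agents with very different priors arriving at similar posteriors.

```python
def bayes_update(prior_rain, p_forecast_given_rain=0.8, p_forecast_given_dry=0.2):
    """Posterior probability of rain after seeing a 'rain' forecast.

    P(rain | forecast) = P(forecast | rain) * P(rain) / P(forecast),
    where P(forecast) is obtained by summing over both possibilities.
    """
    p_forecast = (p_forecast_given_rain * prior_rain
                  + p_forecast_given_dry * (1 - prior_rain))
    return p_forecast_given_rain * prior_rain / p_forecast

# A single update from a prior of 0.3.
print(round(bayes_update(0.3), 3))               # -> 0.632

# Two agents with very different priors converge after repeated 'rain' forecasts.
optimist, pessimist = 0.9, 0.05
for _ in range(5):
    optimist = bayes_update(optimist)
    pessimist = bayes_update(pessimist)
print(round(optimist, 3), round(pessimist, 3))   # both now close to 1
```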
10.3. How should probabilities be interpreted?
There are two main views about what probabilities mean: objectivism and subjectivism. Loosely speaking, the objectivist holds that probabilities tell us something about the external world while the subjectivist holds that they tell us something about our beliefs. Most decision theorists hold a subjectivist view about probability. According to this sort of view, probabilities represent subjective degrees of belief. So to say the probability of rain is 0.8 is to say that the agent under consideration has a high degree of belief that it will rain (see Jaynes, 2003 for a defense of this view). Note that, on this view, another agent in the same circumstances could assign a different probability to rain.
10.3.1. Why should degrees of belief follow the laws of probability?
One question that might be raised against the subjective account of probability is why, on this account, our degrees of belief should satisfy the Kolmogorov axioms. For example, why should our subjective degrees of belief in mutually exclusive, exhaustive events add to 1? One answer to this question shows that agents whose degrees of belief don’t satisfy these axioms will be subject to Dutch Book bets. These are bets where the agent will inevitably lose money. Peterson (2009, ch. 7) explains:
A Dutch Book argument
It can be proven that an agent is subject to Dutch Book bets if, and only if, their degrees of belief violate the axioms of probability. This provides an argument for why degrees of belief should satisfy these axioms.
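To illustrate (the 0.6/0.6 degrees of belief, the $1 stakes, and the bookie are all invented for the example): suppose an agent's degrees of belief in the mutually exclusive, exhaustive events "rain" and "no rain" are both 0.6, violating the second axiom. A bookie who prices each bet at the agent's own degree of belief can sell the agent both bets and profit no matter what the weather does.

```python
# Dutch Book sketch: degrees of belief in "rain" and "no rain" sum to 1.2 > 1.
# A $1-stake bet on event E pays $1 if E occurs and costs belief(E) dollars,
# a price the agent regards as fair given her degrees of belief.
beliefs = {"rain": 0.6, "no rain": 0.6}
stake = 1.0

cost_of_both_bets = sum(beliefs.values()) * stake        # the agent pays $1.20 up front

for outcome in ["rain", "no rain"]:
    payout = stake                                       # exactly one of the two bets pays off
    net = payout - cost_of_both_bets
    print(outcome, round(net, 2))                        # -0.2 in either case

# Whatever the weather, the agent is guaranteed to lose $0.20: a Dutch Book.
```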
10.3.2. Measuring subjective probabilities
Another challenge raised by the subjective view is how probabilities can be measured. If probabilities represent subjective degrees of belief, there doesn't seem to be an easy way to determine them from observations of the world. However, a number of responses to this problem have been advanced, one of which is explained succinctly by Peterson (2009, ch. 7):
Other theorists, for example DeGroot (1970), propose other approaches:
11. What about "Newcomb's problem" and alternative decision algorithms?
Saying that a rational agent "maximizes expected utility" is, unfortunately, not specific enough. There are a variety of decision algorithms which aim to maximize expected utility, and they give different answers to some decision problems, for example "Newcomb's problem."
In this section, we explain these decision algorithms and show how they perform on Newcomb's problem and related "Newcomblike" problems.
General sources on this topic include: Campbell & Sowden (1985), Ledwig (2000), Joyce (1999), and Yudkowsky (2010). Moertelmaier (2013) discusses Newcomblike problems in the context of the agent-environment framework.
11.1. Newcomblike problems and two decision algorithms
I'll begin with an exposition of several Newcomblike problems, so that I can refer to them in later sections. I'll also introduce our first two decision algorithms, so that I can show how one's choice of decision algorithm affects an agent's outcomes on these problems.
11.1.1. Newcomb's Problem
Newcomb's problem was formulated by the physicist William Newcomb but first published in Nozick (1969). Below I present a version of it inspired by Yudkowsky (2010).
A superintelligent machine named Omega visits Earth from another galaxy and shows itself to be very good at predicting events. This isn't because it has magical powers, but because it knows more science than we do, has billions of sensors scattered around the globe, and runs efficient algorithms for modeling humans and other complex systems with unprecedented precision — on an array of computer hardware the size of our moon.
Omega presents you with two boxes. Box A is transparent and contains $1000. Box B is opaque and contains either $1 million or nothing. You may choose to take both boxes (called "two-boxing"), or you may choose to take only box B (called "one-boxing"). If Omega predicted you'll two-box, then Omega has left box B empty. If Omega predicted you'll one-box, then Omega has placed $1M in box B.
By the time you choose, Omega has already left for its next game — the contents of box B won't change after you make your decision. Moreover, you've watched Omega play a thousand games against people like you, and on every occasion Omega predicted the human player's choice accurately.
Should you one-box or two-box?
Newcomb’s problem
Here's an argument for two-boxing. The $1M either is or is not in the box; your choice cannot affect the contents of box B now. So, you should two-box, because then you get $1K plus whatever is in box B. This is a straightforward application of the dominance principle (section 6.1): two-boxing dominates one-boxing.
Convinced? Well, here's an argument for one-boxing. On all those earlier games you watched, everyone who two-boxed received $1K, and everyone who one-boxed received $1M. So you're almost certain that you'll get $1K for two-boxing and $1M for one-boxing, which means that to maximize your expected utility, you should one-box.
Nozick (1969) reports:
This is not a "merely verbal" dispute (Chalmers 2011). Decision theorists have offered different algorithms for making a choice, and they have different outcomes. Translated into English, the first algorithm (evidential decision theory or EDT) says "Take actions such that you would be glad to receive the news that you had taken them." The second algorithm (causal decision theory or CDT) says "Take actions which you expect to have a positive effect on the world."
Many decision theorists have the intuition that CDT is right. But a CDT agent appears to "lose" on Newcomb's problem, ending up with $1000, while an EDT agent gains $1M. Proponents of EDT can ask proponents of CDT: "If you're so smart, why aren't you rich?" As Spohn (2012) writes, "this must be poor rationality that complains about the reward for irrationality." Or as Yudkowsky (2010) argues:
In response to EDT's apparent "win" over CDT on Newcomb's problem, proponents of CDT have presented similar problems on which a CDT agent "wins" and an EDT agent "loses." Proponents of EDT, meanwhile, have replied with additional Newcomblike problems on which EDT wins and CDT loses. Let's explore each of them in turn.
11.1.2. Evidential and causal decision theory
First, however, we will consider our two decision algorithms in a little more detail.
EDT can be described simply: according to this theory, agents should use conditional probabilities when determining the expected utility of different acts. Specifically, they should use the probability of the world being in each possible state conditioning on them carrying out the act under consideration. So in Newcomb’s problem they consider the probability that Box B contains $1 million or nothing conditioning on the evidence provided by their decision to one-box or two-box. This is how the theory formalizes the notion of an act providing good news.
CDT is more complex, at least in part because it has been formulated in a variety of different ways, and these formulations are equivalent to one another only if certain background assumptions are met. However, a good sense of the theory can be gained by considering the counterfactual approach, which is one of the more intuitive of these formulations. This approach utilizes the probabilities of certain counterfactual conditionals, which can be thought of as representing the causal influence of an agent's acts on the state of the world. These conditionals take the form "if I were to carry out a certain act, then the world would be in a certain state." So in Newcomb's problem, for example, this formulation of CDT considers the probabilities of counterfactuals like "if I were to one-box, then Box B would contain $1 million" and, in doing so, considers the causal influence of one-boxing on the contents of the boxes.
The same distinction can be made in formulaic terms. Both EDT and CDT agree that decision theory should be about maximizing expected utility where the expected utility of an act, A, given a set of possible outcomes, O, is defined as follows:
EU(A) = Σ_O Pr_A(O) × V(A & O)
In this equation, V(A & O) represents the value to the agent of the combination of an act and an outcome. So this is the utility that the agent will receive if they carry out a certain act and a certain outcome occurs. Further, Pr_A(O) represents the probability of each outcome occurring on the supposition that the agent carries out a certain act. It is in terms of this probability that CDT and EDT differ. EDT uses the conditional probability, Pr(O|A), while CDT uses the probability of the subjunctive conditional, Pr(A □→ O).
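For reference, the two versions of the formula can be set side by side (this simply restates the definitions just given; □→ is the standard symbol for the subjunctive conditional):

```latex
\begin{align*}
\mathrm{EU}_{\mathrm{EDT}}(A) &= \sum_{O} \Pr(O \mid A)\, V(A \,\&\, O) \\
\mathrm{EU}_{\mathrm{CDT}}(A) &= \sum_{O} \Pr(A \;\Box\!\!\rightarrow O)\, V(A \,\&\, O)
\end{align*}
```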
Using these two versions of the expected utility formula, it's possible to demonstrate in a formal manner why EDT and CDT give the advice they do in Newcomb's problem. To demonstrate this it will help to make two simplifying assumptions. First, we will presume that each dollar of money is worth 1 unit of utility to the agent (and so will presume that the agent's utility is linear with money). Second, we will presume that Omega is a perfect predictor of human actions so that if the agent two-boxes it provides definitive evidence that there is nothing in the opaque box and if the agent one-boxes it provides definitive evidence that there is $1 million in this box. Given these assumptions, EDT calculates the expected utility of each decision as follows:
EU for two-boxing according to EDT
EU for one-boxing according to EDT
Given that one-boxing has a higher expected utility according to these calculations, an EDT agent will one-box.
On the other hand, given that the agent's decision doesn't causally influence Omega's earlier prediction, CDT will use the same probabilities regardless of whether the agent one-boxes or two-boxes. The decision endorsed will be the same regardless of which probabilities we use, so, to demonstrate the theory, we can arbitrarily assign a 0.5 probability to the opaque box containing nothing and a 0.5 probability to it containing one million dollars. CDT then calculates the expected utility of each decision as follows:
EU for two-boxing according to CDT
EU for one-boxing according to CDT
Given that two-boxing has a higher expected utility according to these calculations, a CDT agent will two-box. This approach demonstrates the result given more informally in the previous section: CDT agents will two-box in Newcomb's problem and EDT agents will one-box.
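The calculations summarized above can be reproduced with a few lines of arithmetic. The sketch below is written under the simplifying assumptions just stated (utility linear in dollars, Omega a perfect predictor, and the arbitrary 0.5/0.5 probabilities for CDT); the variable names and layout are mine.

```python
# Payoffs (in utility = dollars) for each act/state combination in Newcomb's problem.
payoff = {
    ("one-box", "full"):  1_000_000,
    ("one-box", "empty"):         0,
    ("two-box", "full"):  1_001_000,
    ("two-box", "empty"):     1_000,
}

# EDT: probability of each state conditional on the act (Omega is a perfect predictor).
pr_edt = {
    "one-box": {"full": 1.0, "empty": 0.0},
    "two-box": {"full": 0.0, "empty": 1.0},
}

# CDT: the act has no causal influence on the already-filled box, so the same
# (arbitrarily chosen) 0.5/0.5 probabilities are used for both acts.
pr_cdt = {
    "one-box": {"full": 0.5, "empty": 0.5},
    "two-box": {"full": 0.5, "empty": 0.5},
}

def eu(act, pr):
    return sum(pr[act][state] * payoff[(act, state)] for state in ("full", "empty"))

for name, pr in [("EDT", pr_edt), ("CDT", pr_cdt)]:
    print(name, {act: eu(act, pr) for act in ("one-box", "two-box")})
# EDT: one-box 1,000,000 vs two-box     1,000  -> one-box
# CDT: one-box   500,000 vs two-box   501,000  -> two-box
```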
As mentioned before, there are also alternative formulations of CDT. David Lewis (1981) and Brian Skyrms (1980), for example, both present approaches that rely on partitioning the world into states that capture causal information, rather than on counterfactual conditionals. On Lewis's version of this account, the agent calculates the expected utility of acts using their unconditional credence in states of the world that are dependency hypotheses, which are descriptions of the possible ways that the world can depend on the agent's actions. These dependency hypotheses intrinsically contain the required causal information.
Other traditional approaches to CDT include the imaging approach of Sobel (1980) (also see Lewis 1981) and the unconditional expectations approach of Leonard Savage (1954). Those interested in the various traditional approaches to CDT would be best to consult Lewis (1981), Weirich (2008), and Joyce (1999). More recently, work in computer science on a tool called causal Bayesian networks has led to an innovative approach to CDT that has received some recent attention in the philosophical literature (Pearl 2000, ch. 4 and Spohn 2012).
Now we return to an analysis of decision scenarios, armed with EDT and the counterfactual formulation of CDT.
11.1.3. Medical Newcomb problems
Medical Newcomb problems share a similar form but come in many variants, including Solomon's problem (Gibbard & Harper 1976) and the smoking lesion problem (Egan 2007). Below I present a variant called the "chewing gum problem" (Yudkowsky 2010):
One response to the claim that EDT decides wrongly in such cases, called the tickle defense (Eells, 1981), argues that EDT actually reaches the right decision here. According to this defense, the most reasonable way to construe the "chewing gum problem" involves presuming that CGTA causes a desire (a mental "tickle") which then causes the agent to be more likely to chew gum, rather than CGTA directly causing the action. Given this, if we presume that the agent already knows their own desires and hence already knows whether they're likely to have the CGTA gene, chewing gum will not provide the agent with further bad news. Consequently, an agent following EDT will chew in order to get the good news that they have decreased their chance of getting abscesses.
Unfortunately, the tickle defense fails to achieve its aims. In introducing this approach, Eells hoped that EDT could be made to mimic CDT but without an allegedly inelegant reliance on causation. However, Sobel (1994, ch. 2) demonstrated that the tickle defense failed to ensure that EDT and CDT would decide equivalently in all cases. On the other hand, those who feel that EDT originally got it right by one-boxing in Newcomb’s problem will be disappointed to discover that the tickle defense leads an agent to two-box in some versions of Newcomb’s problem and so solves one problem for the theory at the expense of introducing another.
So just as CDT “loses” on Newcomb’s problem, EDT will "lose” on Medical Newcomb problems (if the tickle defense fails) or will join CDT and "lose" on Newcomb’s Problem itself (if the tickle defense succeeds).
11.1.4. Newcomb's soda
There are also similar problematic cases for EDT where the evidence provided by your decision relates not to a feature that you were born (or created) with but to some other feature of the world. One such scenario is the Newcomb’s soda problem, introduced in Yudkowsky (2010):
Newcomb’s soda
In this case, an EDT agent will decide to eat chocolate ice cream as this would provide evidence that they drank the chocolate soda and hence that they will receive $1 million after the experiment. However, this seems to be the wrong decision and so, once again, the EDT agent “loses”.
11.1.5. Bostrom's meta-Newcomb problem
In response to attacks on their theory, the proponent of EDT can present alternative scenarios where EDT “wins” and it is CDT that “loses”. One such case is the meta-Newcomb problem proposed in Bostrom (2001). Adapted to fit my earlier story about Omega the superintelligent machine (section 11.1.1), the problem runs like this: Either Omega has already placed $1M or nothing in box B (depending on its prediction about your choice), or else Omega is watching as you choose and after your choice it will place $1M into box B only if you have one-boxed. But you don't know which is the case. Omega makes its move before the human player's choice about half the time, and the rest of the time it makes its move after the player's choice.
But now suppose there is another superintelligent machine, Meta-Omega, who has a perfect track record of predicting both Omega's choices and the choices of human players. Meta-Omega tells you that either you will two-box and Omega will "make its move" after you make your choice, or else you will one-box and Omega has already made its move (and gone on to the next game, with someone else).
Here, an EDT agent one-boxes and walks away with a million dollars. On the face of it, however, a CDT agent faces a dilemma: if she two-boxes then Omega's action depends on her choice, so the "rational" choice is to one-box. But if the CDT agent one-boxes, then Omega's action temporally precedes (and is thus physically independent of) her choice, so the "rational" action is to two-box. It might seem, then, that a CDT agent will be unable to reach any decision in this scenario. However, further reflection reveals that the issue is more complicated. According to CDT, what the agent ought to do in this scenario depends on their credences about their own actions. If they have a high credence that they will two-box, they ought to one-box and if they have a high credence that they will one-box, they ought to two box. Given that the agent's credences in their actions are not given to us in the description of the meta-Newcomb problem, the scenario is underspecified and it is hard to know what conclusions should be drawn from it.
11.1.6. The psychopath button
Fortunately, another case has been introduced where, according to CDT, what an agent ought to do depends on their credences about what they will do. This is the psychopath button, introduced in Egan (2007):
Many people think Paul should not press the button. After all, if he does so, he is almost certainly a psychopath, and so pressing the button will almost certainly cause his death. This is also the response that an EDT agent will give: pushing the button would provide the agent with the bad news that they are almost certainly a psychopath and so will die as a result of their action.
On the other hand, if Paul is fairly certain that he is not a psychopath, then CDT will say that he ought to press the button. CDT will note that, given Paul’s confidence that he isn’t a psychopath, his decision will almost certainly have a positive impact as it will result in the death of all psychopaths and Paul’s survival. On the face of it, then, a CDT agent would decide inappropriately in this case by pushing the button. Importantly, unlike in the meta-Newcomb problem, the agent's credences about their own behavior are specified in Egan's full version of this scenario (in non-numeric terms, the agent thinks they're unlikely to be a psychopath and hence unlikely to press the button).
However, in order to generate this problem for CDT, Egan made a number of assumptions about how an agent should decide when what they ought to do depends on what they think they will do. In response, alternative views about deciding in such cases have been advanced (particularly in Arntzenius, 2008 and Joyce, 2012). Given these factors, opinions are split about whether the psychopath button problem does in fact pose a challenge to CDT.
11.1.7. Parfit's hitchhiker
Not all decision scenarios are problematic for just one of EDT and CDT. There are also cases where an EDT agent and a CDT agent will both "lose". One such case is Parfit's Hitchhiker (Parfit, 1984, p. 7):
In this scenario the agent "loses" if they would later refuse to give the stranger the reward. However, both EDT agents and CDT agents will refuse to do so. After all, by this point the agent will already be safe so giving the reward can neither provide good news about, nor cause, their safety. So this seems to be a case where both theories “lose”.
11.1.8. Transparent Newcomb's problem
There are also other cases where both EDT and CDT "lose". One of these is the Transparent Newcomb's problem which, in at least one version, is due to Drescher (2006, pp. 238-242). This scenario is like the original Newcomb's problem but, in this case, both boxes are transparent, so you can see their contents when you make your decision. Again, Omega has filled Box A with $1000 and Box B with either $1 million or nothing based on a prediction of your behavior. Specifically, Omega has predicted how you would decide if you witnessed $1 million in Box B. If Omega predicted that you would one-box in that case, it placed $1 million in Box B. On the other hand, if Omega predicted that you would two-box in that case, it placed nothing in Box B.
Both EDT and CDT agents will two-box in this case. After all, the contents of the boxes are determined and known so the agent's decision can neither provide good news about what they contain nor cause them to contain something desirable. As with two-boxing in the original version of Newcomb’s problem, many philosophers will endorse this behavior.
However, it’s worth noting that Omega will almost certainly have predicted this decision and so filled Box B with nothing. CDT and EDT agents will end up with $1000. On the other hand, just as in the original case, the agent that one-boxes will end up with $1 million. So this is another case where both EDT and CDT “lose”. Consequently, to those that agree with the earlier comments (in section 11.1.1) that a decision theory shouldn't lead an agent to "lose", neither of these theories will be satisfactory.
11.1.9. Counterfactual mugging
Another similar case, known as counterfactual mugging, was developed in Nesov (2009):
Should you give up the $100?
Both CDT and EDT say no. After all, giving up your money neither provides good news about nor influences your chances of getting $10,000 out of the exchange. Further, this intuitively seems like the right decision. On the face of it, then, it is appropriate to retain your money in this case.
However, presuming you take Omega to be perfectly trustworthy, there seems to be room to debate this conclusion. If you are the sort of agent that gives up the $100 in counterfactual mugging then you will tend to do better than the sort of agent that won’t give up the $100. Of course, in the particular case at hand you will lose but rational agents often lose in specific cases (as, for example, when such an agent loses a rational bet). It could be argued that what a rational agent should not do is be the type of agent that loses. Given that agents that refuse to give up the $100 are the type of agent that loses, there seem to be grounds to claim that counterfactual mugging is another case where both CDT and EDT act inappropriately.
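The sense in which the paying type of agent "tends to do better" can be made concrete. Assuming Omega's coin is fair and the stakes are those given above ($100 paid on the losing side of the coin, $10,000 received on the winning side by agents Omega predicts would have paid), the expected payoff of each disposition, evaluated before the coin is tossed, is:

```latex
\mathrm{E}[\text{paying type}] = \tfrac{1}{2}(\$10{,}000) + \tfrac{1}{2}(-\$100) = \$4{,}950
\qquad\qquad
\mathrm{E}[\text{refusing type}] = \tfrac{1}{2}(\$0) + \tfrac{1}{2}(\$0) = \$0
```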
11.1.10. Prisoner's dilemma
Before moving on to a more detailed discussion of various possible decision theories, I'll consider one final scenario: the prisoner's dilemma. Resnik (1987, pp. 147-148) outlines this scenario as follows:
The decision matrix of each vandal will be as follows:
Faced with this scenario, a CDT agent will confess. After all, the agent’s decision can’t influence their partner’s decision (they’ve been isolated from one another) and so the agent is better off confessing regardless of what their partner chooses to do. According to the majority of decision (and game) theorists, confessing is in fact the rational decision in this case.
Despite this, however, an EDT agent may lie in a prisoner’s dilemma. Specifically, if they think that their partner is similar enough to them, the agent will lie because doing so will provide the good news that they will both lie and hence that they will both get two years in jail (good news as compared with the bad news that they will both confess and hence that they will get 10 years in jail).
To many people, there seems to be something compelling about this line of reasoning. For example, Douglas Hofstadter (1985, pp. 737-780) has argued that an agent acting "superrationally" would co-operate with other superrational agents for precisely this sort of reason: a superrational agent would take into account the fact that other such agents will go through the same thought process in the prisoner's dilemma and so make the same decision. As such, it is better that the decision that both agents reach be to lie than that it be to confess. More broadly, it could perhaps be argued that a rational agent should lie in the prisoner's dilemma as long as they believe that they are similar enough to their partner that they are likely to reach the same decision.
An argument for cooperation in the prisoners’ dilemma
It is unclear, then, precisely what should be concluded from the prisoner’s dilemma. However, for those that are sympathetic to Hofstadter’s point or the line of reasoning appealed to by the EDT agent, the scenario seems to provide an additional reason to seek out an alternative theory to CDT.
11.2. Benchmark theory (BT)
One recent response to the apparent failure of EDT to decide appropriately in medical Newcomb problems and CDT to decide appropriately in the psychopath button is Benchmark Theory (BT) which was developed in Wedgwood (2011) and discussed further in Briggs (2010).
In English, we could think of this decision algorithm as saying that agents should decide so as to give their future self good news about how well off they are compared to how well off they could have been. In formal terms, BT uses the following formula to calculate the expected utility of an act, A:
EU(A) = Σ_O Pr(O|A) × V′(A & O)
In other words, it uses the conditional probability, as in EDT but calculates the value differently (as indicated by the use of V’ rather than V). V’ is calculated relative to a benchmark value in order to give a comparative measure of value (both of the above sources go into more detail about this process).
Taking the informal perspective, in the chewing gum problem, BT will note that by chewing gum, the agent will always get the good news that they are comparatively better off than they could have been (because chewing gum helps control throat abscesses) whereas by not chewing, the agent will always get the bad news that they could have been comparatively better off by chewing. As such, a BT agent will chew in this scenario.
Further, BT seems to reach what many consider to be the right decision in the psychopath button. In this case, the BT agent will note that if they push the button they will get the bad news that they are almost certainly a psychopath, and so that they would have been comparatively much better off by not pushing (as pushing will kill them). On the other hand, if they don't push, they will get the less bad news that they are almost certainly not a psychopath and so could have been comparatively a little better off if they had pushed the button (as this would have killed all the psychopaths but not them). So refraining from pushing the button gives the less bad news and so is the rational decision.
On the face of it, then, there seem to be strong reasons to find BT compelling: it decides appropriately in these scenarios while, according to some people, EDT and CDT only decide appropriately in one or the other of them.
Unfortunately, a BT agent will fail to decide appropriately in other scenarios. First, those that hold that one-boxing is the appropriate decision in Newcomb’s problem will immediately find a flaw in BT. After all, in this scenario two-boxing gives the good news that the agent did comparatively better than they could have done (because they gain the $1000 from Box A which is more than they would have received otherwise) while one-boxing brings the bad news that they did comparatively worse than they could have done (as they did not receive this money). As such, a BT agent will two-box in Newcomb’s problem.
Further, Briggs (2010) argues, though Wedgwood (2011) denies, that BT suffers from other problems. As such, even for those who support two-boxing in Newcomb’s problem, it could be argued that BT doesn’t represent an adequate theory of choice. It is unclear, then, whether BT is a desirable replacement to alternative theories.
11.3. Timeless decision theory (TDT)
Yudkowsky (2010) offers another decision algorithm, timeless decision theory or TDT (see also Altair, 2013). Specifically, TDT is intended as an explicit response to the idea that a theory of rational choice should lead an agent to “win”. As such, it will appeal to those who think it is appropriate to one-box in Newcomb’s problem and chew in the chewing gum problem.
In English, this algorithm can be approximated as saying that an agent ought to choose as if CDT were right but they were determining not their actual decision but rather the result of the abstract computation of which their decision is one concrete instance. Formalizing this decision algorithm would require a substantial document in its own right and so will not be carried out in full here. Briefly, however, TDT is built on top of causal Bayesian networks (Pearl, 2000) which are graphs where the arrows represent causal influence. TDT supplements these graphs by adding nodes representing abstract computations and taking the abstract computation that determines an agent’s decision to be the object of choice rather than the concrete decision itself (see Yudkowsky, 2010 for a more detailed description).
Returning to an informal discussion, an example will help clarify the form taken by TDT: imagine that two perfect replicas of a person are placed in identical rooms and asked to make the same decision. While each replica will make their own decision, in doing so, they will be carrying out the same computational process. As such, TDT will say that the replicas ought to act as if they are determining the result of this process and hence as if they are deciding the behavior of both copies.
Something similar can be said about Newcomb's problem. In this case it is almost as if there is again a replica of the agent: Omega's model of the agent, which it used to predict the agent's behavior. Both the original agent and this "replica" respond to the same abstract computational process. In other words, both Omega's prediction and the agent's behavior are influenced by this process. As such, TDT advises the agent to act as if they are determining the result of this process and, hence, as if they can determine Omega's box-filling behavior. A TDT agent will therefore one-box in order to determine the result of this abstract computation in a way that leads to $1 million being placed in Box B.
TDT also succeeds in other areas. For example, in the chewing gum problem there is no “replica” agent so TDT will decide in line with standard CDT and choose to chew gum. Further, in the prisoner’s dilemma, a TDT agent will lie if its partner is another TDT agent (or a relevantly similar agent). After all, in this case both agents will carry out the same computational process and so TDT will advise that the agent act as if they are determining this process and hence simultaneously determining both their own and their partner’s decision. If so then it is better for the agent that both of them lie than that both of them confess.
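A toy sketch of the "same computation" point in the prisoner's dilemma (this is only an illustration of the informal idea, not Yudkowsky's causal-network formalism; the jail terms are those mentioned in section 11.1.10, two years if both lie and ten if both confess): when both players run the same deterministic decision procedure, only the two symmetric outcomes are reachable, so choosing the procedure's output amounts to choosing between them.

```python
# Two agents running the *same* deterministic decision procedure in the
# prisoner's dilemma. Because both moves are outputs of one abstract
# computation, fixing that computation's output fixes both moves at once.

years_in_jail = {("lie", "lie"): 2, ("confess", "confess"): 10}  # terms from section 11.1.10

def outcome_if_shared_algorithm_outputs(move):
    """Both replicas emit the same move, so only the symmetric outcomes are reachable."""
    return years_in_jail[(move, move)]

for move in ("lie", "confess"):
    print(move, "->", outcome_if_shared_algorithm_outputs(move), "years each")
# lie -> 2 years each; confess -> 10 years each, so the shared computation should output "lie".
```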
However, despite its success, TDT also “loses” in some decision scenarios. For example, in counterfactual mugging, a TDT agent will not choose to give up the $100. This might seem surprising. After all, as with Newcomb’s problem, this case involves Omega predicting the agent’s behavior and hence involves a “replica”. However, this case differs in that the agent knows that the coin came up heads and so knows that they have nothing to gain by giving up the money.
For those who feel that a theory of rational choice should lead an agent to “win”, then, TDT seems like a step in the right direction but further work is required if it is to “win” in the full range of decision scenarios.
11.4. Decision theory and “winning”
In the previous section, I discussed TDT, a decision algorithm that could be advanced as a replacement for CDT and EDT. One of the primary motivations for developing TDT is a sense that both CDT and EDT fail to reason in a desirable manner in some decision scenarios. However, despite acknowledging that CDT agents end up worse off in Newcomb's Problem, many (and perhaps the majority of) decision theorists are proponents of CDT. On the face of it, this may seem to suggest that these decision theorists aren't interested in developing a decision algorithm that "wins" but rather have some other aim in mind. If so then this might lead us to question the value of developing one-boxing decision algorithms.
However, the claim that most decision theorists don't care about finding an algorithm that "wins" mischaracterizes their position. After all, proponents of CDT tend to take the challenge posed by the fact that CDT agents "lose" in Newcomb's problem seriously (in the philosophical literature, it's often referred to as the Why ain'cha rich? problem). A common reaction to this challenge is neatly summarized in Joyce (1999, pp. 153-154) as a response to a hypothetical question about why, if two-boxing is rational, the CDT agent does not end up as rich as an agent that one-boxes:
In other words, this response distinguishes between the winning decision and the winning type of agent and claims that two-boxing is the winning decision in Newcomb’s problem (even if one-boxers are the winning type of agent). Consequently, insofar as decision theory is about determining which decision is rational, on this account CDT reasons correctly in Newcomb’s problem.
For those that find this response perplexing, an analogy could be drawn to the chewing gum problem. In this scenario, there is near unanimous agreement that the rational decision is to chew gum. However, statistically, non-chewers will be better off than chewers. As such, the non-chewer could ask, “if you’re so smart, why aren’t you healthy?” In this case, the above response seems particularly appropriate. The chewers are less healthy not because of their decision but rather because they’re more likely to have an undesirable gene. Having good genes doesn’t make the non-chewer more rational but simply more lucky. The proponent of CDT simply makes a similar response to Newcomb’s problem: one-boxers aren’t richer because of their decision but rather because of the type of agent that they were when the boxes were filled.
One final point about this response is worth noting. A proponent of CDT can accept the above argument but still acknowledge that, if given the choice before the boxes are filled, they would be rational to choose to modify themselves to be a one-boxing type of agent (as Joyce acknowledged in the above passage and as argued for in Burgess, 2004). To the proponent of CDT, this is unproblematic: if we are sometimes rewarded not for the rationality of our decisions in the moment but for the type of agent we were at some past moment then it should be unsurprising that changing to a different type of agent might be beneficial.
Reactions to this defense of two-boxing in Newcomb's problem have been divided. Many find it compelling, but others, like Ahmed and Price (2012), think it does not adequately address the challenge:
Given this, there seem to be two positions one could take on these issues. If the response given by the proponent of CDT is compelling, then we should be attempting to develop a decision theory that two-boxes on Newcomb’s problem. Perhaps the best theory for this role is CDT but perhaps it is instead BT, which many people think reasons better in the psychopath button scenario. On the other hand, if the response given by the proponents of CDT is not compelling, then we should be developing a theory that one-boxes in Newcomb’s problem. In this case, TDT, or something like it, seems like the most promising theory currently on offer.