This is part of a sequence titled, "Introduction to decision theory"
The previous post is "An introduction to decision theory"
In the previous post I introduced evidential and causal decision theories. The principal question that needs resolving with regard to these is whether using them leads to rational decisions. The next two posts will show that both causal and evidential decision theories fail to do so and will try to set the scene so that it’s clear why there’s so much focus on Less Wrong on developing new decision theories.
Newcomb’s Problem
Newcomb’s Problem asks us to imagine the following situation:
Omega, an unquestionably honest, all knowing agent with perfect powers of prediction, appears, along with two boxes. Omega tells you that it has placed a certain sum of money into each of the boxes. It has already placed the money and will not now change the amount. You are then asked whether you want to take just the money that is in the left hand box or whether you want to take the money in both boxes.
However, here’s where it becomes complicated. Using its perfect powers of prediction, Omega predicted whether you would take just the left box (called “one boxing”) or whether you would take both boxes (called “two boxing”). Either way, Omega put $1000 in the right hand box but filled the left hand box as follows:
If it predicted you would take only the left hand box, it put $1 000 000 in the left hand box.
If it predicted you would take both boxes, it put $0 in the left hand box.
Should you take just the left hand box or should you take both boxes?
An answer to Newcomb’s Problem
One argument goes as follows: By the time you are asked to choose what to do, the money is already in the boxes. Whatever decision you make, it won’t change what’s in the boxes. So the boxes can be in one of two states:
- Left box, $0. Right box, $1000.
- Left box, $1 000 000. Right box, $1000.
Whichever state the boxes are in, you get more money if you take both boxes than if you take one. In game theoretic terms, the strategy of taking both boxes strictly dominates the strategy of taking only one box. You can never lose by choosing both boxes.
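The dominance argument can be checked mechanically. The following is only a minimal sketch: the names `LEFT_BOX`, `RIGHT_BOX`, `payoff` and `strictly_dominates` are made up for illustration, and the payoffs are simply the two states listed above.

```python
# Contents of the left box in each of the two possible states (in dollars).
LEFT_BOX = {"state 1": 0, "state 2": 1_000_000}
# The right box holds $1000 in both states.
RIGHT_BOX = 1_000


def payoff(state, action):
    """Money received for an action, given what is already in the boxes."""
    left = LEFT_BOX[state]
    return left if action == "one box" else left + RIGHT_BOX


def strictly_dominates(a, b):
    """True if action a pays strictly more than action b in every state."""
    return all(payoff(state, a) > payoff(state, b) for state in LEFT_BOX)


print(strictly_dominates("two box", "one box"))
```

Note that this check holds the state fixed while comparing the two actions, which is exactly the assumption the argument relies on.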
The only problem is, you do lose. If you take two boxes then they are in state 1 and you only get $1000. If you only took the left box you would get $1 000 000.
To many people, this may be enough to make it obvious that the rational decision is to take only the left box. If so, you might want to skip the next paragraph.
Taking only the left box didn’t seem rational to me for a long time. It seemed that the reasoning described above to justify taking both boxes was so powerful that the only rational decision was to take both boxes. I therefore saw Newcomb’s Problem as proof that it was sometimes beneficial to be irrational. I changed my mind when I realized that I’d been asking the wrong question. I had been asking which decision would give the best payoff at the time and saying it was rational to make that decision. Instead, I should have been asking which decision theory would lead to the greatest payoff. From that perspective, it is rational to use a decision theory that suggests you only take the left box because that is the decision theory that leads to the highest payoff. Taking only the left box leads to a higher payoff and it’s also a rational decision if you ask, “What decision theory is it rational for me to use?” and then make your decision according to the theory that you have concluded it is rational to follow.
What follows will presume that a good decision theory should one box on Newcomb’s problem.
Causal Decision Theory and Newcomb’s Problem
Remember that decision theory tells us to calculate the expected utility of an action by summing the utility of each possible outcome of that action multiplied by its probability. In Causal Decision Theory, this probability is defined causally (something that we haven’t formalised and won’t formalise in this introductory sequence but which we have at least some grasp of). So Causal Decision Theory will act as if the probability that the boxes are in state 1 or state 2 above is not influenced by the decision to one box or two box (so let’s say that the probability that the boxes are in state 1 is P and the probability that they’re in state 2 is Q regardless of your decision).
So if you undertake the action of choosing only the left box your expected utility will be equal to: (0 x P) + (1 000 000 x Q) = 1 000 000 x Q
And if you choose both boxes, the expected utility will be equal to: (1000 x P) + (1 001 000 x Q).
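Since the boxes must be in one of the two states, P + Q = 1, and the second expression exceeds the first by (1000 x P) + (1000 x Q) = 1000 whatever the values of P and Q. So Causal Decision Theory will lead to the decision to take both boxes and hence, if you accept that you should one box on Newcomb’s Problem, Causal Decision Theory is flawed.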
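As a rough illustration of the causal calculation, here is a small sketch. The function name `cdt_expected_utilities` is hypothetical, and it assumes that the two states are the only possibilities, so that Q = 1 - P. Exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction


def cdt_expected_utilities(p):
    """Expected utility of each action when the state probabilities
    do not depend on the action chosen.

    p is the probability of state 1 (left box empty); the probability
    of state 2 (left box full) is assumed to be q = 1 - p.
    """
    q = 1 - p
    one_box = 0 * p + 1_000_000 * q
    two_box = 1_000 * p + 1_001_000 * q
    return one_box, two_box


# Whatever value P takes, two boxing comes out exactly 1000 ahead.
for p in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    one_box, two_box = cdt_expected_utilities(p)
    print(p, one_box, two_box, two_box - one_box)
```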
Evidential Decision Theory and Newcomb’s Problem
Evidential Decision Theory, on the other hand, will take your decision to one box as evidence that Omega put the boxes in state 2, to give an expected utility of (1 x 1 000 000) + (0 x 0) = 1 000 000.
It will similarly take your decision to take both boxes as evidence that Omega put the boxes into state 1, to give an expected utility of (0 x (1 000 000 + 1000)) + (1 x (0 + 1000)) = 1000
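The two evidential calculations can be sketched in the same way. This is only an illustration: the table `P_STATE_GIVEN_ACTION` and the function `edt_expected_utility` are made-up names, and the probabilities of 0 and 1 reflect the assumption that Omega's prediction is perfect.

```python
# Probability of each state conditional on the action taken,
# assuming Omega predicts perfectly.
P_STATE_GIVEN_ACTION = {
    "one box": {"state 1": 0, "state 2": 1},
    "two box": {"state 1": 1, "state 2": 0},
}

# Money received for each (action, state) pair, in dollars.
PAYOFF = {
    ("one box", "state 1"): 0,
    ("one box", "state 2"): 1_000_000,
    ("two box", "state 1"): 0 + 1_000,
    ("two box", "state 2"): 1_000_000 + 1_000,
}


def edt_expected_utility(action):
    """Sum payoff times the probability of each state given the action."""
    return sum(
        prob * PAYOFF[(action, state)]
        for state, prob in P_STATE_GIVEN_ACTION[action].items()
    )


print(edt_expected_utility("one box"), edt_expected_utility("two box"))
```

The only difference from the causal version is that the state probabilities are looked up per action rather than held fixed across actions.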
As such, Evidential Decision Theory will suggest that you one box and hence it passes the test posed by Newcomb’s Problem. We will look at a more challenging scenario for Evidential Decision Theory in the next post. For now, we’re part way along the route of realising that there’s still a need to look for a decision theory that makes the rational decision in a wide range of situations.
Appendix 1: Important notes
While the consensus on Less Wrong is that one boxing on Newcomb’s Problem is the rational decision, my understanding is that this opinion is not necessarily held uniformly amongst philosophers (see, for example, the Stanford Encyclopedia of Philosophy’s article on Causal Decision Theory). I’d welcome corrections on this if I’m wrong but otherwise it does seem important to acknowledge where the level of consensus differs on Less Wrong compared to the broader community.
For more details on this, see the results of the PhilPapers Survey where 61% of respondents who specialised in decision theory chose to two box and only 26% chose to one box (the rest were uncertain). Thanks to Unnamed for the link.
If Newcomb's Problem doesn't seem realistic enough to be worth considering then read the responses to this comment.
Appendix 2: Existing posts on Newcomb's Problem
Newcomb's Problem has been widely discussed on Less Wrong, generally by people with more knowledge on the subject than me (this post is included as part of the sequence because I want to make sure no-one is left behind and because it is framed in a slightly different way). Good previous posts include:
A post by Eliezer introducing the problem and discussing the issue of whether one boxing is irrational.
A link to Marion Ledwig's detailed thesis on the issue.
An exploration of the links between Newcomb's Problem and the prisoner's dilemma.
A post about formalising Newcomb's Problem.
And a Less Wrong wiki article on the problem with further links.
Hereinafter, "to Know x" means "to be objectively right about x, and to be subjectively 100 percent certain of x, and to have let the former 'completely scientifically cause' the latter (i.e. to have used the former to create the latter in a completely scientific manner), such that it cannot, even theoretically, be the case that something other than the former coincidentally and crucially misleadingly caused the latter - and to Know that all these criteria are met".
Anything that I merely know ("know" being defined as people usually seem to implicitly define it in their use of it), as opposed to Know, may turn out to be wrong (for all that I know). It seems that the more our scientists know, the more they realize that they don't know. Perhaps this "rule" holds forever, for every advancing civilisation (with negligible exceptions)? I think there could not even theoretically be any Knowing in the (or any) world. I conjecture that, much like it's universally theoretically impossible to find a unique integer for every unique real, it's universally theoretically impossible for any being to Know anything at all, such as for example what box(es) a human being will take.
Nick Bostrom's Simulation Argument seems to show that any conceivable being that could theoretically exist might very well (for all he (that being) knows) be living in a computer simulation controlled by a mightier being than himself. This universal uncertainty means that no being could Know that he has perfect powers of prediction over anything whatsoever. Making a "correct prediction" partly due to luck isn't having perfect powers of prediction, and a being who doesn't Know what he is doing cannot predict anything correctly without at least some luck (because without luck, Murphy's law holds). This means that no being could have perfect powers of prediction.
Now let "Omeg" be defined as the closest (in terms of knowledge of the world) to an all Knowing being (Omega) that could theoretically exist. Let A be defined as the part(s) of an Omeg that are fully known by the Omeg itself, and let B be defined as whatever else there may be in an Omeg. I suggest that in no Omeg of at least the size of the Milky Way can the B part be too small to secretly contain mechanisms that could be stealthily keeping the Omeg arbitrarily ignorant by having it falsely perceive arbitrarily much of its own wildest thought experiments (or whatever other unready thoughts it sometimes produces) to be knowledge (or even Knowledge). I therefore suggest that B, in any Omeg, could be keeping its Omeg under the impression that the A part is sufficient for correct prediction of, say, my choice of boxes, while in reality it isn't. Conclusion: no theoretically possible being could perfectly predict any other being's choice of boxes.
You may doubt it, but you can't exclude the possibility. This means you also can't exclude the possibility that whatever implications Newcomb's problem seems to produce that wouldn't occur to people if Omega were replaced by, say, a human psychologist, are implications that occur to people only because the assumption, that there could be such a thing as a perfect predictor of something/anything, is an assumption too unreasonable to be worthy of acceptance, as its crucial underpinnings don't make sense (like it doesn't make sense to assume that there is an integer for every real) - and as it can, because of this, be expected to produce arbitrarily misleading conclusions (about decision theory in this case) - much like many seemingly reasonable but heavily biased extreme thought experiments designed to smear utilitarianism scare even very skilled thinkers into drawing false conclusions about utilitarianism.
Or suppose someone goes to space, experiences weightlessness, thinks: "hey, why doesn't my spaceship seem to exert any gravity on me?" and draws the conclusion: "it's not gravity that keeps people down on Earth; it's just that the Earth sucks". Just as that conclusion would be flawed, so is the conclusion that Newcomb's problem shows that we should replace Causal Decision Theory with Evidential Decision Theory.
So, to be as faithful to the original Newcomb thought-experiment as is possible within reason, I'd interpret it in the way that just barely rids its premises of theoretical impossibility: I'd take Omega to mean Omeg, as defined above. An Omeg is fallible, but probably most of the time better than me at predicting my behavior, so I should definitely one-box, for the same reason that I should one-box if the predictor were a mere human being who just knew me very well. To risk a million dollars just to possibly get another 1000 dollars just isn't worth it. Causal Decision Theory leads me to this conclusion just fine.
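One way to put rough numbers on "better than me at predicting my behavior" is sketched below. This is not a calculation from the text above. It assumes the predictor is right with the same probability `accuracy` whichever choice is made, and that the chance of the left box being full is conditioned on the choice. The names `conditional_expected_payoffs` and `BREAK_EVEN` are made up for illustration.

```python
from fractions import Fraction


def conditional_expected_payoffs(accuracy):
    """Expected payoff of each choice, given a predictor that is right
    with probability `accuracy` regardless of what is chosen."""
    # One boxing: the left box is full only if the prediction was right.
    one_box = accuracy * 1_000_000
    # Two boxing: a right prediction leaves only the $1000;
    # a wrong prediction leaves both sums.
    two_box = accuracy * 1_000 + (1 - accuracy) * 1_001_000
    return one_box, two_box


# Accuracy at which the two choices pay the same under these assumptions:
# accuracy * 2 000 000 = 1 001 000.
BREAK_EVEN = Fraction(1_001_000, 2_000_000)

print(float(BREAK_EVEN))
print(conditional_expected_payoffs(Fraction(9, 10)))
```

Under these assumptions, one boxing comes out ahead once the predictor is right only slightly more often than a coin flip.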
*) You might think B would be "the real" (or "another, smarter") Omeg, by controlling A. But neither B nor A can rationally completely exclude the possibility that the other one of them is in secret control of both of them. So no one of them can have "perfect powers of prediction" over any being whatsoever.
I know nothing! Nothing!