# Newcomb's Problem: A problem for Causal Decision Theories

8 [deleted] 16 August 2010 11:25AM

This is part of a sequence titled, "Introduction to decision theory"

The previous post is "An introduction to decision theory"

In the previous post I introduced evidential and causal decision theories. The principal question that needs resolving with regard to these is whether using them leads to rational decisions. The next two posts will show that both causal and evidential decision theories fail to do so, and will try to set the scene so that it’s clear why so much attention is given on Less Wrong to developing new decision theories.

## Newcomb’s Problem

Newcomb’s Problem asks us to imagine the following situation:

Omega, an unquestionably honest, all-knowing agent with perfect powers of prediction, appears, along with two boxes. Omega tells you that it has placed a certain sum of money into each of the boxes. It has already placed the money and will not now change the amount. You are then asked whether you want to take just the money that is in the left hand box or whether you want to take the money in both boxes.

However, here’s where it becomes complicated. Using its perfect powers of prediction, Omega predicted whether you would take just the left box (called “one boxing”) or whether you would take both boxes (called “two boxing”). Either way, Omega put \$1000 in the right hand box but filled the left hand box as follows:

If it predicted you would take only the left hand box, it put \$1 000 000 in the left hand box.

If it predicted you would take both boxes, it put \$0 in the left hand box.

Should you take just the left hand box or should you take both boxes?

One argument goes as follows: By the time you are asked to choose what to do, the money is already in the boxes. Whatever decision you make, it won’t change what’s in the boxes. So the boxes can be in one of two states:

1. Left box, \$0. Right box, \$1000.
2. Left box, \$1 000 000. Right box, \$1000.

Whichever state the boxes are in, you get more money if you take both boxes than if you take one. In game theoretic terms, the strategy of taking both boxes strictly dominates the strategy of taking only one box. You can never lose by choosing both boxes.

The only problem is, you do lose. If you take two boxes then they are in state 1 and you only get \$1000. If you only took the left box you would get \$1 000 000.

To many people, this may be enough to make it obvious that the rational decision is to take only the left box. If so, you might want to skip the next paragraph.

Taking only the left box didn’t seem rational to me for a long time. It seemed that the reasoning described above to justify taking both boxes was so powerful that the only rational decision was to take both boxes. I therefore saw Newcomb’s Problem as proof that it was sometimes beneficial to be irrational. I changed my mind when I realized that I’d been asking the wrong question. I had been asking which decision would give the best payoff at the time and saying it was rational to make that decision. Instead, I should have been asking which decision theory would lead to the greatest payoff. From that perspective, it is rational to use a decision theory that suggests you only take the left box because that is the decision theory that leads to the highest payoff. Taking only the left box leads to a higher payoff, and it’s also a rational decision if you ask, “What decision theory is it rational for me to use?” and then make your decision according to the theory that you have concluded it is rational to follow.

What follows will presume that a good decision theory should one box on Newcomb’s problem.

## Causal Decision Theory and Newcomb’s Problem

Remember that decision theory tells us to calculate the expected utility of an action by summing the utility of each possible outcome of that action multiplied by its probability. In Causal Decision Theory, this probability is defined causally (something that we haven’t formalised and won’t formalise in this introductory sequence but which we have at least some grasp of). So Causal Decision Theory will act as if the probability that the boxes are in state 1 or state 2 above is not influenced by the decision to one box or two box (so let’s say that the probability that the boxes are in state 1 is P and the probability that they’re in state 2 is Q, regardless of your decision).

So if you undertake the action of choosing only the left box your expected utility will be equal to: (0 x P) + (1 000 000 x Q) = 1 000 000 x Q

And if you choose both boxes, the expected utility will be equal to: (1000 x P) + (1 001 000 x Q).

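Whatever the values of P and Q, the second figure exceeds the first by 1000 x (P + Q) = 1000, since P + Q = 1. So Causal Decision Theory will lead to the decision to take both boxes and hence, if you accept that you should one box on Newcomb’s Problem, Causal Decision Theory is flawed.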
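For anyone who wants to check the arithmetic, here is a minimal Python sketch of the causal calculation. The function and constant names are my own illustrative choices, not part of any standard formulation, and the only modelling assumption is the one stated above: P does not depend on the action taken.

```python
# Payoffs as described in the post.
LEFT_FULL = 1_000_000   # left box if Omega predicted one boxing
RIGHT = 1_000           # right box, always filled


def cdt_expected_utilities(p_state_1):
    """Return (one_box, two_box) expected utilities under CDT.

    p_state_1 is P, the probability that the left box is empty.
    CDT holds P fixed whichever action is chosen (hypothetical helper).
    """
    q_state_2 = 1 - p_state_1
    one_box = 0 * p_state_1 + LEFT_FULL * q_state_2
    two_box = RIGHT * p_state_1 + (LEFT_FULL + RIGHT) * q_state_2
    return one_box, two_box


# Whatever P is, two boxing comes out ahead by the contents of the right box.
for p in (0.0, 0.25, 0.5, 1.0):
    one_box, two_box = cdt_expected_utilities(p)
    print(p, one_box, two_box, two_box - one_box)
```

The gap between the two columns is the same for every P tried, which is just the dominance argument from earlier restated in expected utility terms.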

## Evidential Decision Theory and Newcomb’s Problem

Evidential Decision Theory, on the other hand, will take your decision to one box as evidence that Omega put the boxes in state 2, to give an expected utility of (1 x 1 000 000) + (0 x 0) = 1 000 000.

It will similarly take your decision to take both boxes as evidence that Omega put the boxes into state 1, to give an expected utility of (0 x (1 000 000 + 1000)) + (1 x (0 + 1000)) = 1000

As such, Evidential Decision Theory will suggest that you one box and hence it passes the test posed by Newcomb’s Problem. We will look at a more challenging scenario for Evidential Decision Theory in the next post. For now, we’re part way along the route of realising that there’s still a need to look for a decision theory that makes the logical decision in a wide range of situations.
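The evidential calculation can be sketched the same way. This is only an illustration under the post's assumption of a perfect predictor; the dictionary layout and names such as `P_STATE_GIVEN_ACTION` are hypothetical choices of mine. The difference from the causal version is that the probability of each state is conditioned on the action.

```python
LEFT_FULL = 1_000_000
RIGHT = 1_000

# Payoff for each (action, state) pair; state_1 = left box empty.
PAYOFF = {
    ("one_box", "state_1"): 0,
    ("one_box", "state_2"): LEFT_FULL,
    ("two_box", "state_1"): RIGHT,
    ("two_box", "state_2"): LEFT_FULL + RIGHT,
}

# P(state | action) when Omega's prediction is never wrong.
P_STATE_GIVEN_ACTION = {
    "one_box": {"state_1": 0, "state_2": 1},
    "two_box": {"state_1": 1, "state_2": 0},
}


def edt_expected_utility(action):
    """Sum payoff x conditional probability over the two states."""
    return sum(
        prob * PAYOFF[(action, state)]
        for state, prob in P_STATE_GIVEN_ACTION[action].items()
    )


best = max(P_STATE_GIVEN_ACTION, key=edt_expected_utility)
print(edt_expected_utility("one_box"), edt_expected_utility("two_box"), best)
# prints: 1000000 1000 one_box
```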

## Appendix 1: Important notes

While the consensus on Less Wrong is that one boxing on Newcomb’s Problem is the rational decision, my understanding is that this opinion is not necessarily held uniformly amongst philosophers (see, for example, the Stanford Encyclopedia of Philosophy’s article on Causal Decision Theory). I’d welcome corrections on this if I’m wrong but otherwise it does seem important to acknowledge where the level of consensus differs on Less Wrong compared to the broader community.

For more details on this, see the results of the PhilPapers Survey where 61% of respondents who specialised in decision theory chose to two box and only 26% chose to one box (the rest were uncertain). Thanks to Unnamed for the link.

If Newcomb's Problem doesn't seem realistic enough to be worth considering then read the responses to this comment.

## Appendix 2: Existing posts on Newcomb's Problem

Newcomb's Problem has been widely discussed on Less Wrong, generally by people with more knowledge on the subject than me (this post is included as part of the sequence because I want to make sure no-one is left behind and because it is framed in a slightly different way). Good previous posts include:

A post by Eliezer introducing the problem and discussing the issue of whether one boxing is irrational.

A link to Marion Ledwig's detailed thesis on the issue.

An exploration of the links between Newcomb's Problem and the prisoner's dilemma.

A post about formalising Newcomb's Problem.

And a Less Wrong wiki article on the problem with further links.

Comment author: 16 August 2010 04:08:58PM *  5 points [-]

While the consensus on Less Wrong is that one boxing on Newcomb’s Problem is the rational decision, my understanding is that this opinion is not necessarily held uniformly amongst philosophers

That's correct. See, for instance, the PhilPapers Survey of 931 philosophy professors, which found that only 21% favored one boxing vs. 31% who favored two boxing; 43% said other (mostly undecided or insufficiently familiar with the issue). Among the 31 philosophers who specialize in decision theory, there was a big shift from other (down to 13%) to two boxing (up to 61%), and still only 26% favored one boxing.

Comment author: 17 August 2010 11:14:51PM 0 points [-]

I'm not sure I actually believe this survey. Sure, these people claim they'd two box in academic papers, and in surveys - that's easy enough to do - but would any of them actually be committed enough to two-boxing to turn down \$1 million if they ever found themselves in the actual set-up?

Comment author: 18 August 2010 04:16:33AM 1 point [-]

My feelings are the opposite. I'm committed to one-boxing (just in case Omega is scanning my brain right now), but I'm not at all sure that I'd stick to that commitment with a box of \$1000 sitting right there in front of me free for the taking. (Don't listen, Omega, move on, nothing to see here).

Comment author: 17 August 2010 05:32:07PM *  4 points [-]

An issue that often occurs to me when discussing these questions. I one-box, cooperate in one-shot PD's, and pay the guy who gave me a lift out of the desert. I've no idea what decision theory I'm using when I decide to do these things, but I still know that I'd do them. I'm pretty sure that's what most other people would do as well.

Does anyone actually know how Human Decision Theory works? I know there are examples of problems where HDT fails miserably and CDT comes out better, but is there a coherent explanation for how HDT manages to get all of these problems right? Has anyone attempted to develop a decision theory which successfully solves these sorts of problems by mimicking the way in which people successfully solve them?

Comment author: 17 August 2010 05:55:08PM 4 points [-]

I don't think most people one-box. Maybe most LW readers one-box.

Comment author: 17 August 2010 06:17:36PM *  6 points [-]

I don't think most people one-box. Maybe most LW readers one-box.

I have two real boxes, labelled with Newcomb's problem and using 1 and 4 quarters in place of the \$10k and \$1M. I have shown them to people at Less Wrong meetups, and also to various friends of mine, a total of about 20 people.

Almost everyone I've tried it on has one-boxed. Even though I left out the part in the description about being a really accurate predictor, and pre-seeded the boxes before I even knew who would be the one choosing. Maybe it would be different with \$10k instead of \$0.25. Maybe my friends are unusual and a different demographic would two-box. Maybe it's due to a quirk of how I present them. But unless someone presents contrary evidence, I have to conclude that most people are one-boxers.

Comment author: 17 August 2010 11:35:11PM *  9 points [-]

Almost everyone I've tried it on has one-boxed. Even though I left out the part in the description about being a really accurate predictor, and pre-seeded the boxes before I even knew who would be the one choosing.

What?!? You offer people two boxes with essentially random amounts of money in them, and they choose to take one of the boxes instead of both? And these people are otherwise completely sane?

Could you maybe give us details of how exactly you present the problem? I can't imagine any presentation that would make anyone even slightly tempted to one-box this variant. (Maybe if I knew I'd get to play again one day...)

Comment author: 18 August 2010 03:31:31PM 5 points [-]

That seems bizarre to me too. But if Jimrandomh is filling his boxes on the basis of what most people would do, and most people do one-box, then perhaps they are just behaving as rational, highly correlated, timeless decisionmakers.

Comment author: 18 August 2010 04:01:11PM 1 point [-]

A signalling explanation might explain this behavior: people would rather be seen as having gotten the problem correct, or signal non-greediness, than get an extra \$0.25. As evidence for this conclusion, some people turn down the \$1.00 in box one.

No one's given the real correct solution, which is "inspect the boxes more thoroughly". One of them has an extra label on the bottom, offering an extra \$1.00 for finding it if you haven't opened any boxes yet, which I've never had to pay out on. The moral is supposed to be that theory is hard to transfer into the real world and to question assumptions.

Comment author: 18 August 2010 04:17:02PM 4 points [-]

You let people inspect the boxes? Wouldn't they be distinguishable by weight?

Comment author: 22 August 2010 05:49:55PM 7 points [-]

Reminds me of a story, set in a lazy Mark Twain river town. Two friends walking down the street. First says to second, "See that kid? He is really stupid." Second asks, "Why do you say that?" First answers, "Watch". Approaches kid. Holds out nickel in one hand and dime in the other. Asks kid which he prefers. "I'll take the nickel. It's bigger". Man hands nickel to kid with smirk, and the two friends continue on.

Later the second man comes back and attempts to instruct the kid. "A dime is worth twice the value, that is it buys more candy", says he, "even though the nickel looks bigger." The kid gives the man a pitying look. "Ok, if you say so. But I've made seven nickels so far this month. How many dimes have you made?"

Which brings me to my real point - empirical research, I'm sure you have seen it, in which player 1 is asked to specify a split of \$10 between himself and player 2. Player 2 then chooses to accept or reject. If he rejects, neither player gets anything. As I recall, when greedy player 1 specifies more than about 70% for himself, player 2 frequently rejects even though he is costing himself money. This can only be understood in classical "rational agent" game theory by postulating that player 2 does not believe researcher claims that the game is a one-shot.

What is the point? Well, perhaps people who have read about Newcomb problems are assuming (like most people in the research) that, somehow or other, greed will be punished.

Comment author: 22 August 2010 07:34:06PM *  1 point [-]

empirical research, I'm sure you have seen it, in which player 1 is asked to specify a split of \$10 between himself and player 2.

Punishing unfair behavior even when it costs to do so is called altruistic punishment, and this particular experiment is called the Ultimatum Game.

Comment author: 23 August 2010 01:04:45AM *  -1 points [-]

Is it plausible that evolution would gradually push those 70% down to 30% or even lower, given enough time? There may not yet have been enough time for a strong enough group selection in evolution to create such an effect, but sooner or later it should happen, shouldn't it? I'm thinking a species with such a great degree of selflessness would be more likely to survive than the present humanity is, because a larger percentage of them would cooperate about existential risk reduction than is the case in present humanity. Yet, 10-30% is still not 0%, so even with 10% there would still be enough of selfishness to make sure they wouldn't end up refusing each other's gifts until they all starve to death or something.

Can group selection of genes for different psychological constitution in humans already explain why player 1 takes only 70% and not, say, at least 90%, on average, in the game you describe?

What do chimps do? Does a chimp player 1 take more or less than 70%?

Comment author: 23 August 2010 02:46:29AM 1 point [-]

First of all, from the standpoint of the good of the group, I see no reason why player 1 shouldn't keep 100% of the money. After all, it is not as if player 2 were starving, and surely the good of player 1 is just as important to the good of the group as is the good of player 2. There is almost no reason for sharing from a standpoint of either Bentham-style utilitarianism or good-of-the-group.

However, there is a reason for sharing when you realize that player 2 is quite reasonably selfish, and has the power to make your life miserable. So, go ahead and give the jerk what he asks for. It is certainly to your own selfish advantage to do so. As long as he doesn't get too greedy.

Comment author: 17 August 2010 06:41:16PM 7 points [-]

Weird. I two-box on that variant.

Comment author: 18 August 2010 04:20:14PM 2 points [-]

I'd like to see this done with a really good mentalist.

Comment author: 18 August 2010 04:11:39AM 1 point [-]

pre-seeded the boxes before I even knew who would be the one choosing

If I met someone in real life who was doing this trick (at least before I started spilling my opinions to the universe through my comments to this blog), I would strongly suspect that you were doing exactly this. And then I would definitely pick both boxes. (Well, first I'd try to figure out if you're likely to offer me any more games, and I'd pick two boxes if I was fairly confident that you would not.) And I would get all of the money, since you would have predicted that I would pick only one box (assuming that you really seed them based on your honest best prediction).

On the other hand, if the situation is not presented as a game (even when I still don't expect any iteration), I pretty consistently cooperate on all of the standard examples (prisoner's dilemma, etc). But since feeling like a moral and cooperative person (except when playing games, of course) has high utility for me, I'm not really playing prisoner's dilemma (etc) after all, so never mind.

Comment author: 22 August 2010 04:36:26PM *  0 points [-]

This is interesting. I suspect this is a selection effect, but if it is true that there is a heavy bias in favor of one boxing among a more representative sample in the actual Newcomb's problem, then a predictor that always predicts one boxing could be surprisingly accurate.

Comment author: 22 August 2010 06:13:15PM 0 points [-]

I read somewhere that about 70% of people one-box. You might be thinking of most philosophers or something like that.

Comment author: 17 August 2010 06:32:06PM 1 point [-]

is there a coherent explanation for how HDT manages to get all of these problems right?

Unilaterally cooperating in one-shot true PD's is not right. If the other player's decision is not correlated with yours, you should defect.

Comment author: 16 August 2010 04:46:45PM 2 points [-]

Thanks for a great post Adam, I'm looking forward to the rest of the series.

This might be missing the point, but I just can't get past it. How does a rational agent come to believe that the being they're facing is "an unquestionably honest, all knowing agent with perfect powers of prediction"?

I have the suspicion that a lot of the bizarreness of this problem comes out of transporting our agent into an epistemologically unattainable state.

Is there a way to phrase a problem of this type in a way that does not require such a state?

Comment author: 16 August 2010 05:21:04PM *  8 points [-]

It's not perfect, per se, but try this:

There's a fellow named James Omega who (with the funding of certain powerful philosophy departments) travels around the country offering random individuals the chance to participate in Newcomb's problem, with James as Omega. Rather than scanning your brain with his magic powers, he spends a day observing you in your daily life, and uses this info to make his decision. Here's the catch: he's done this 300 times, and never once mis-predicted. He's gone up against philosophers and lay-people, people that knew they were being observed and people that didn't, but it makes no difference: he just has an intuition that good. When it comes time to do the experiment, it's set up in such a way that you can be totally sure (and other very prestigious parties have verified) that the amounts in the box do not change after your decision.

So when you're selected, what do you do? Nothing quite supernatural is going on, we just have the James fellow with an amazing track record, and you with no particular reason to believe that you'll be his first failure. Even if he is just human, isn't it rational to assume the ridiculously likely thing (301/302 chance according to Laplace's Law) that he'll guess you correctly? Even if we adjust for the possibility of error, the payoff matrix is still so lopsided that it seems crazy to two-box.
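To make the 301/302 figure concrete, here is a rough Python sketch. It assumes Laplace's rule of succession in its usual form, (successes + 1) / (trials + 2), and then plugs the resulting accuracy into the standard payoffs; the helper name `rule_of_succession` is just an illustrative label.

```python
from fractions import Fraction

LEFT_FULL = 1_000_000
RIGHT = 1_000


def rule_of_succession(successes, trials):
    """Laplace's estimate of the chance that the next trial succeeds."""
    return Fraction(successes + 1, trials + 2)


# 300 correct predictions out of 300 attempts.
accuracy = rule_of_succession(300, 300)

# Expected payoffs if James is right with that probability.
one_box = accuracy * LEFT_FULL
two_box = accuracy * RIGHT + (1 - accuracy) * (LEFT_FULL + RIGHT)

print(accuracy)                         # 301/302
print(float(one_box), float(two_box))
```

Exact fractions are used so that the estimate comes out as 301/302 rather than a rounded decimal.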

See if that helps, and of course everyone else is free to offer improvements if I've missed something. You know, help get this Least Convenient Possible World going.

Comment author: 16 August 2010 06:06:48PM 3 points [-]

Now I want to read a series of stories starring James Omega in miscellaneous interesting situations. The kind of ability implied by accuracy at Newcomb's Dilemma would seem to imply capability in other situations as well. (If nothing else, he would kill at rock-paper-scissors.)

Comment author: 16 August 2010 08:18:08PM *  2 points [-]

How does a rational agent come to believe that the being they're facing is "an unquestionably honest, all knowing agent with perfect powers of prediction"?

Let's make things clearer by asking the meta-question: is the predictor's implementation, and the process by which we learn of it, relevant to the problem? Let's unpack "relevant": should the answer to Newcomb's Problem depend on these extraneous details about the predictor? And let's unpack "should": if decision theory A tells you to one-box in approximately-Newcomb-like scenarios without requiring further information, and decision theory B says the problem is "underspecified" and the answer is "unstable" and you can't pin it down without learning more about the real-world situation... which decision theory do you like more?

Comment author: 16 August 2010 08:23:39PM 0 points [-]

Decision theory A is by far preferable to me.

Of course, that's assuming that by Newcomb-like scenarios you only include those where one-boxing is actually statistically correlated with greater wealth once all other factors are canceled out.

If Decision Theory A's definition of Newcomb-like included a scenario where the person was doing well enough to make one-boxing appear to be the winning move, but was actually basing her decisions on hair-colour, then I would be more tempted by Decision Theory B.

IOW: whichever one wins for me :p

Comment author: [deleted] 16 August 2010 05:09:15PM *  2 points [-]

Newcomb's Problem still holds in much more realistic situations. So say someone who knows you really, really well comes up to you and makes the same offer. Imagine you don't mind taking their money and you reckon they know you well enough that they're 80% likely to be correct in their bet. One boxing is still the right decision because you have the following gain from one boxing:

(.8 x 1 000 000) + (.2 x 0) = 800 000

and for two boxing:

(.8 x 1000) + (.2 x 1 001 000) = 800 + 200 200 = 201 000

But Causal Decision Theory will still undertake the same reasoning because your decision still doesn't have a causal influence on whether the boxes are in state 1 or 2. So Causal Decision Theory will still two box.

So Newcomb's Problem still holds in more realistic situations.
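Here is a quick Python check of the two sums above, as a sketch only. It assumes, as in the example, that the friend is right 80% of the time whichever way you choose; `expected_payoffs` is a hypothetical helper name.

```python
from fractions import Fraction

LEFT_FULL = 1_000_000
RIGHT = 1_000


def expected_payoffs(accuracy):
    """Return (one_box, two_box) expected payoffs for a given accuracy."""
    miss = 1 - accuracy
    # One boxing: the left box is full only if the prediction was right.
    one_box = accuracy * LEFT_FULL + miss * 0
    # Two boxing: the left box is full only if the prediction was wrong.
    two_box = accuracy * RIGHT + miss * (LEFT_FULL + RIGHT)
    return one_box, two_box


one_box, two_box = expected_payoffs(Fraction(8, 10))
print(one_box, two_box)  # 800000 201000
```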

Is that the sort of thing you were looking for or have I missed the point?

Comment author: 16 August 2010 08:15:58PM 1 point [-]

Even if you don't believe such a situation can exist, you can make inferences for how you should act in such a case, based on how you should act in realistic cases.

Like AdamBell said, you can consider a more realistic scenario where someone simply has a good chance of guessing what you do.

Then take it a step further: write your decision theory as a function of how accurate the guesser is. Presumably, for the "high but not 100%" accuracy cases, you'll want to one-box. So, in order to have a decision theory that doesn't have some sort of discontinuity, you will have to set it so that it would imply that on a 100% guesser-accuracy case, you should one-box as well.
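A minimal Python sketch of "decision theory as a function of guesser accuracy" might look like this. It assumes the choice is made by comparing expected payoffs conditional on the guesser being right with the given probability, which is my reading of the comment above rather than a definitive formalisation; the names are illustrative.

```python
from fractions import Fraction

LEFT_FULL = 1_000_000
RIGHT = 1_000

# Accuracy at which the two expected payoffs are equal:
# a * LEFT_FULL = a * RIGHT + (1 - a) * (LEFT_FULL + RIGHT)
BREAK_EVEN = Fraction(LEFT_FULL + RIGHT, 2 * LEFT_FULL)


def preferred_action(accuracy):
    """Pick the action with the higher expected payoff (ties go to two_box)."""
    one_box = accuracy * LEFT_FULL
    two_box = accuracy * RIGHT + (1 - accuracy) * (LEFT_FULL + RIGHT)
    return "one_box" if one_box > two_box else "two_box"


print(BREAK_EVEN)  # 1001/2000
for accuracy in (Fraction(1, 2), Fraction(6, 10), Fraction(8, 10), Fraction(1)):
    print(accuracy, preferred_action(accuracy))
```

Under these assumptions the rule switches to one boxing just above 50% accuracy and stays there all the way up to 100%, so there is no discontinuity at the perfect-guesser end.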

In short, it's another case of Belief in the Implied Invisible, or implied optimal, as is the case here. While you may not be in a position to test claim X directly, it falls out as an implication of the best theories, which are directly testable.

(I should probably write an article justifying the importance of Newcomb's problem and why it has real implications for our lives -- there are many other ways it's important, such as in predicting the output of a process.)

Comment author: 16 August 2010 07:22:56PM 1 point [-]

If you want a way of phrasing this problem which involves the agent being in an attainable state, this may be of some small interest, Alexandros. A few years back I wrote an article discussing a situation with some similarities with the one in Newcomb's problem and with an attainable-state agent. While the article doesn't prove anything really profound in philosophy, it might give a useful context. It is here: http://www.paul-almond.com/GameTheoryWithYourself.htm.

Comment author: 16 August 2010 08:00:28PM 0 points [-]

I believe you used to post here as PaulUK, and joined in for this discussion of your website's articles.

Comment author: 16 August 2010 08:07:33PM 0 points [-]

SilasBarta, yes. I decided to change to this username as it is more obvious who I am. I generally use my real name in online discussions of this type: I have it on my website anyway. I don't envisage using the PaulUK name again.

Comment author: 16 August 2010 11:58:18PM 0 points [-]

Others have given good answers; here's another.

Comment author: 16 August 2010 07:35:23PM 0 points [-]

Is there a way to phrase a problem of this type in a way that does not require such a state?

There is, and it is useful to look at such phrasings to allay those suspicions. However once we have looked at the issue enough to separate practical implications of imperfect knowledge from the core problem the simple version becomes more useful. It turns out that the trickiest part becomes unavoidable once we clear out the distractions!

Comment author: 16 August 2010 07:56:44PM 0 points [-]

... it is useful to look at such phrasings [of Newcomb-type problems] to allay those suspicions.

And where, pray tell, might I look?

It turns out that the trickiest part becomes unavoidable once we clear out the distractions!

Asking folks to hypothetically accept the unbelievable does not, IMHO, "clear out distractions".

Comment author: 16 August 2010 08:10:55PM *  4 points [-]

And where, pray tell, might I look?

When I was getting my head around the subject I made them up myself. I considered what the problem would look like if I took out the 'absolute confidence' stuff. For example - forget Omega, replace him with Patrick Jane. Say Jane has played this game 1,000 times before with other people and only got it wrong (and/or lied) 7 times.

I assume you can at least consider TV show entertainment level counterfactuals for the purpose of solving these problems. Analysing the behavior of fictional characters in TV shows is a legitimate use for decision theory.

Asking folks to hypothetically accept the unbelievable does not, IMHO, "clear out distractions".

That would have made things difficult in high school science. Most example problems do exactly that. I distinctly remember considering planes and pulleys that were frictionless.* The only difference here is that the problem is harder (on our intuitions, if nothing else.)

* Did anyone else find it amusing when asked to consider frictionless ropes that were clearly fastened to the 200 kg weights with knots?

Comment author: 16 August 2010 11:45:11AM 2 points [-]

Please link to previous discussions of Newcomb's Problem on LW. They contain many valuable insights that new readers will otherwise have to regenerate (possibly poorly).

Comment author: [deleted] 16 August 2010 11:46:52AM 1 point [-]

Okay. Doing so now.

Comment author: 16 August 2010 11:53:33AM 1 point [-]

Could you fix the spelling of Newcomb while you're at it? Thanks!

Comment author: [deleted] 16 August 2010 11:56:28AM 1 point [-]

And done.

Comment author: 16 August 2010 04:14:30PM 1 point [-]

Does decision theory still matter in a world where there's an agent who's already predicted your choices? Once Omega exists, "decision" is the wrong word - it's really a discovery mechanism for your actions.

Comment author: 16 August 2010 04:22:56PM 7 points [-]

That's the normal meaning of "decision" anyway, unless you believe in acausal free will magic.

Comment author: 17 August 2010 11:13:46PM 2 points [-]

Eliezer has a post on an isomorphic topic:

Timeless Control

You might also like Gary Drescher's treatment of choice in Good and Real.

Comment author: 16 August 2010 04:32:34PM *  2 points [-]

Decisions you make now are informative (in the information-theoretic sense) about your past.

Decisions you make now are informative about the past.

Comment author: 16 August 2010 07:25:19PM 0 points [-]

I agree Dagon - and I actually specifically discussed this issue in the article I referenced in the comment I posted just before this one. Part of what I said was: "There may be one way that we could deal with this issue, and that would be to use different language to describe choices. Conventionally, if I have just picked up a glass we would say that I chose to pick it up. This whole idea of 'choosing' can cause us cognitive difficulties. Maybe it would be better to consider my 'choice' to pick up the glass as really 'finding out' that I was predisposed to pick it up." I also agree with what FAWS said - that this is implied by "decision" anyway - at least to anyone who thinks about it enough.

Comment author: 18 August 2010 04:24:44AM *  1 point [-]

A kind of funny way in which something like this might (just about) happen in reality occurs to me: Possible time delay in human awareness of decision making. Suppose when you make a conscious decision, your brain starts to become committed to that decision before you become aware of it, so if you suddenly decide to press a button then your brain was going through the process of committing you to pressing it before you actually knew you were going to press it. That would mean that every time you took a conscious decision to act, based on some rational grounds, you should really have been wanting to be the person who had been predisposed to act in that way a short time ago, when the neural machinery was pushing you towards that decision. I'm not saying this resolves any big issues, but maybe it can be amusingly uncomfortable for a few people - especially given some (admittedly controversial) experiments. In fact, with some brainwave monitoring equipment, a clever experiment design, and a very short experiment duration, you might even be able to set up something slightly resembling Newcomb's paradox!

Comment author: 18 August 2010 11:00:00PM *  2 points [-]

I have a description here of a practical demonstration of Newcomb's paradox that might just be possible, with current or near-future technology. It would rely, simply, on the brain being more predictable over a short span of time. I would be interested to see what people think about the feasibility.

A test subject sits at a desk. On the desk are two buttons. One button, "O", corresponds to opening one box. The other button, "B", corresponds to opening both boxes. There is a computer, with a display screen. The boxes are going to be computer simulated: a program in the computer has a variable for the amount of money in each box.

This is how an experimental run proceeds.

The subject sits at the desk for some random amount of time, during which nothing happens.

A "Decision Imminent" message appears on the computer screen. This is to warn the subject that his/her decision is about to be demanded.

A short time after (maybe a second or two, or a few seconds), the computer program decides how much money will go in each box, and it sets the variables accordingly, without showing the user. As soon as that is done, a "Select a box NOW" message appears on the computer screen. The subject now has a (very) limited amount of time to press either button "O" or "B" to select one or both boxes. The subject will have to press one of the buttons almost immediately before the offer is withdrawn.

The subject is then shown the amount of money that was in each box.

Now, here is the catch (and everyone here will have guessed it).

The subject is also wired up to brain monitoring equipment, which is connected to the computer. When the "Decision Imminent" message appeared, the computer started to examine the subject's brainwaves, to try to see the decision to press being formed. Just before the "Select a box NOW" message appeared, it used this information to load the simulated boxes according to the rules of the Newcomb's paradox game being discussed here.

I have no idea what level of accuracy could be achieved now, but it may be that some people could be made to have a worrying experience.
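To get a feel for how accurate the brainwave predictor would need to be, here is a minimal sketch of the experimental run as a simulation. The predictor is not modelled at all: it is assumed to guess the button press correctly with some fixed probability `accuracy`, which is a made-up parameter, and the payoffs are the \$1000 and \$1 000 000 from the post. The function names are hypothetical.

```python
import random

SMALL, LARGE = 1_000, 1_000_000  # box amounts taken from the post


def run_trial(choice, accuracy, rng):
    """Simulate one run and return the subject's payout.

    choice   -- "O" (one box) or "B" (both boxes): the button that will be pressed
    accuracy -- assumed probability that the predictor guesses the press correctly
    """
    correct = rng.random() < accuracy
    predicted = choice if correct else ("B" if choice == "O" else "O")
    # The boxes are loaded from the prediction alone, before "Select a box NOW".
    big_box = LARGE if predicted == "O" else 0
    return big_box if choice == "O" else big_box + SMALL


def average_payoff(choice, accuracy, trials=100_000, seed=0):
    """Average payout over many runs for a subject who always presses `choice`."""
    rng = random.Random(seed)
    return sum(run_trial(choice, accuracy, rng) for _ in range(trials)) / trials
```

Comparing `average_payoff("O", 0.7)` with `average_payoff("B", 0.7)` shows how a predictor that is far from perfect would already make the one-box button pay better on average, under these assumptions.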

Comment author: [deleted] 17 August 2010 12:27:54PM 1 point [-]

I'm considering continuing this sequence on an external blog. There have been some positive responses to these posts, but there are also a lot of people who plainly consider that the quality of the posts isn't up to scratch. Moving them to an external site would let people follow them if they wanted to but would stop me from bombarding LW with another five or six posts.

Opinions?

Comment author: 17 August 2010 12:45:05PM 0 points [-]

I don't know - I'm not sure if we want to end up with dozens and dozens of posts re-explaining things like Newcomb's problem. Decision Theory was already explained here by Eliezer, then by Anna Salamon ... maybe in a year some other new poster is going to read up on decision theory and decide to post a sequence about it on Less Wrong.

On the other hand, your Decision Theory posts aren't really low-quality by LW standards. They're just covering ground that has already been covered before. I would much prefer posts that quickly gloss over the familiar stuff (linking to the wiki or old sequences as needed), and quickly get to the new stuff.

Comment author: 17 August 2010 01:05:19PM *  5 points [-]

They're just covering ground that has already been covered before. I would much prefer posts that quickly gloss over the familiar stuff (linking to the wiki or old sequences as needed), and quickly get to the new stuff.

I would like to direct this comment to the attention of all the people who wondered why I was apologetic about posting elementary material.

Comment author: 23 August 2010 06:36:54PM 1 point [-]

I think this warrants being on Less Wrong. One of Eliezer's best pieces was his basic explanation of Bayes' Theorem, and there are plenty of people who're confused about Decision Theory. This post got 111 comments, and it's hard to see your doing worse than the recent SIAI-flamewar.

Comment author: 18 August 2010 08:49:36PM 1 point [-]

I think that you should finish this sequence on lesswrong.
It is less technical and easier to understand than other posts on Decision Theory, and would therefore be valuable for newcomers.

Comment author: 23 October 2010 10:58:34PM 0 points [-]

And a Less Wrong wiki article on the problem with further links.

At first, I Thought It Meant that you'd add more links, but that's a bad idea, and here's an article on why.

Comment author: 18 August 2010 03:57:46AM *  0 points [-]

all knowing agent with perfect powers of prediction

The existence of an all-knowing agent with perfect powers of prediction makes a mockery of the very idea of causality, at least as I understand it. (I won't go into details here, because it doesn't really matter, as you'll see.) Obviously causal decision theory doesn't work if causality doesn't make sense. However, since I assign negligible probability to the existence of such a being, I can still think that CDT is correct for practical purposes, while remembering that it can break down in extreme situations.

However, this doesn't really matter for your point, which is (in part) based on this principle:

I should have been asking which decision theory would lead to the greatest payoff.

So if we alter the story to make it compatible with causality (as Spurlock did), then the answer is still that CDT does not lead to the greatest payoff.

However (and now I'm finally getting to my point), this doesn't mean that CDT is incorrect! Although it is normally beneficial to know the truth, there are situations in which it is beneficial (and therefore rational, in a decision-theoretic sense) to believe falsehoods, and this may be one of them. (But the positivist in me wants to object that the correctness of CDT, as distinct from the usefulness of belief in it, is not a matter of observable fact and therefore meaningless.)

So I still want to say that I should pick two boxes. But now (now being after discussion of Eliezer's post on the subject) I add that I also should be the type of person who would pick one box, and furthermore this is more important (at least when Newcomb's Problem is the only relevant situation), even if being such a person would lead me to mistakenly pick one box in fact.

Comment author: 16 August 2010 03:28:00PM 0 points [-]

I realized that I’d been asking the wrong question. I had been asking which decision would give the best payoff at the time and saying it was rational to make that decision. Instead, I should have been asking which decision theory would lead to the greatest payoff.

I wonder if it is possible to go one more step: instead of asking which decision theory to use (to make decisions), we should ask which meta-decision theory we should use (to choose decision theories). In that case, maybe we would find ourselves using EDT for Newcomb-like problems (and winning), but a simpler decision theory for some other problems, where EDT is not required to win.

I don't know what a meta-decision theory would look like (I barely know what a decision theory looks like).

Comment author: 16 August 2010 05:26:24PM 2 points [-]

I think that this just gets rolled into your overall decision theory.

For instance, suppose we have two programs. We give all odd numbers to program 1 and it performs some action. We give all even numbers to program 2 and it performs some other action. On the surface, it looks like we've got 2 different programs and a meta level procedure for deciding which to use. But of course, it's trivial to code this whole system up into a single program that takes an integer and does the correct thing with it.

My point being that I think it's misleading to try and suggest two decision theories would be at work in your example. You've just got one big decision theory that does different stuff at different levels (which some decision theories already do anyway).
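The odd/even example can be written out as a small sketch. The handler bodies here are placeholders invented for illustration; the only point is that the "meta-level" dispatcher and the single merged program behave identically on every input.

```python
def program_1(n):
    """Placeholder action for odd numbers."""
    return f"odd handler got {n}"


def program_2(n):
    """Placeholder action for even numbers."""
    return f"even handler got {n}"


def meta_procedure(n):
    """Looks like a meta level: pick a program, then run it."""
    return program_1(n) if n % 2 else program_2(n)


def single_program(n):
    """The same behaviour written as one program with no separate meta level."""
    if n % 2:
        return f"odd handler got {n}"
    return f"even handler got {n}"
```

From the outside there is no test that distinguishes `meta_procedure` from `single_program`, which is the sense in which the meta-decision theory just gets rolled into the overall decision theory.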

Comment author: 16 August 2010 03:41:31PM *  1 point [-]

As many of us here secretly hope, the meta-decision theory must "reproduce itself" as the object-level decision theory. Just don't ask me what this means formally.

Comment author: 16 August 2010 04:15:01PM 0 points [-]

That makes sense. It implies that we wouldn't find ourselves using different object-level decision theories in different situations.

(But is it possible to construct a problem analogous to Newcomb's on which EDT loses? If so it seems we would need different object-level DTs after all.)

Comment author: 16 August 2010 05:24:27PM 0 points [-]
Comment author: 16 August 2010 06:08:09PM 0 points [-]

As I wrote elsewhere in this thread, see the Newcomb's variant with transparent boxes, or Parfit's Hitchhiker.

Comment author: 16 August 2010 02:11:16PM 0 points [-]

Causal Decision Theory isn't fatally flawed in this case, it's simply harder to properly apply.

A sufficiently advanced superintelligence could perfectly replicate you or me in a simulation. In fact, I can't currently conceive of a more reliable method of prediction.

Which is where the explanation comes in for Causal Decision Theory. You may be the simulation; if you are, then which box you take DOES affect what is in the boxes.

Comment author: 16 August 2010 02:23:11PM 4 points [-]

The prediction method doesn't have to be very good. A predictor that's only slightly better than chance is quite enough to put EDT and CDT into conflict. For example, I could achieve better than 50% accuracy on LW participants by just reading through their comment history and seeing what they think about Newcomb's Problem.

Comment author: 16 August 2010 02:38:46PM 0 points [-]

Indeed. A 55% accuracy is plenty to make this an issue. And at present, CDT seems to me to fail on the 55% accuracy problem; whereas EDT clearly works.
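A rough calculation shows why accuracy only slightly better than chance is enough. This sketch assumes the post's payoffs (\$1000 and \$1 000 000), linear utility in money, and a predictor that is right with the same probability `p` whichever way you choose; the function names are made up for illustration.

```python
from fractions import Fraction

SMALL, LARGE = 1_000, 1_000_000  # box amounts taken from the post


def edt_values(p):
    """Expected payoffs conditional on each choice, for predictor accuracy p."""
    one_box = p * LARGE                # left box is full only if predicted correctly
    two_box = (1 - p) * LARGE + SMALL  # left box is full only if predicted wrongly
    return one_box, two_box


def break_even_accuracy():
    """Accuracy at which both choices have equal conditional expected payoff."""
    # Solve p * LARGE == (1 - p) * LARGE + SMALL for p.
    return Fraction(LARGE + SMALL, 2 * LARGE)
```

Under these assumptions the break-even accuracy is 1001/2000, or 50.05%. At 55%, `edt_values(Fraction(55, 100))` gives 550,000 for one-boxing against 451,000 for two-boxing. CDT, holding the box contents fixed, still recommends two-boxing at any accuracy, because taking the right-hand box always adds \$1000.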

Comment author: 16 August 2010 02:42:16PM *  2 points [-]

It's easy to construct Newcomb-like problems where EDT fails. For example, we could make the two boxes transparent, so you already see their contents and your action gives you no further evidence. One-boxing is still the right decision because that's what you'd like to be predicted by Omega (alternatively: if you could modify your brain before meeting Omega, that's what you'd precommit to doing), but both EDT and CDT fail to see that. Another similar example is Parfit's Hitchhiker.

Comment author: 16 August 2010 02:49:56PM 0 points [-]

CDT still works in that case if you're dealing with Omega, and have no reason to believe Omega won't simulate you. If you are one of the simulations, you decide the prediction for the real version.

Comment author: 16 August 2010 02:52:16PM *  0 points [-]

How about if you're dealing with me?

Comment author: 16 August 2010 02:56:41PM 1 point [-]

Then CDT seems to fail, with it being a low-% case (perhaps 55% as I used above), and EDT fails due to the prize already being in evidence.

Comment author: [deleted] 16 August 2010 02:26:36PM 3 points [-]

We could do a modified Newcomb's Problem where the perfectly honest, all knowing Omega tells you that you're not the simulation but the actual person and the simulation has already been done, which seems to rule out the possibility discussed above. I don't think you need to, though, because there's no statement in Newcomb's Problem that says that the predictions do occur via a simulation.

It reminds me of the trolley cart example in ethics where you're told a train is rolling out of control down a hill and will run over 3 people. By hitting a switch you can change the track it goes down and it will instead hit 1 different person. Should you hit the switch?

The specific question isn't relevant to what I'm trying to say but people's responses are.

People will say things like, "Well, I'd just yell at the three people to get off the tracks."

And then you have to specify that they're too far away.

And the person will say, "Well, I'll run toward them yelling so I get close enough in time."

And you have to specify that they're too far away for that as well.

The point is that the people that ask this question are missing the whole idea of the abstraction behind the trolley problem and they're thinking of it as a lateral thinking test rather than a scenario used to make an intellectual point.

I feel that finding a way for CDT to answer Newcomb's Problem via the specifics of the way Omega predicts your reactions is a similar response - trying to respecify the argument in such a way that an answer can be found rather than looking at the abstracted conception of the argument.

As always, I'm open to being shown that I'm wrong and missing something though.

Comment author: 18 August 2010 04:40:48AM *  1 point [-]

the perfectly honest, all knowing Omega tells you that you're not the simulation but the actual person and the simulation has already been done

Then the prediction has been based on a simulation that took place under different circumstances, since Omega (being perfectly honest) did not say this to the simulation.

But as others have said, this is beside the point. After reading all of these irrelevant objections and the irrelevant responses to them, I'm convinced that (at least when addressing people who understand decision theory up to the point of doing calculations with statistics) it's better to phrase the question so that Omega is simply a clever human being who has achieved very high accuracy with very high correlation on a very large number of previous trials, instead of bringing perfection into it.

I'm thinking something like this:

• 30 cases where Omega predicts one-boxing but two-boxing takes place,
• 70 cases where Omega predicts two-boxing but one-boxing takes place,
• 270 cases where Omega predicts two-boxing and two-boxing takes place,
• 630 cases where Omega predicts one-boxing and one-boxing takes place.

Also, make the amounts \$1 and \$1000 so that utility will be very close to linear in amount of money (at least to middle-class First-Worlders like me).
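The track record above can be tabulated directly. This sketch assumes that \$1 takes the place of the \$1000 in the right hand box and \$1000 takes the place of the \$1 000 000 in the left hand box; the variable and function names are illustrative only.

```python
# (Omega's prediction, what actually happened) -> number of past cases
counts = {
    ("one", "two"): 30,
    ("two", "one"): 70,
    ("two", "two"): 270,
    ("one", "one"): 630,
}

SMALL, LARGE = 1, 1000  # assumed mapping of the reduced amounts onto the boxes


def payoff(predicted, actual):
    """Money received given the prediction and the actual choice."""
    big_box = LARGE if predicted == "one" else 0
    return big_box if actual == "one" else big_box + SMALL


def mean_payoff_given(actual):
    """Average payoff over the past cases in which `actual` was chosen."""
    rows = {key: n for key, n in counts.items() if key[1] == actual}
    total = sum(rows.values())
    return sum(payoff(p, a) * n for (p, a), n in rows.items()) / total


accuracy = (counts[("one", "one")] + counts[("two", "two")]) / sum(counts.values())
```

On these numbers Omega was right in 900 of 1000 cases, and it was right 90% of the time among one-boxers and among two-boxers alike. Past one-boxers averaged \$900 and past two-boxers averaged \$101.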

Comment author: 16 August 2010 06:32:36PM 1 point [-]

Would you say the trolley car problem implies that the fat man has a strong obligation to throw himself under the train?

Comment author: 16 August 2010 06:36:39PM *  2 points [-]

I'm not AdamBell, but I think that doesn't follow. The fat man could value his own life higher than the lives of three strangers. But we have no reason to value his life higher too.

Comment author: 16 August 2010 02:34:30PM *  0 points [-]

We could do a modified Newcomb's Problem where the perfectly honest, all knowing Omega tells you that you're not the simulation but the actual person and the simulation has already been done which seems to resolve that possibility discussed above.

An All-knowing Omega by definition contains a simulation of this exact scenario. And in that simulation they aren't being perfectly honest, but I still believe they are.

If Omega is in fact all-knowing, all possible scenarios exist in simulation within its infinite knowledge.

This is why throwing all-knowing entities into problems always buggers things up.

I feel that finding a way for CDT to answer Newcomb's Problem via the specifics of the way Omega predicts your reactions is a similar response - trying to respecify the argument in such a way that an answer can be found rather than looking at the abstracted conception of the argument.

Given the abstracted conception, prediction through simulation seems to be the most probable explanation. This results in CDT working.

It's not starting from wanting CDT to work, it's starting from examining the problem, working out the situation from the evidence, and then working out what CDT would say to do.

If I can't apply reason when using CDT, CDT will fail when I'm presented with an "opportunity" to buy a magic rock that costs £10,000, and will make me win the lottery within a month.

Comment author: 16 August 2010 02:46:04PM 3 points [-]

Sigh.

You are missing the point.

Replace Omega with a genius psychologist who only gets it right 99% of the time, and CDT will have you walk off with \$1000 while correct thinking leaves you with \$1,000,000 almost all of the time. It's just that in that scenario people will uselessly argue that the 1% chance to get lucky somehow makes it rational.

Comment author: 16 August 2010 02:52:18PM *  0 points [-]

How is the genius psychologist likely to be predicting your actions?

To me, it seems probable that he's simulating you, imperfectly, within his own mind.

How would you explain his methodology?

EDIT: to clarify my reasoning, I simulate people, myself included, often. Generally when I want to predict their actions. I'm not very good at it. Were I a genius psychologist, and hence obviously great at simulating people, I don't see why I would be any less likely to simulate people.

Comment author: 16 August 2010 03:07:04PM *  1 point [-]

She doesn't tell you in the scenario.

Maybe she had her grad students talk with you on various subjects and subject you to various stealth psychological experiments over the last 10 years and watched it all on video, all based on your signing an agreement to take part in a psychological experiment that didn't specify a duration 15 years ago that was followed by a dummy experiment and that you promptly forgot about.

Maybe she is secretly your mother.

Maybe she is just that good and can tell by the way you shook her hand.

In any case 99% shouldn't require imagining the actions of a reflectively indistinguishable from you copy of you.

Comment author: 16 August 2010 03:09:48PM *  -2 points [-]

Those are all ways of her having gathered the evidence.

From the evidence, how has she reached the conclusion?

The most plausible scenario for getting from evidence to conclusion is mental simulation as far as I can tell.

You haven't even proposed a single alternative yet.

EDIT: (did you edit this in, or did I miss it?)

In any case 99% shouldn't require imagining the actions of a reflectively indistinguishable from you copy of you.

You expect the copy to be able to tell it's a copy? Why? Why would the psychologist simulate it discovering that it is the copy? When you simulate someone's reaction to possible courses of action, do you simulate them as being aware of being a simulation?

None of my internal simulations have ever been aware of being simulations.

Comment author: 16 August 2010 03:41:02PM *  1 point [-]

In any case 99% shouldn't require imagining the actions of a reflectively indistinguishable from you copy of you.

You expect the copy to be able to tell it's a copy? Why? Why would the psychologist simulate it discovering that it is the copy? When you simulate someone's reaction to possible courses of action, do you simulate them as being aware of being a simulation?

None of my internal simulations have ever been aware of being simulations.

There are four possibilities:

1. The copy never wonders whether it's a copy.
2. The copy wonders about being a copy and concludes that it is.
3. The copy concludes that it cannot be a copy.
4. The copy is, from its point of view, reflectively indistinguishable from you.

Only in case 4. will you seriously have to wonder whether you are a copy. In case 1. you will know that you are not as soon as you consider the possibility, case 2. is irrelevant unless you also assume that the real you will also conclude that it's a copy, which is logically inconsistent.

Nevertheless case 1. should be sufficient for predicting the actions you take once you conclude that you are not a copy to a reasonable accuracy.

Comment author: 16 August 2010 03:56:57PM 0 points [-]

Case 1 is sufficient to predict my actions IFF I would never wonder about whether I was a copy.

Given that I would in fact wonder whether I was a copy, and that that thought-process is significant to the scenario, Case 1 seems likely to be woefully inadequate for simulating me.

Case 4 is therefore much more plausible for a genius psychologist (with 99% accuracy) from my PoV.

Comment author: 16 August 2010 04:08:11PM *  0 points [-]

The psychologist tells you that she simply isn't capable of case 4 (there are all sorts of at least somewhat verifiable facts that you would expect yourself to know and that she doesn't [e.g. details about your job that have to make sense and be consistent with a whole web of other details, that she couldn't plausibly have spied out or invented a convincing equivalent thereof herself]). Given that you just wondered, you can't be a simulation. What do you do?

Comment author: 16 August 2010 03:26:24PM *  1 point [-]

The stealth psychological experiments could have included an isomorphic problem, or she could be using a more sophisticated version of:

• New ager: one box
• Thinks time travel conflicts with free will: two box
• Uses EDT: one box
• TDT/UDT: one box
• bog standard CDT: two box
• CDT, but takes simulation hypothesis seriously: one box if thinking it possible that in a simulation, two box otherwise.

Stealth psychological experiments you forgot about allowed her to determine necessary and/or sufficient conditions for you assuming that you might be in a simulation that you yourself are unaware of, and she set the whole thing up in such a way that she can tell with high confidence whether you do.

Comment author: 16 August 2010 04:01:28PM 0 points [-]

The categorisation possibility is reasonable. Personally I would estimate the probability of 99% accuracy achieved through categorisation lower than the probability of 99% accuracy achieved through mental simulation, but it's certainly a competitive hypothesis.

Comment author: 16 August 2010 04:33:44PM 0 points [-]

Assuming she tells you that she predicted your actions through some unspecified mechanism other than imagining your thought process in sufficient detail for the imagined version to ask itself whether it just exists in her imagination, what do you do?

Comment author: 16 August 2010 08:05:30PM 1 point [-]

An All-knowing Omega by definition contains a simulation of this exact scenario.

No, he doesn't (necessarily). He could prove the inevitable outcome based on elements of the known state of your brain without ever simulating anything. If you read reduction of could you will find a somewhat similar distinction that may make things clearer.

And in that simulation they aren't being perfectly honest, but I still believe they are.

... So we can't conclude this.

If I can't apply reason when using CDT, CDT will fail when I'm presented with an "opportunity" to buy a magic rock that costs £10,000, and will make me win the lottery within a month.

This suggests you don't really understand the problem (or perhaps CDT). That is not the same kind of reasoning.

Comment author: 16 August 2010 08:07:47PM *  1 point [-]

No, he doesn't (necessarily). He could prove the inevitable outcome based on elements of the known state of your brain without ever simulating anything. If you read reduction of could you will find a somewhat similar distinction that may make things clearer.

Does he not know the answer to "what will happen after this" with regards to every point in the scenario?

If he doesn't, is he all-knowing?

If he does know the answer at every point, in what way doesn't he contain the entire scenario?

EDIT: A non-all-knowing superintelligence could presumably find ways other than simulation of getting my answer, as I said simulation just strikes me as the most probable. If you think I should update my probability estimate of the other methods, that's a perfectly reasonable objection to my logic re: a non-all-knowing superint.

Comment author: 16 August 2010 08:33:03PM 0 points [-]

EDIT: A non-all-knowing superintelligence could presumably find ways other than simulation of getting my answer, as I said simulation just strikes me as the most probable.

Certainly. That is what I consider Omega doing when I think about these problems. It is a useful intuition pump, something we can get our head around.

Comment author: 16 August 2010 12:40:42PM *  0 points [-]

Remember that decision theoryTheory tells us to calculate the expected utility

Typo here.

I would also like to see subheadings for "causal says" and "evidential says", probably changing "Decision theory and Newcomb’s problem" just to make it neat. That would make the flow of the text readable at a glance.

Since you are making posts that would be intended to be linked to it is worth spending extra time getting the details right.

Comment author: [deleted] 16 August 2010 12:44:29PM 1 point [-]

I'm wearing out the d-o-n and e keys on my keyboard. Thanks for the comments. Doing another proofread now in light of the number of errors so hopefully that counts as "spending extra time".

Comment author: 16 August 2010 12:53:59PM 0 points [-]

I appreciate your work. I love having posts to link to - saves a lot of time in the long run.

Comment author: [deleted] 16 August 2010 01:30:13PM 0 points [-]

Thanks.

And I will keep it in mind with future posts that if I'm writing something to be linked to, it's worth making the outline clear and making as few mistakes as possible.

Comment author: 16 August 2010 11:43:25AM 0 points [-]

Your link in the appendix goes to the wrong place. Presumably you meant this: http://plato.stanford.edu/entries/decision-causal/

Comment author: [deleted] 16 August 2010 11:47:36AM 1 point [-]

Indeed I do. Can't explain what happened in my brain there. Fixing it now.

Comment author: 16 August 2010 02:23:16PM 0 points [-]

Newcomb's problem proves EDT only by cheating.

Before it presents you with the problem, Omega tests whether you subscribe to CDT or EDT, and puts the million in the box iff you subscribe to EDT. So you'll get more if you subscribe to EDT. So you'll be better off applying heuristics that you're arbitrarily rewarded for, but this doesn't say anything about normal situations (like kissing the sick baby).

Comment author: 16 August 2010 02:25:40PM *  4 points [-]

The standard reply to your objection is that Newcomb's Problem doesn't actually care about the "ritual of cognition" that you happen to use. It only cares about your answer. You could one-box because you worship Cthulhu, instead of EDT, and still win. For example, I don't subscribe to EDT, but still one-box because I find UDT's solution convincing :-)