Either box B is already full or already empty.
I'm not going to go into the whole literature, but the dominant consensus in modern decision theory is that one should two-box, and Omega is just rewarding agents with irrational dispositions. This dominant view goes by the name of "causal decision theory".
I suppose causal decision theory assumes causality only works in one temporal direction. Confronted with a predictor that was right 100 out of 100 times, I would think it very likely that backward-in-time causation exists, and take only B. I assume this would, as you say, produce absurd results elsewhere.
Decisions aren't physical.
The above statement is hard to defend, to say the least. Your decisions are physical and occur inside you... So these two-boxers are using the wrong model of the two (see the drawings): http://lesswrong.com/lw/r0/thou_art_physics/
If you are a part of physics, so is your decision, so your model must account for the correlation between your thought processes and the superintelligence. Once it accounts for that, you decide to one-box, because you understand the entanglement between the computation done by Omega and the physical process going on inside your skull.
If the entanglement is there, you are not looking at it from the outside; you are inside the process.
Our minds have this quirk that makes us think there are two moments: you decide, and then you cheat and get to decide again. But if you are only allowed to decide once, which is the case, you are rational by one-boxing.
You're complicating the problem too much by bringing in issues like regret. Assume for the sake of argument that Newcomb's problem is to maximize the amount of money you receive. Don't think about extraneous utility issues.
People seem to have pretty strong opinions about Newcomb's Problem. I don't have any trouble believing that a superintelligence could scan you and predict your reaction with 99.5% accuracy.
I mean, a superintelligence would have no trouble at all predicting that I would one-box... even if I hadn't encountered the problem before, I suspect.
Ultimately you either interpret "superintelligence" as being sufficient to predict your reaction with significant accuracy, or not. If not, the problem is just a straightforward probability question, as explained here, and becomes uninteresting.
Otherwise, if you interpret "superintelligence" as being sufficient to predict your reaction with significant accuracy (especially a high accuracy like >99.5%), the words of this sentence...
And the twist is that Omega has put a million dollars in box B iff Omega has predicted that you will take only box B.
...simply mean "One-box to win, with high confidence."
Summary: After disambiguating "superintelligence" (making the belief that Omega is a superintelligence pay rent), Newcomb's problem turns into either a straightforward probability question or a fairly simple issue of rearranging the words in equivalent ways to make the winning answer readily apparent.
If you won't explicitly state your analysis, maybe we can try 20 questions?
I have suspected that supposed "paradoxes" of evidential decision theory occur because not all the evidence was considered. For example, the fact that you are using evidential decision theory to make the decision.
Agree/disagree?
Hmm, changed my mind; I should have thought more before writing... Suppose the EDT virus has early symptoms of causing people to use EDT before progressing to terrible illness and death. It seems EDT would then recommend not using EDT.
I one-box, without a moment's thought.
The "rationalist" says "Omega has already left. How could you think that your decision now affects what's in the box? You're basing your decision on the illusion that you have free will, when in fact you have no such thing."
To which I respond "How does that make this different from any other decision I'll make today?"
I think the two-box person is confused about what it is to be rational: it does not mean "make a fancy argument"; it means start with the facts, abstract from them, and reason about your abstractions.
In this case if you start with the facts you see that 100% of people who take only box B win big, so rationally, you do the same. Why would anyone be surprised that reason divorced from facts gives the wrong answer?
This dilemma seems like it can be reduced to:
There's a seemingly-impossible but vital premise, namely, that your action was already known before you acted. Even if this is completely impossible, it's a premise, so there's no point arguing it.
Another way of thinking of it is that, when someone says, "The boxes are already there, so your decision cannot affect what's in them," he is wrong. It has been assumed that your decision does affect what's in them, so the fact that you cannot imagine how that is possible is wholly irrelevant.
In short, I don't understand how this is controversial when the decider has all the information that was provided.
I'd love to say I'd find some way of picking randomly just to piss Omega off, but I'd probably just one-box it. A million bucks is a lot of money.
It's often stipulated that if Omega predicts you'll use some randomizer it can't predict, it'll punish you by acting as if it predicted two-boxing.
It's a great puzzle. I guess this thread will degenerate into arguments pro and con. I used to think I'd take one box, but I read Joyce's book and that changed my mind.
For the take-one-boxers:
Do you believe, as you sit there with the two boxes in front of you, that their contents are fixed? That there is a "fact of the matter" as to whether box B is empty or not? Or is box B in a sort of intermediate state, halfway between empty and full? If so, do you generally consider that things momentarily out of sight may literally change their physical sta...
Na-na-na-na-na-na, I am so sorry you only got $1000!
Me, I'm gonna replace my macbook pro, buy an apartment and a car and take a two week vacation in the Bahamas, and put the rest in savings!
Point: arguments don't matter, winning does.
Oops. I had replied to this before I saw that its parent was nearly 3 years old. So that I don't (quite) waste the typing:
Do you believe, as you sit there with the two boxes in front of you, that their contents are fixed?
Yes.
That there is a "fact of the matter" as to whether box B is empty or not?
Yes.
Or is box B in a sort of intermediate state, halfway between empty and full?
No.
If so, do you generally consider that things momentarily out of sight may literally change their physical states into something indeterminate?
No.
Do you picture box B literally becoming empty and full as you change your opinion back and forth?
If not, if you think box B is definitely either full or empty and there is no unusual physical state describing the contents of that box, then would you agree that nothing you do now can change the contents of the box?
Yes.
And if so, then taking the additional box cannot reduce what you get in box B.
No, it can't. (But it already did.)
If I take both boxes how much money do I get? $1,000
If I take one box how much money do I get? $10,000,000 (or whatever it was instantiated to.)
It seems that my questions were more useful than yours. Perhaps Joyce b...
To quote E.T. Jaynes:
"This example shows also that the major premise, “If A then B” expresses B only as a logical consequence of A; and not necessarily a causal physical consequence, which could be effective only at a later time. The rain at 10 AM is not the physical cause of the clouds at 9:45 AM. Nevertheless, the proper logical connection is not in the uncertain causal direction (clouds =⇒ rain), but rather (rain =⇒ clouds) which is certain, although noncausal. We emphasize at the outset that we are concerned here with logical connections, because some discussions and applications of inference have fallen into serious error through failure to see the distinction between logical implication and physical causation. The distinction is analyzed in some depth by H. A. Simon and N. Rescher (1966), who note that all attempts to interpret implication as expressing physical causation founder on the lack of contraposition expressed by the second syllogism (1–2). That is, if we tried to interpret the major premise as “A is the physical cause of B,” then we would hardly be able to accept that “not-B is the physical cause of not-A.” In Chapter 3 we shall see that attempts to interpret plausible inferences in terms of physical causation fare no better."
@Hal Finney:
Certainly the box is either full or empty. But the only way to get the money in the hidden box is to precommit to taking only that one box. Not pretend to precommit, really precommit. If you try to take the $1,000, well then I guess you really hadn't precommitted after all. I might vacillate, I might even be unable to make such a rigid precommitment with myself (though I suspect I am), but it seems hard to argue that taking only one box is not the correct choice.
I'm not entirely certain that acting rationally in this situation doesn't require an element of doublethink, but that's a topic for another post.
I would be interested to know if your opinion would change if the "predictions" of the super-being were wrong .5% of the time, and some small number of people ended up with the $1,001,000 and some ended up with nothing. Would you still one-box it?
I suppose I might still be missing something, but this still seems to me just a simple example of time inconsistency, where you'd like to commit ahead of time to something that later you'd like to violate if you could. You want to commit to taking the one box, but you also want to take the two boxes later if you could. A more familiar example is that we'd like to commit ahead of time to spending effort to punish people who hurt us, but after they hurt us we'd rather avoid spending that effort as the harm is already done.
If I know that the situation has resolved itself in a manner consistent with the hypothesis that Omega has successfully predicted people's actions many times over, I have a high expectation that it will do so again.
In that case, what I will find in the boxes is not independent of my choice, but dependent on it. By choosing to take two boxes, I cause there to be only $1,000 there. By choosing to take only one, I cause there to be $1,000,000. I can create either condition by choosing one way or another. If I can select between the possibilities, I prefer...
I don't know the literature around Newcomb's problem very well, so excuse me if this is stupid. BUT: why not just reason as follows:
a) the state of affairs of whether you pick the box or not is already absolutely determined (i.e. we live in a fatalistic universe, at least with respect to your box-picking), or
b) your box-picking is not determined, but it has backwards causal force, i.e. something is moving backwards through time.
If a), then practical reason is ...
Laura,
Once we can model the probabilities of the various outcomes in a noncontroversial fashion, the specific choice to make depends on the utility of the various outcomes. $1,001,000 might be only marginally better than $1,000,000 -- or that extra $1,000 could have some significant extra utility.
If we assume that Omega almost never makes a mistake and we allow the chooser to use true randomization (perhaps by using quantum physics) in making his choice, then Omega must make his decision in part through seeing into the future. In this case the chooser should obviously pick just B.
Hanson: I suppose I might still be missing something, but this still seems to me just a simple example of time inconsistency
In my motivations and in my decision theory, dynamic inconsistency is Always Wrong. Among other things, it always implies an agent unstable under reflection.
A more familiar example is that we'd like to commit ahead of time to spending effort to punish people who hurt us, but after they hurt us we'd rather avoid spending that effort as the harm is already done.
But a self-modifying agent would modify to not rather avoid it.
Gowder: If...
I don't see why this needs to be so drawn out.
I know the rules of the game. I also know that Omega is superintelligent, i.e., Omega will accurately predict my action. Since Omega knows that I know this, and since I know that he knows I know this, I can rationally take box B, content in my knowledge that Omega has predicted my action correctly.
I don't think it's necessary to precommit to any ideas, since Omega knows that I'll be able to rationally deduce the winning action given the premise.
We don't even need a superintelligence. We can probably predict a person's decision in this problem, on the basis of personality type, with 80% accuracy, which is already sufficient that a rational person would choose only box B.
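For what it's worth, here is a minimal sketch of the expected-value comparison behind that 80% claim, assuming the standard $1,000 / $1,000,000 payoffs and that the stated accuracy applies symmetrically to predicted one-boxers and predicted two-boxers:

```python
# Expected value of each choice against a fallible predictor.
# Assumes the standard payoffs and that the quoted accuracy holds
# symmetrically for both kinds of chooser.

def expected_values(accuracy: float) -> dict:
    one_box = accuracy * 1_000_000                      # correct prediction -> $1M in box B
    two_box = accuracy * 1_000 + (1 - accuracy) * 1_001_000
    return {"one-box": one_box, "two-box": two_box}

print(expected_values(0.80))
# one-box ~ $800,000 vs two-box ~ $201,000 -- one-boxing already wins at 80% accuracy
```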
The possibility of time inconsistency is very well established among game theorists, and is considered a problem of the game one is playing, rather than a failure to analyze the game well. So it seems you are disagreeing with most all game theorists in economics as well as most decision theorists in philosophy. Maybe perhaps they are right and you are wrong?
The interesting thing about this game is that Omega has magical super-powers that allow him to know whether or not you will back out on your commitment ahead of time, and so you can make your commitment credible by not being going to back out on your commitment. If that makes any sense.
Robin, remember I have to build a damn AI out of this theory, at some point. A self-modifying AI that begins anticipating dynamic inconsistency - that is, a conflict of preference with its own future self - will not stay in such a state for very long... did the game theorists and economists work out a standard answer for what happens after that?
If you like, you can think of me as defining the word "rationality" to refer to a different meaning - but I don't really have the option of using the standard theory, here, at least not for longer than 50 milliseconds.
If there's some nonobvious way I could be wrong about this point, which seems to me quite straightforward, do let me know.
In reality, either I am going to take one box or two. So when the two-boxer says, "If I take one box, I'll get amount x," and "If I take two boxes, I'll get amount x+1000," one of these statements is objectively counterfactual. Let's suppose he is in fact going to take both boxes. Then his second statement is factual and his first statement counterfactual. Then his two statements are:
1) Although I am not in fact going to take only one box, were I to take only one box, I would get amount x, namely the amount that would be in the box.
2) I am in ...
Eliezer: whether or not a fixed future poses a problem for morality is a hotly disputed question which even I don't want to touch. Fortunately, this problem is one that is pretty much wholly orthogonal to morality. :-)
But I feel like in the present problem the fixed future issue is a key to dissolving the problem. So, assume the box decision is fixed. It need not be the case that the stress is fixed too. If the stress isn't fixed, then it can't be relevant to the box decision (the box is fixed regardless of your decision between stress and no-stress)....
Paul, being fixed or not fixed has nothing to do with it. Suppose I program a deterministic AI to play the game (the AI picks a box.)
The deterministic AI knows that it is deterministic, and it knows that I know too, since I programmed it. So I also know whether it will take one or both boxes, and it knows that I know this.
At first, of course, it doesn't know itself whether it will take one or both boxes, since it hasn't completed running its code yet. So it says to itself, "Either I will take only one box or both boxes. If I take only one box, the pro...
I practice historical European swordsmanship, and those Musashi quotes have a certain resonance to me. Here is another (modern) saying common in my group:
If it's stupid, but it works, then it ain't stupid.
Eliezer, I don't read the main thrust of your post as being about Newcomb's problem per se. Having distinguished between 'rationality as means' to whatever end you choose, and 'rationality as a way of discriminating between ends', can we agree that the whole specks / torture debate was something of a red herring? Red herring, because it was a discussion on using rationality to discriminate between ends, without having first defined one's meta-objectives, or, if one's meta-objectives involved hedonism, establishing the rules for performing math over subje...
Unknown: your last question highlights the problem with your reasoning. It's idle to ask whether I'd go and jump off a cliff if I found my future were determined. What does that question even mean?
Put a different way, why should we ask an "ought" question about events that are determined? If A will do X whether or not it is the case that a rational person will do X, why do we care whether or not it is the case that a rational person will do X? I submit that we care about rationality because we believe it'll give us traction on our problem of ...
Paul, it sounds like you didn't understand. A chess playing computer program is completely deterministic, and yet it has to consider alternatives in order to make its move. So also we could be deterministic and we would still have to consider all the possibilities and their benefits before making a move.
So it makes sense to ask whether you would jump off a cliff if you found out that the future is determined. You would find out that the future is determined without knowing exactly which future is determined, just like the chess program, and so you would ha...
I do understand. My point is that we ought not to care whether we're going to consider all the possibilities and benefits.
Oh, but you say, our caring about our consideration process is a determined part of the causal chain leading to our consideration process, and thus to the outcome.
Oh, but I say, we ought not to care about that caring. Again, recurse as needed. Nothing you can say about the fact that a cognition is in the causal chain leading to a state of affairs counts as a point against the claim that we ought not to care about whether or not we have that cognition if it's unavoidable.
The paradox is designed to give your decision the practical effect of causing Box B to contain the money or not, without actually labeling this effect "causation." But I think that if Box B acts as though its contents are caused by your choice, then you should treat it as though they were. So I don't think the puzzle is really something deep; rather, it is a word game about what it means to cause something.
Perhaps it would be useful to think about how Omega might be doing its prediction. For example, it might have the ability to travel into the f...
I have two arguments for going for Box B. First, for a scientist it's not unusual that every rational argument (=theory) predicts that only two-boxing makes sense. Still, if the experiment again and again refutes that, it's obviously the theory that's wrong and there's obviously something more to reality than that which fueled the theories. Actually, we even see dilemmas like Newcomb's in the contextuality of quantum measurements. Measurement tops rationality or theory, every time. That's why science is successful and philosophy is not.
Second, there's no q...
Paul, if we were determined, what would you mean when you say that "we ought not to care"? Do you mean to say that the outcome would be better if we didn't care? The fact that the caring is part of the causal chain does have something to do with this: the outcome may be determined by whether or not we care. So if you consider one outcome better than another (only one really possible, but both possible as far as you know), then either "caring" or "not caring" might be preferable, depending on which one would lead to each outcome.
Eliezer, if a smart creature modifies itself in order to gain strategic advantages from committing itself to future actions, it must think it could better achieve its goals by doing so. If so, why should we be concerned, if those goals do not conflict with our goals?
I think Anonymous, Unknown and Eliezer have been very helpful so far. Following on from them, here is my take:
There are many ways Omega could be doing the prediction/placement and it may well matter exactly how the problem is set up. For example, you might be deterministic and he is precalculating your choice (much like we might be able to do with an insect or computer program), or he might be using a quantum suicide method, (quantum) randomizing whether the million goes in and then destroying the world iff you pick the wrong option (This will lead to us ...
Be careful of this sort of argument, any time you find yourself defining the "winner" as someone other than the agent who is currently smiling from on top of a giant heap.
This made me laugh. Well said!
There's only one question about this scenario for me - is it possible for a sufficiently intelligent being to fully, fully model an individual human brain? If so, (and I think it's tough to argue 'no' unless you think there's a serious glass ceiling for intelligence) choose box B. If you try and second-guess (or, hell, googolth-guess) Omega, you're ...
How does the box know? I could open B with the intent of opening only B or I could open B with the intent of then opening A. Perhaps Omega has locked the boxes such that they only open when you shout your choice to the sky. That would beat my preferred strategy of opening B before deciding which to choose. I open boxes without choosing to take them all the time.
Are our common notions about boxes catching us here? In my experience, opening a box rarely makes nearby objects disintegrate. It is physically impossible to "leave $1000 on the table,"...
Eliezer, if a smart creature modifies itself in order to gain strategic advantages from committing itself to future actions, it must think it could better achieve its goals by doing so. If so, why should we be concerned, if those goals do not conflict with our goals?
Well, there's a number of answers I could give to this:
*) After you've spent some time working in the framework of a decision theory where dynamic inconsistencies naturally Don't Happen - not because there's an extra clause forbidding them, but because the simple foundations just don't give rise t...
So it seems you are disagreeing with most all game theorists in economics as well as most decision theorists in philosophy. Maybe perhaps they are right and you are wrong?
Maybe perhaps we are right and they are wrong?
The issue is to be decided, not by referring to perceived status or expertise, but by looking at who has the better arguments. Only when we cannot evaluate the arguments does making an educated guess based on perceived expertise become appropriate.
Again: how much do we want to bet that Eliezer won't admit that he's wrong in this case? Do we have someone willing to wager another 10 credibility units?
Caledonian: you can stop talking about wagering credibility units now, we all know you don't have funds for the smallest stake.
Ben Jones: if we assume that Omega is perfectly simulating the human mind, then when we are choosing between B and A+B, we don't know whether we are in reality or in the simulation. In reality, our choice does not affect the million, but in the simulation it will. So we should reason "I'd better take only box B, because if this is the simulation then that will change whether or not I get the million in reality".
There is a big difference between having time inconsistent preferences, and time inconsistent strategies because of the strategic incentives of the game you are playing. Trying to find a set of preferences that avoids all strategic conflicts between your different actions seems a fool's errand.
What we have here is an inability to recognize that causality no longer flows only from 'past' to 'future'.
If we're given a box that could contain $1,000 or nothing, we calculate the expected value of the superposition of these two possibilities. We don't actually expect that there's a superposition within the box - we simply adopt a technique to help compensate for what we do not know. From our ignorant perspective, either case could be real, although in actuality either the box has the money or it does not.
This is similar. The amount of money in the b...
How about simply multiplying? Treat Omega as a fair coin toss. 50% of a million is half-a-million, and that's vastly bigger than a thousand. You can ignore the question of whether omega has filled the box, in deciding that the uncertain box is more important. So much more important, that the chance of gaining an extra 1000 isn't worth the bother of trying to beat the puzzle. You just grab the important box.
After you've spent some time working in the framework of a decision theory where dynamic inconsistencies naturally Don't Happen - not because there's an extra clause forbidding them, but because the simple foundations just don't give rise to them - then an intertemporal preference reversal starts looking like just another preference reversal.
... Roughly, self-modifying capability in a classical causal decision theorist doesn't fix the problem that gives rise to the intertemporal preference reversals, it just makes one temporal self win out over all the oth...
There is a big difference between having time inconsistent preferences, and time inconsistent strategies because of the strategic incentives of the game you are playing.
I can see why a human would have time-inconsistent strategies - because of inconsistent preferences between their past and future self, hyperbolic discounting functions, that sort of thing. I am quite at a loss to understand why an agent with a constant, external utility function should experience inconsistent strategies under any circumstance, regardless of strategic incentives. Expected...
The entire issue of causal versus inferential decision theory, and of the seemingly magical powers of the chooser in the Newcomb problem, are serious distractions here, as Eliezer has the same issue in an ordinary commitment situation, e.g., punishment. I suggest starting this conversation over from such an ordinary simple example.
Let me restate: Two boxes appear. If you touch box A, the contents of box B are vaporized. If you attempt to open box B, box A and its contents are vaporized. Contents as previously specified. We could probably build these now.
Experimentally, how do we distinguish this from the description in the main thread? Why are we taking Omega seriously when, if the discussion dealt with the number of angels dancing on the head of a pin, the derision would be palpable? The experimental data point to taking box B. Even if Omega is observed delivering the boxes, and making the specified claims regarding their contents, why are these claims taken on faith as being an accurate description of the problem?
Let's take Bayes seriously.
Some time ago there was a posting about something like: "If all you knew was that the sun rose the past 5 mornings, what probability would you assign to the sun rising next morning?" It came out to something like 5/6 or 4/5 or so.
But of course that's not all we know, and so we'd get different numbers.
Now what's given here is that Omega has been correct on a hundred occasions so far. If that's all we know, we should estimate the probability of him being right next time at about 99%. So if you're a one-boxer your exp...
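If the earlier sunrise posting was using Laplace's rule of succession (an assumption on my part; the exact prior convention changes the small-sample numbers a bit), the arithmetic looks like this:

```python
# Laplace's rule of succession: after s successes in n trials and nothing
# else known, estimate the probability of success on the next trial as
# (s + 1) / (n + 2). Whether the earlier posting used exactly this rule
# is an assumption here.

def rule_of_succession(successes: int, trials: int) -> float:
    return (successes + 1) / (trials + 2)

print(rule_of_succession(5, 5))      # ~0.857 for the five sunrises
print(rule_of_succession(100, 100))  # ~0.990 -- "about 99%" for Omega's next prediction
```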
Eliezer, I have a question about this: "There is no finite amount of life lived N where I would prefer a 80.0001% probability of living N years to an 0.0001% chance of living a googolplex years and an 80% chance of living forever. This is a sufficient condition to imply that my utility function is unbounded."
I can see that this preference implies an unbounded utility function, given that a longer life has a greater utility. However, simply stated in that way, most people might agree with the preference. But consider this gamble instead:
A: Live 5...
they would just insist that there is an important difference between deciding to take only box B at 7:00am vs 7:10am, if Omega chooses at 7:05am
But that's exactly what strategic inconsistency is about. Even if you had decided to take only box B at 7:00am, by 7:06am a rational agent would just change his mind and choose to take both boxes. Omega knows this, hence it will put nothing into box B. The only way out is if the AI self-commits to take only box B in a way that's verifiable by Omega.
When the stakes are high enough I one-box, while gritting my teeth. Otherwise, I'm more interested in demonstrating my "rationality" (Eliezer has convinced me to use those quotes).
Perhaps we could just specify an agent that uses reverse causation in only particular situations, as it seems that humans are capable of doing.
Paul G, almost certainly, right? Still, as you say, it has little bearing on one's answer to the question.
In fact, not true, it does. Is there anything to stop myself making a mental pact with all my simulation buddies (and 'myself', whoever he be) to go for Box B?
In arguing for the single box, Yudkowsky has made an assumption that I disagree with: at the very end, he changes the stakes and declares that your choice should still be the same.
My way of looking at it is similar to what Hendrik Boom has said. You have a choice between betting on Omega being right and betting on Omega being wrong.
A = Contents of box A
B = What may be in box B (if it isn't empty)
A is yours, in the sense that you can take it and do whatever you want with it. One thing you can do with A is pay it for a chance to win B if Omega is right. Y...
IMO there's less to Newcomb's paradox than meets the eye. It's basically "A future-predicting being who controls the set of choices could make rational choices look silly by making sure they had bad outcomes". OK, yes, he could. Surprised?
What I think makes it seem paradoxical is that the paradox both assures us that Omega controls the outcome perfectly, and cues us that this isn't so ("He's already left" etc). Once you settle what it's really saying either way, the rest follows.
Yes, this is really an issue of whether your choice causes Omega's action or not. The only way for Omega to be a perfect predictor is for your choice to actually cause Omega's action. (For example, Omega 'sees the future' and acts based on your choice). If your choice causes Omega's action, then choosing B is the rational decision, as it causes the box to have the million.
If your choice does not cause Omega's action, then choosing both boxes is the winning approach. In this case, Omega is merely giving big awards to some people and small awards to ot...
the dominant consensus in modern decision theory is that one should two-box...there's a common attitude that "Verbal arguments for one-boxing are easy to come by, what's hard is developing a good decision theory that one-boxes"
Those are contrary positions, right?
Robin Hanson:
Punishment is ordinary, but Newcomb's problem is simple! You can't have both.
The advantage of an ordinary situation like punishment is that game theorists can't deny the fact on the ground that governments exist, but they can claim it's because we're all irrational, which doesn't leave many directions to go in.
I agree that "rationality" should be the thing that makes you win but the Newcomb paradox seems kind of contrived.
If there is a more powerful entity throwing good utilities at normally dumb decisions and bad utilities at normally good decisions then you can make any dumb thing look genius because you are under different rules than the world we live in at present.
I would ask Alpha for help and do what he tells me to do. Alpha is an AI that is also never wrong when it comes to predicting the future, just like Omega. Alpha would examine omega and ...
To me, the decision is very easy. Omega obviously possesses more prescience about my box-taking decision than I do myself. He's been able to guess correctly in the past, so I'd see no reason to doubt him with myself. With that in mind, the obvious choice is to take box B.
If Omega is so nearly always correct, then determinism is shown to exist (at least to some extent). That being the case, causality would be nothing but an illusion. So I'd see no problem with it working in "reverse".
Fascinating. A few days after I read this, it struck me that a form of Newcomb's Problem actually occurs in real life--voting in a large election. Here's what I mean.
Say you're sitting at home pondering whether to vote. If you decide to stay home, you benefit by avoiding the minor inconvenience of driving and standing in line. (Like gaining $1000.) If you decide to vote, you'll fail to avoid the inconvenience, meanwhile you know your individual vote almost certainly won't make a statistical difference in getting your candidate elected. (Which would be like...
"If it ever turns out that Bayes fails - receives systematically lower rewards on some problem, relative to a superior alternative, in virtue of its mere decisions - then Bayes has to go out the window."
What exactly do you mean by mere decisions? I can construct problems where agents that use few computational resources win. Bayesian agents, by your own admission, have to use energy to get into mutual information with the environment (a state I am still suspicious of), meaning they lose.
The premise is that a rational agent would start out convinced that this story about the alien that knows in advance what they'll decide is false.
The Kolmogorov complexity of the story about the alien is very large, because we have to hypothesize some mechanism by which it can extrapolate the contents of minds. Even if I saw the alien land a million times and watched the box-picking connect with the box contents as they're supposed to, it is simpler to assume that the boxes are some stage magic trick, or even that they are an exception to the u...
It is not possible for an agent to make a rational choice between 1 or 2 boxes if the agent and Omega can both be simulated by Turing machines. Proof: Omega predicts the agent's decision by simulating it. This requires Omega to have greater algorithmic complexity than the agent (including the nonzero complexity of the compiler or interpreter). But a rational choice by the agent requires that it simulate Omega, which requires that the agent have greater algorithmic complexity instead.
In other words, the agent X, with complexity K(X), must model Omega whi...
Okay, maybe I am stupid, maybe I am unfamiliar with all the literature on the problem, maybe my English sucks, but I fail to understand the following:
-
Is the agent aware of the fact that one-boxers get $1,000,000 at the moment Omega "scans" him and presents the boxes?
OR
Is the agent told about this after Omega "has left"?
OR
Is the agent unaware of the fact that Omega rewards one-boxers at all?
-
P.S.: Also, as with most "decision paradoxes", this one will have different solutions depending on the context (is the agent a starving child in Africa, or a "megacorp" CEO?)
I'm a convinced two-boxer, but I'll try to put my argument without any bias. It seems to me the way this problem has been put has been an attempt to rig it for the one-boxers. When we talk about "precommitment" it is suggested the subject has advance knowledge of Omega and what is to happen. The way I thought the paradox worked, was that Omega would scan/analyze a person and make its prediction, all before the person ever heard of the dilemma. Therefore, a person has no way to develop an intention of being a one-boxer or a two-boxer t...
If the alien is able to predict your decision, it follows that your decision is a function of your state at the time the alien analyzes you. Then, there is no meaningful question of "what should you do?" Either you are in a universe in which you are disposed to choose the one box AND the alien has placed the million dollars, or you are in a universe in which you are disposed to take both boxes AND the alien has placed nothing. If the former, you will have the subjective experience of "deciding to take the one box", which is itself a det...
Yes, but when I tried to write it up, I realized that I was starting to write a small book. And it wasn't the most important book I had to write, so I shelved it. My slow writing speed really is the bane of my existence. The theory I worked out seems, to me, to have many nice properties besides being well-suited to Newcomblike problems. It would make a nice PhD thesis, if I could get someone to accept it as my PhD thesis. But that's pretty much what it would take to make me unshelve the project. Otherwise I can't justify the time expenditure, not at ...
Isn't this the exact opposite argument from the one that was made in Dust Specks vs 50 Years of Torture?
Correct me if I'm wrong, but the argument in this post seems to be "Don't cling to a supposedly-perfect 'causal decision theory' if it would make you lose gracefully, take the action that makes you WIN."
And the argument for preferring 50 Years of Torture over 3^^^3 Dust Specks is that "The moral theory is perfect. It must be clung to, even when the result is a major loss."
How can both of these be true?
(And yes, I am defining "pr...
One belated point, some people seem to think that Omega's successful prediction is virtually impossible and that the experiment is a purely fanciful speculation. However it seems to me entirely plausible that having you fill out a questionnaire while being brain scanned might well bring this situation into practicality in the near future. The questions, if filled out correctly, could characterize your personality type with enough accuracy to give a very strong prediction about what you will do. And if you lie, in the future that might be detected with a br...
Somehow I'd never thought of this as a rationalist's dilemma, but rather a determinism vs free will illustration. I still see it that way. You cannot both believe you have a choice AND that Omega has perfect prediction.
The only "rational" (in all senses of the word) response I support is: shut up and multiply. Estimate the chance that he has predicted wrong, and if that gives you +expected value, take both boxes. I phrase this as advice, but in fact I mean it as prediction of rational behavior.
In my motivations and in my decision theory, dynamic inconsistency is Always Wrong. Among other things, it always implies an agent unstable under reflection.
If you really want to impress an inspector who can see your internal state, by altering your utility function to conform to their wishes, then one strategy would be to create a trusted external "brain surgeon" agent with the keys to your utility function to change it back again after your utility function has been inspected - and then forget all about the existence of the surgeon.
The inspector will be able to see the lock on your utility function - but those are pretty standard issue.
As a rationalist, it might be worthwhile to take the one box just so those Omega know-it-alls will be wrong for once.
If random number generators not determinable by Omega exist, generate one bit of entropy. If not, take the million bucks. Quantum randomness anyone?
Given how many times Eliezer has linked to it, it's a little surprising that nobody seems to have picked up on this yet, but the paragraph about the utility function not being up for grabs seems to have a pretty serious technical flaw:
There is no finite amount of life lived N where I would prefer a 80.0001% probability of living N years to an 0.0001% chance of living a googolplex years and an 80% chance of living forever. This is a sufficient condition to imply that my utility function is unbounded.
Let p = 80% and let q be one in a million. I'm pretty...
Benja, the notion is that "live forever" does not have any finite utility, since it is bounded below by a series of finite lifetimes whose utility increases without bound.
thinks -- Okay, so if I understand you correctly now, the essential thing I was missing that you meant to imply was that the utility of living forever must necessarily be equal to (cannot be larger than) the limit of the utilities of living a finite number of years. Then, if u(live forever) is finite, p times the difference between u(live forever) and u(live n years) must become arbitrarily small, and thus, eventually smaller than q times the difference between u(live n years) and u(live googolplex years). You then arrive at a contradiction, from which you...
There are two ways of thinking about the problem.
1. You see the problem as decision theorist, and see a conflict between the expected utility recommendation and the dominance principle. People who have seen the problem this way have been led into various forms of causal decision theory.
2. You see the problem as game theorist, and are trying to figure out the predictor's utility function, what points are focal and why. People who have seen the problem this way have been led into various discussions of tacit coordination.
Newcomb's scenario is a paradox, ...
Re: First, foremost, fundamentally, above all else: Rational agents should WIN.
When Deep Blue beat Garry Kasparov, did that prove that Garry Kasparov was "irrational"?
It seems as though it would be unreasonable to expect even highly rational agents to win - if pitted against superior competition. Rational agents can lose in other ways as well - e.g. by not having access to useful information.
Since there are plenty of ways in which rational agents can lose, "winning" seems unlikely to be part of a reasonable definition of rationality.
I think I've solved it.
I'm a little late to this, and given the amount of time people smarter than myself have spent thinking about this it seems naive even to myself to think that I have found a solution to this problem. That being said, try as I might, I can't find a good counter argument to this line of reasoning. Here goes...
The human brain's function is still mostly a black box to us, but the demonstrated predictive power of this alien is strong evidence that this is not the case with him. If he really can predict human decisions, then the mere fact ...
Cross-posting from Less Wrong, I think there's a generalized Russell's Paradox problem with this theory of rationality:
I don't think I buy this for Newcomb-like problems. Consider Omega who says, "There will be $1M in Box B IFF you are irrational." Rationality as winning is probably subject to a whole family of Russell's-Paradox-type problems like that. I suppose I'm not sure there's a better notion of rationality.
Eliezer, why didn't you answer the question I asked at the beginning of the comment section of this post?
The 'delayed choice' experiments of Wheeler & others appear to show a causality that goes backward in time. So, I would take just Box B.
I would use a true quantum random generator. 51% of the time I would take only one box. Otherwise I would take two boxes. Thus Omega has to guess that I will only take one box, but I have a 49% chance of taking home another $1000. My expected winnings will be $1000490 and I am per Eliezer's definition more rational than he.
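A quick check of that arithmetic, under this commenter's assumption that Omega predicts the more likely option and fills box B accordingly (many formulations instead have Omega treat any randomizer as a two-boxer, which would wreck the strategy):

```python
# Expected winnings of the 51/49 randomizer strategy, assuming Omega
# predicts the more likely option (one-boxing) and so puts $1M in box B.
# If Omega instead punishes randomizers, box B is empty and this strategy
# expects only ~$490.

p_one_box = 0.51
winnings = p_one_box * 1_000_000 + (1 - p_one_box) * 1_001_000
print(winnings)  # ~1,000,490
```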
I'm a bit nervous, this is my first comment here, and I feel quite out of my league.
Regarding the "free will" aspect, can one game the system? My rational choice would be to sit right there, arms crossed, and choose no box. Instead, having thus disproved Omega's infallibility, I'd wait for Omega to come back around, and try to weasel some knowledge out of her.
Rationally, the intelligence that could model mine and predict my likely action (yet fail to predict my inaction enough to not bother with me in the first place), is an intelligence I'd like...
I've come around to the majority viewpoint on the alien/Omega problem. It seems to be easier to think about when you pin it down a bit more mathematically.
Let's suppose the alien determines the probability of me one-boxing is p. For the sake of simplicity, let's assume he then puts the 1M into one of the boxes with this probability p. (In theory he could do it whenever p exceeded some threshold, but this just complicates the math.)
Therefore, once I encounter the situation, there are two possible states:
a) with probability p there is 1M in one box, and 1k...
There is no finite amount of life lived N where I would prefer a 80.0001% probability of living N years to an 0.0001% chance of living a googolplex years and an 80% chance of living forever. This is a sufficient condition to imply that my utility function is unbounded.
Wait a second, the following bounded utility function can explain the quoted preferences:
Benja Fallenstein gave an alternative formulation that does imply an unbounded utility function:
For all n, there is an even larger n' such that (p+q)*u(live n years) < p*u(live n' years) + q*u(live a googolplex years).
But these preferences are pretty counter-intuitive to me. If U(live n years) is unbounded, then the above must hold for any nonzero p, q, and with "googolplex" replaced by any finite number. For example, let p = 1/3^^^3, q = .8, n = 3^^^3, and replace "googolplex" with "0". Would you really be willing to give up .8 probability of 3^^^3 years of life for a 1/3^^^3 chance at a longer (but still finite) one? And that's true no matter how many up-arrows we add to these numbers?
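To spell out the step from "u is unbounded" to "this must hold for any nonzero p, q, and any finite stand-in for googolplex" (my reconstruction of the argument, not Wei Dai's exact wording):

```latex
% Sketch, assuming u(n) = u(live n years) is increasing and unbounded in n.
% For any fixed p, q > 0 and any finite m and n, the left-hand side below is
% fixed, while the right-hand side grows without bound as n' increases
% (since p * u(n') -> infinity when u is unbounded). Hence:
\[
\exists\, n' :\quad (p+q)\,u(n) \;<\; p\,u(n') + q\,u(m).
\]
```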
Eliezer, would you be willing to bet all of your assets and future earnings against $1 of my money, that we can do an infinite amount of computation before the universe ends or becomes incapable of supporting life?
Your answer ought to be yes, if your preferences are what you state. If it turns out that we can do an infinite amount of computation before the universe ends, then this bet increases your money by $1, which allows you to increase your chance of having an infinite lifetime by some small but non-zero probability. If it turns out that our universe can't do an infinite amount of computation, you lose a lot, but the loss of expected utility is still tiny compared to what you gain.
So, is it a bet?
Also, why do you suspect that answering "No" would enable someone to demonstrate circular / inconsistent preferences on your part?
Actually, I think I can hazard a guess to that one. I think the idea would be "the simpler the mathematical structure, the more often it'd show up as a substructure in other mathematical structures"
For instance, if you are building large random graphs, you'd expect to see some specific pattern of, say, 7 vertices and 18 edges show up as a subgraph more often than, say, some specific pattern of 100 vertices and 2475 edges.
There's a sense in which "reality fluid" could be distributed evenly which would lead to this. If every entire mathematical structure got an equal amount of reality stuff, then small structures would benefit from the reality juice granted to the larger structures that they happen to also exist as substructures of.
EDIT: blargh, corrected big graph edge count. meant to represent half a complete graph.
I really don't see what the problem is. Clearly, the being has "read your mind" and knows what you will do. If you are of the opinion to take both boxes, he knows that from his mind scan, and you are playing right into his hands.
Obviously, your decision cannot affect the outcome because it's already been decided what's in the box, but your BRAIN affected what he put in the box.
It's like me handing you an opaque box and telling you there is $1 million in it if and only if you go and commit murder. Then, you open the box and find it empty. I then o...
I one-box, but not because I haven't considered the two-box issue.
I one-box because it's a win-win in the larger context. Either I walk off with a million dollars, OR I become the first person to outthink Omega and provide new data to those who are following Omega's exploits.
Even without thinking outside the problem, Omega is a game-breaker. We do not, in the problem as stated, have any information on Omega other than that they are superintelligent and may be able to act outside of causality. Or else Omega is simply a superduperpredictor, to the point wher...
My solution to the problem of the two boxes:
Flip a coin. If heads, both A & B. If tails, only A. (If the superintelligence can predict a coin flip, make it a radioactive decay or something. Eat quantum, Hal.)
In all seriousness, this is a very odd problem (I love it!). Of course two boxes is the rational solution - it's not as if post-facto cogitation is going to change anything. But the problem statement seems to imply that it is actually impossible for me to choose the choice I don't choose, i.e., choice is actually impossible.
Something is absurd here. I suspect it's the idea that my choice is totally predictable. There can be a random element to my choice if I so choose, which kills Omega's plan.
I'm not reading 127 comments, but as a newcomer who's been invited to read this page, along with barely a dozen others, as an introduction, I don't want to leave this unanswered, even though what I have to say has probably already been said.
First of all, the answer to Newcomb's Problem depends a lot on precisely what the problem is. I have seen versions that posit time travel, and therefore backwards causality. In that case, it's quite reasonable to take only one box, because your decision to do so does have a causal effect on the amount in Box B. Presu...
You are disposed to take two boxes. Omega can tell. (Perhaps by reading your comment. Heck, I can tell by reading your comment, and I'm not even a superintelligence.) Omega will therefore not put a million dollars in Box B if it sets you a Newcomb's problem, because its decision to do so depends on whether you are disposed to take both boxes or not, and you are.
I am disposed to take one box. Omega can tell. (Perhaps by reading this comment. I bet you can tell by reading my comment, and I also bet that you're not a superintelligence.) Omega will therefore put a million dollars in Box B if it sets me a Newcomb's problem, because its decision to do so depends on whether I am disposed to take both boxes or not, and I'm not.
If we both get pairs of boxes to choose from, I will get a million dollars. You will get a thousand dollars. I will be monetarily better off than you.
But wait! You can fix this. All you have to do is be disposed to take just Box B. You can do this right now; there's no reason to wait until Omega turns up. Omega does not care why you are so disposed, only that you are so disposed. You can mutter to yourself all you like about how silly the problem is; as long as you wander off with just B under your arm, it will tend to be the case that you end the day a millionaire.
Omega lets me decide to take only one box after meeting Omega, when I have already updated on the fact that Omega exists, and so I have much better knowledge about which sort of god I'm likely to encounter. Upsilon treats me on the basis of a guess I would subjunctively make without knowledge of Upsilon. It is therefore not surprising that I tend to do much better with Omega than with Upsilon, because the relevant choices being made by me are being made with much better knowledge. To put it another way, when Omega offers me a Newcomb's Problem, I will condition my choice on the known existence of Omega, and all the Upsilon-like gods will tend to cancel out into Pascal's Wagers. If I run into an Upsilon-like god, then, I am not overly worried about my poor performance - it's like running into the Christian God, you're screwed, but so what, you won't actually run into one. Even the best rational agents cannot perform well on this sort of subjunctive hypothesis without much better knowledge while making the relevant choices than you are offering them. For every rational agent who performs well with respect to Upsilon there is one who performs poorly with respect to anti-Upsilon....
If its model is good enough that it violates the Second Law as we understand it [...]
It doesn't have to be even remotely close to that good for the scenario. I'd bet a sufficiently good human psychologist could take Omega's role and get it 90%+ right if he tests and interviews the people extensively first (without them knowing the purpose) and gets to exclude people he is unsure about. A superintelligent being should be far, far better at this.
You yourself claim to know what you would do in the boxing experiment, and you are an agent limited by conventional physics. There is no physical law that forbids another agent from knowing you as well as (or even better than) you know yourself.
You'll have to explain why you think 99.99% (or whatever) is not good enough; a 0.01% chance to win $1000 shouldn't make up for a 99.99% chance of losing $999,000.
If Omega is fallible, then the value of one-boxing falls drastically, and even adjusting the amount of money doesn't help in the end;
Assume Omega has a probability X of correctly predicting your decision.
If you choose to two-box: with probability X Omega predicted it (box B is empty) and you get $1,000; with probability 1-X you get $1,001,000.
If you choose to take box B only: with probability X you get $1,000,000; with probability 1-X you get nothing.
Your expected utilities for two-boxing and one-boxing are (respectively):
E2 = 1000X + 1,001,000(1-X)
E1 = 1,000,000X
For E2 > E1, we must have 1000X + 1,001,000 - 1,001,000X - 1,000,000X > 0, i.e. 1,001,000 > 2,000,000X, i.e.
X < 0.5005
So as long as Omega can maintain a better than 50.05% accuracy, you should expect to earn more money by one-boxing. Since the solution seems so simple, and since I'm a total novice at decision theory, it's possible I'm missing something here, so please let me know.
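Here is the same break-even calculation done numerically, using exact fractions to avoid floating-point noise (same payoffs as above):

```python
from fractions import Fraction

# Break-even check for the derivation above: E1 is the one-box expectation,
# E2 the two-box expectation, both as functions of Omega's accuracy X.

def e_one_box(x: Fraction) -> Fraction:
    return 1_000_000 * x

def e_two_box(x: Fraction) -> Fraction:
    return 1_000 * x + 1_001_000 * (1 - x)

for x in (Fraction(1, 2), Fraction(5005, 10_000), Fraction(999, 1000)):
    print(float(x), float(e_one_box(x)), float(e_two_box(x)))

# 0.5     500000.0   501000.0  -> two-boxing barely ahead
# 0.5005  500500.0   500500.0  -> break-even, exactly as derived
# 0.999   999000.0   2000.0    -> one-boxing far ahead
```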
There is a good chance I am missing something here, but from an economic perspective this seems trivial:
P(Om) is the probability the person assigns Omega of being able to accurately predict their decision ahead of time.
A. P(Om) x $1m is the expected return from opening one box.
B. (1 - P(Om)) x $1m + $1000 is the expected return of opening both boxes (the probability that Omega was wrong times the million, plus the thousand).
Since P(Om) is dependent on people's individual belief about Omega's ability to predict their actions it is not surprising different peop...
Re: "Do you take both boxes, or only box B?"
It would sure be nice to get hold of some more data about the "100 observed occasions so far". If Omega only visits two-boxers - or tries to minimise his outgoings - it would be good to know that. Such information might well be accessible - if we have enough information about Omega to be convinced of his existence in the first place.
What this is really saying is “if something impossible (according to your current theory of the world) actually happens, then rather than insisting it’s impossible and ignoring it, you should revise your theory to say that’s possible”. In this case, the impossible thing is reverse causality; since we are told of evidence that reverse causality has happened in the form of 100 successful previous experiments, we must revise our theory to accept that reverse causality actually can happen. This would lead us to the conclusion that we should take one box. Alter...
You know, I honestly don't even understand why this is a point of debate. One boxing and taking box B (and being the kind of person who will predictably do that) seem so obviously like the rational strategy that it shouldn't even require explanation.
And not obvious in the same way most people think the Monty Hall problem (game show, three doors, goats behind two, sports car behind one, ya know?) seems 'obvious' at first.
In the case of the Monty Hall problem, you play with it, and the cracks start to show up, and you dig down to the surprising truth.
In this case, I don't see how anyone could see any cracks in the first place.
Am I missing something here?
Mr Eliezer, I think you've missed a few points here. However, I've probably missed more. I apologise for errors in advance.
An analogy occurs to me about "regret of rationality."
Sometimes you hear complaints about the Geneva Convention during wartime. "We have to restrain ourselves, but our enemies fight dirty. They're at an advantage because they don't have our scruples!" Now, if you replied, "So are you advocating scrapping the Geneva Convention?" you might get the response "No way. It's a good set of rules, on balance." And I don't think this is an incoherent position: he approves of the rule, but regrets the harm it causes in thi...
"Verbal arguments for one-boxing are easy to come by, what's hard is developing a good decision theory that one-boxes"
First, the problem needs a couple ambiguities resolved, so we'll use three assumptions: A) You are making this decision based on a deterministic, rational philosophy (no randomization, external factors, etc. can be used to make your decision on the box) B) Omega is in fact infallible C) Getting more money is the goal (i.e. we are excluding decision-makers which would prefer to get less money, and other such absurdities)
Changing an...
A way of thinking of this "paradox" that I've found helpful is to see the two-boxer as imagining more outcomes than there actually are. For a payoff matrix of this scenario, the two-boxer would draw four possible outcomes: $0, $1000, $1000000, and $1001000, and would try for $1000 or $1001000. But if Omega is a perfect predictor, then the two that involve it making a mistake ($0 and $1001000) are very unlikely. The one-boxer sees only the two plausible options and goes for $1000000.
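A rough sketch of that pruning, using the usual dollar amounts; the dictionary layout below is just one way of writing the two-boxer's payoff matrix:

```python
# The two-boxer's full 2x2 payoff matrix: (your choice, Omega's prediction) -> payout.
payoffs = {
    ("one-box", "predicted one-box"): 1_000_000,
    ("one-box", "predicted two-box"): 0,
    ("two-box", "predicted one-box"): 1_001_000,
    ("two-box", "predicted two-box"): 1_000,
}

# Treating Omega as a (near-)perfect predictor removes the two "Omega was wrong"
# cells, leaving only the $1,000,000 and $1,000 outcomes.
plausible = {k: v for k, v in payoffs.items()
             if (k[0] == "one-box") == (k[1] == "predicted one-box")}
print(plausible)
```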
It took me a week to think about it. Then I read all the comments, and thought about it some more. And now I think I have this "problem" well in hand. I also think that, incidentally, I arrived at Eliezer's answer as well, though since he never spelled it out I can't be sure.
To be clear - a lot of people have said that the decision depends on the problem parameters, so I'll explain just what it is I'm solving. See, Eliezer wants our decision theory to WIN. That implies that we have all the relevant information - we can think of a lot of situation...
I wanted to consider some truly silly solution. Since taking only box A is out (and I can't find a good reason for choosing box A, other than a vague argument based in irrationality, along the lines that I'd rather not know if omniscience exists…), I came up with this instead. I won't apologize for all the math-economics, but it might get dense.
Omega has been correct 100 times before, right? Fully intending to take both boxes, I’ll go to each of the 100 other people. There’re 4 categories of people. Let’s assume they aren’t bound by psychology and th...
1) I would one-box. Here's where I think the standard two-boxer argument breaks down. It's the idea of making a decision. The two-boxer idea is that once the boxes have been fixed the course of action that makes the most money is taking both boxes. Unless there is reverse causality going on here, I don't think that anyone disputes this. If at that moment you could make a choice totally independently of everything leading up to that point you would two-box. Unfortunately, the very existence of Omega implies that such a feat is impossible.
2) A mildly s...
Actually I take it back. I think that what I would do depends on what I know of how Omega functions (exactly what evidence lead me to believe that he was good at predicting this).
Omega #1: (and I think this one is the most plausible) You are given a multiple choice personality test (not knowing what's about to happen). You are then told that you are in a Newcomb situation and that Omega's prediction is based on your test answers (maybe they'll even show you Omega's code after the test is over). Here I'll two-box. If I am punished I am not being punish...
"the dominant consensus in modern decision theory is that one should two-box...there's a common attitude that verbal arguments for one-boxing are easy to come by, what's hard is developing a good decision theory that one-boxes"
This may be more a statement about the relevance and utility of decision theory itself as a field (or lack thereof) than the difficulty of the problem, but it is at least philosophically intriguing.
From a physical and computational perspective, there is no paradox, and one need not invoke backwards causality, 'pre-commitmen...
Upon reading this, I immediately went,
"Well, General Relativity includes solutions that have closed timelike curves, and I certainly am not in any position to rule out the possibility of communication by such. So I have no actual reason to rule out the possibility that which strategy I choose will, after I make my decision, be communicated to Omega in my past and then the boxes filled accordingly. So I better one-box in order to choose the closed timelike loop where Omega fills the box."
I understand, looking at Wikipedia, that in Nozick's formu...
The "no backwards causality" argument seems like a case of conflation of correlation and causation. Your decision doesn't retroactively cause Omega to fill the boxes in a certain way; some prior state of the world causes your thought processes and Omega's prediction, and the correlation is exactly or almost exactly 1.
EDIT: Correlation coefficients don't work like that, but whatever. You get what I mean.
The "no backwards causality" argument seems like a case of conflation of correlation and causation. Your decision doesn't retroactively cause Omega to fill the boxes in a certain way; some prior state of the world causes your thought processes and Omega's prediction, and the correlation is exactly or almost exactly 1.
The original description of the problem doesn't mention if you know of Omega's strategy for deciding what to place in box B, or their success history in predicting this outcome - which is obviously a very important factor.
If you know these things, then the only rational choice, obviously and by a huge margin, is to pick only box B.
If you don't know anything other than box B may or may not contain a million dollars, and you have no reasons to believe that it's unlikely, like in the lottery, then the only rational decision is to take both. This also seems to...
You are betting a positive extra payout of $1,000 against a net loss of $999,000 that there are no Black Swans[1] at all in this situation.
Given that you already have 100 points of evidence that taking Box A makes Box B empty (added to the evidence that Omega is more intelligent than you), I'd say that's a Bad Bet to make.
Given the amount of uncertainty in the world, choosing Box B instead of trying to "beat the system" seems like the rational step to me.
Edit I've given the Math in a comment below to show how to calculate when to make either dec...
How would Newcomb's problem look like in the physical world, taking quantum physics into account? Specifically, would Omega need to know quantum physics in order to predict my decision on "to one box or not to one box"?
To simplify the picture, imagine that Omega has a variable with it that can be either in the state A+B or B and which is expected to correlate with my decision and therefore serves to "predict" me. Omega runs some physical process to arrive at the contents of this variable. I'm assuming that "to predict" means...
...if you build an AI that two-boxes on Newcomb's Problem, it will self-modify to one-box on Newcomb's Problem, if the AI considers in advance that it might face such a situation. Agents with free access to their own source code have access to a cheap method of precommitment.
...
But what does an agent with a disposition generally-well-suited to Newcomblike problems look like? Can this be formally specified?
...
Rational agents should WIN.
It seems to me that if all that is true, and you want to build a Friendly AI, then the rational thing to do he...
You said:
Causal decision theorists don't self-modify to timeless decision theorists. If you get the decision theory wrong, you can't rely on it repairing itself.
but you also said:
...if you build an AI that two-boxes on Newcomb's Problem, it will self-modify to one-box on Newcomb's Problem, if the AI considers in advance that it might face such a situation.
I can envision several possibilities:
Would you like to clarify?
Causal decision theorists self-modify to one-box on Newcomb's Problem with Omegas that looked at their source code after the self-modification took place; i.e., if the causal decision theorist self-modifies at 7am, it will self-modify to one-box with Omegas that looked at the code after 7am and two-box otherwise. This is not only ugly but also has worse implications for e.g. meeting an alien AI who wants to cooperate with you, or worse, an alien AI that is trying to blackmail you.
Bad decision theories don't necessarily self-repair correctly.
And in general, every time you throw up your hands in the air and say, "I don't know how to solve this problem, nor do I understand the exact structure of the calculation my computer program will perform in the course of solving this problem, nor can I state a mathematically precise meta-question, but I'm going to rely on the AI solving it for me 'cause it's supposed to be super-smart," you may very possibly be about to screw up really damned hard. I mean, that's what Eliezer-1999 thought you could say about "morality".
Sorry if this has already been addressed. I didn't take the time to read all 300 comments.
It seems to me that if there were an omniscient Omega, the world would be deterministic, and you wouldn't have free will. You have the illusion of choice, but your choice is already known by Omega. Hence, try (futile as it is) to make your illusory choice that of a one-boxer.
Personally, I don't believe in determinism or the concept of Omega. This is a nice thought experiment though.
I don't grasp why this problem seems so hard and convoluted. Of course you have to one-box; if you two-box you'll lose for sure. From my perspective two-boxing is irrational...
If Omega can flawlessly predict the future, this confirms a deterministic world at the atomic scale. To be a perfect predictor, Omega would also need a perfect model of my brain at every stage of making my "decision" - thus Omega can see the future and perfectly predict whether or not I'm going to two-box.
If my brain is wired up in such a way as to choose two-box...
I'm kind of surprised at how complicated everyone is making this, because to me the Bayesian answer jumped out as soon as I finished reading your definition of the problem, even before the first "argument" between one and two boxers. And it's about five sentences long:
Don't choose an amount of money. Choose an expected amount of money--the dollar value multiplied by its probability. One-box gets you >(1,000,000*.99). Two-box gets you <(1,000*1+1,000,000*.01). One-box has superior expected returns. Probability theory doesn't usually encounte...
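Spelling that arithmetic out, under the assumption that 100 correct predictions out of 100 justifies treating Omega's accuracy as at least 0.99:

```python
accuracy = 0.99
ev_one_box = 1_000_000 * accuracy                    # 990,000
ev_two_box = 1_000 * 1 + 1_000_000 * (1 - accuracy)  # 11,000
print(ev_one_box, ev_two_box)
```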
I see your general point, but it seems like the solution to the Omega example is trivial if Omega is assumed to be able to predict accurately most of the time:
(letting C = Omega predicted correctly; let's assume for simplicity that Omega's fallibility is the same for false positives and false negatives)
It certainly seems like a simple resolution exists...
As a rationalist, there should only ever be one choice you make. It should be the ideal choice. If you are a perfectly rational person, you will only ever make the ideal choice. You are certainly at least, deterministic. If you can make the ideal choice, so can someone else. That means, if someone knows your exact situation (trivial in the Newcomb paradox, as the super intelligent agent is causing your situation) then they can predict exactly what you will do, even without being perfectly rational themse...
Well, for me there are two possible hypotheses for that:
The boxes are not what they seem. For example, box B contains nano-machinery that detects whether you one-box, creates the money if you do, and then self-destructs.
Omega is smart enough to be able to predict if I'll one-box or two-box (he scanned my brain, ran it in a simulation, and saw what I do... I hope he didn't turn off the simulation afterwards, or he would have killed "me" then!).
In both cases, I should one-box. So I'll one-box. I don't really get the ra...
It's strange. I perfectly agree with the argument here about rationality - the rationality I want is the rationality that wins, not the rationality that is more reasonable. This agrees with my privileging truth as a guide that is useful, not one that necessarily makes the best predictions. But in other points on the site, it always seems that correspondence is privileged over value.
As for Newcombs paradox, I suggest writing out all the relevant propositions a la Jaynes, with non-zero probabilities for all propositions. Make it a real problem, not an ideali...
An amusing n=3 survey of mathematics undergrads at Trinity Cambridge:
1) Refused to answer. 2) It depends on how reliable Omega is / but you can't (shouldn't) really quantify ethics anyway / this situation is unreasonable. 3) Obviously two-box; one-boxing is insane.
Respondent 3 said he would program an AI to one-box. And when I pointed out that his brain was built of quarks just like the AI, he responded that in that case free will didn't exist and choice was impossible.
Upvoted for this sentence:
"If it ever turns out that Bayes fails - receives systematically lower rewards on some problem, relative to a superior alternative, in virtue of its mere decisions - then Bayes has to go out the window."
This is such an important concept.
I will say this declaratively: The correct choice is to take only box two. If you disagree, check your premises.
"But it is agreed even among causal decision theorists that if you have the power to precommit yourself to take one box, in Newcomb's Problem, then you should do so. If y...
"If it ever turns out that Bayes fails - receives systematically lower rewards on some problem, relative to a superior alternative, in virtue of its mere decisions - then Bayes has to go out the window."
This is such an important concept.
Yes, but like falsifiability, dangerous. This also goes for 'rationalists win', too.
'We' (Bayesians) face the Duhem-Quine thesis with a vengeance: we have often found situations where Bayes failed. And then we rescued it (we think) by either coming up with novel theses (TDT) or carefully analyzing the problem or a related problem and saying that is the real answer and so Bayes works after all (Jaynes again and again). Have we corrected ourselves or just added epicycles and special pleading? Should we just have tossed Bayes out the window at that point except in the limited areas we already proved it to be optimal or useful?
This can't really be answered.
I think it is important to make a distinction between what our choice is now, while we are here, sitting at a computer screen, unconfronted by Omega, and our choice when actually confronted by Omega. When actually confronted by Omega, your choice has been determined. Take both boxes, take all the money. Right now, sitting in your comfy chair? Take the million-dollar box. In the comfy chair, the counterfactual nature of the experiment basically gives you an Outcome Pump. So take the million-dollar box, because if you take the million-dollar box, it's full of a million dollars. But when it actually happens, the situation is different. You aren't in your comfy chair anymore.
I guess my cognition just breaks down over the idea of Omega. To me, Newcomb's problem seems akin to a theological argument. Either we are talking about a purely theoretical idea that is meant to illustrate abstract decision theory, in which case I don't care how many boxes I take, because it has no bearing on anything tied to reality, or we are actually talking about the real universe, in which case I take both boxes because I don't believe in alien superintelligences capable of foreseeing my choices any more than I believe in an anthropomorphic deity.
If in 35 AD you were told that there were only 100 people who had seen Jesus dead and entombed and then had seen him alive afterwards, and that there were no people who had seen him dead and entombed who had seen his dead body afterwards, would you believe he had been resurrected?
In Newcomb's problem as stated, we are told 100 people have gotten the predicted answer. Then no matter how low a prior we put on a superintelligent alien being able to predict what we would do, we should accept this as proof.
This seems like a pretty symmetric question t...
Really? A PhD? Seriously?
If Omega said "You shall only take Box B or I will smite thee" and then proceeded to smite 100 infidels who dared to two-box, the rational choice would be obvious (especially if the smiting happened after Omega left).
Is this really difficult to show mathematically?
This thread has gone a bit cold (are there other ones more active on the same topic?)
My initial thoughts: if you've never heard of Newcomb's problem, and come across it for the first time in real time, then as soon as you start thinking about it, the only thing to do is two-box. Yes, Omega will have worked out you'll do that, and you'll only get $1000, but the contents of the boxes are already set. It's too late to convince Omega that you're going to one-box.
On the other hand, if you have already heard and thought about the problem, the rational thing to do is...
This is an old thread, but I can't imagine the problem going away anytime soon, so let me throw some chum into the waters;
Omega says; "I predict you're a one boxer. I can understand that. You've got really good reasons for picking that, and I know you would never change your mind. So I'm going to give you a slightly different version of the problem; I've decided to make both boxes transparent. Oh and by the way, my predictions aren't 100% correct."
Question: Do you make any different decisions in the transparent box case?
If so, what was there ...
Box B is already empty or already full [and will remain the same after I've picked it]
Do I have to believe that statement is completely and utterly true for this to be a meaningful exercise? It seems to me that I should treat that as dubious.
It seems to me that Omega is achieving a high rate of success by some unknown good method. If I believe Omega's method is a hard-to-detect remote-controlled money vaporisation process then clearly I should one-box.
A super intelligence has many ways to get the results it wants.
I am inclined to think that I don't kno...
This reminds me eerily of the Calvinist doctrine of predestination. The money is already there, and making fun of me for two-boxing ain't gonna change anything.
A question - how could Omega be a perfect predictor, if I in fact have a third option - namely leaving without taking either box? This possibility would, in any real-life situation, lead me to two-box. I know this and accept it.
Then there's always the economic argument: If $1000 is a sum of money that matters a great deal to me, I'm two-boxing. Otherwise, I'd prefer to one-box.
So, I'm sure this isn't an original thought, but there are a lot of comments, and my utility function is rolling its eyes at the thought of going through them all to see whether this comment is redundant, compared to just writing the comment, given that I want to sort my thoughts out verbally anyway.
I think the standard form of the question should be changed to the one with the asteroid. Total destruction is total destruction, but money is only worth a) what you can buy with it and b) the effort it takes to earn it.
I can earn $1000 in a month. Some people could earn...
I'd just like to note that as with most of the rationality material in Eliezer's sequences, the position in this post is a pretty common mainstream position among cognitive scientists. E.g. here is Jonathan Baron on page 61 of his popular textbook Thinking and Deciding:
...the best kind of thinking, which we shall call rational thinking, is whatever kind of thinking best helps people achieve their goals. If it should turn out that following the rules of formal logic leads to eternal happiness, then it is rational thinking to follow the laws of logic (assumin
It seems to me that if you make a basic Bayes net with utilities at the end, the choice with the higher expected utility is to one-box.
Say:
P(1,000,000 in box B and 10,000 in box A | I one-box) = 99%
P(box B is empty and 10,000 in box A | I two-box) = 99%
hence
P(box B is empty and 10,000 in box A | I one-box) = 1%
P(1,000,000 in box B and 10,000 in box A | I two-box) = 1%
So
If I one-box I should expect 0.99 × 1,000,000 + 0.01 × 0 = 990,000
If I two-box I should expect 0.99 × 10,000 + 0.01 × 1,010,000 = 20,000
Expected utility(I one box)/Expected utility(I two box) = 49.5, so I should one bo...
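A quick check of that arithmetic (note that this commenter puts $10,000 rather than the usual $1,000 in the transparent box; the conclusion is unchanged):

```python
p_correct = 0.99
eu_one_box = p_correct * 1_000_000 + (1 - p_correct) * 0        # 990,000
eu_two_box = p_correct * 10_000 + (1 - p_correct) * 1_010_000   # 20,000
print(eu_one_box, eu_two_box, eu_one_box / eu_two_box)          # ratio 49.5
```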
I think you went wrong when you said:
...Next, let's turn to the charge that Omega favors irrationalists. I can conceive of a superbeing who rewards only people born with a particular gene, regardless of their choices. I can conceive of a superbeing who rewards people whose brains inscribe the particular algorithm of "Describe your options in English and choose the last option when ordered alphabetically," but who does not reward anyone who chooses the same option for a different reason. But Omega rewards people who choose to tak
Sorry, I'm new here; I am having trouble with the idea that anyone would consider taking both boxes in a real-world situation. How would this puzzle be modeled differently, versus how would it look different, if it were Penn and Teller flying Omega?
If Penn and Teller were flying Omega then they would have been able to produce exactly the same results as seen, without violating causality or time travelling or perfectly predicting people by just cheating and emptying the box after you choose to take both.
Given that "it's cheating" is a significant...
I've always been a one-boxer. I think I have a new solution as to why. Try this: Scenario A: you will take a sleep potion and be woken up twice during the middle of the night to be asked to take one box or both boxes. Whatever you do the first time determines whether $1m is placed in the potentially-empty box. Whatever you do the second time determines what you collect. The catch is that the sleep potion will wipe all your memories over the next twelve hours. You're told this in advance and asked to make up your mind. So you'll give the same answer each ti...
The solution to this problem is simple and, in my eyes, pretty obvious. Your decision isn't changing the past, it's simply that the choice of Omega and your decision have the same cause. Assuming Omega emulates your mind under the conditions of when you're making the choice, then the cause of the prediction and the cause of your choice are the same (the original state of your mind is the cause). So choosing B is the rational choice. And anyways, no matter what method of prediction Omega uses, the cause of his prediction will always be the same as the cause...
It seems to me that no rationalist should accept the 'givens' in this scenario without a lot of evidence.
So what am I left with? Some being who hands out boxes, and 100 examples of people who open one box and get $1M or open both boxes and get $1k. I am unwilling to accept on faith a super-intelligent alien, so I will make the simplifying assumption that the being is in fact Penn & Teller. In which case, the question simplifies to "Am I willing to bet at 1000:1 odds that Penn & Teller aren't able to make a box which vanishes $1M if I choose ...
"You shouldn't find yourself distinguishing the winning choice from the reasonable choice."
I disagree. Let's say there's box A with $1000 in it, and box B with $10,000 in it 1% of the time, and you can only pick one. If I pick A and my friend picks B, and they get the $10,000, they might say to me that I should wish I was like them. But I'll defend my choice as reasonable, even though it wasn't the winning choice that time.
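For what it's worth, the expected values in that example back up treating A as the reasonable pick (a trivial sketch, with the commenter's numbers):

```python
ev_a = 1_000          # box A pays $1000 with certainty
ev_b = 0.01 * 10_000  # box B pays $10,000 one time in a hundred, i.e. $100 on average
print(ev_a > ev_b)    # True: the "reasonable" choice also maximizes expected money
```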
I'm confused about why this problem is different from other decision problems.
Given the problem statement, this is not an acausal situation. No physics is being disobeyed - Kramers-Kronig still works, relativity still works. It's completely reasonable that my choice could be predicted from my source code. Why isn't this just another example of prior information being appropriately applied to a decision?
Am I dodging the question? Does EY's new decision theory account for truly acausal situations? If I based my decision on the result of, say, a radioactive decay experiment performed after Omega left, could I still optimize?
I've been fiddling around with this in my head. I arrived at this argument for one-boxing: Let us suppose a Rule, that we shall call W: FAITHFULLY FOLLOW THE RULE THAT, IF FOLLOWED FAITHFULLY, WILL ALWAYS OFFER THE GREATEST CHANCE OF THE GREATEST UTILITY To prove W one boxes, let us list all logical possibilities, which we'll call W1 W2 and W3: W1 always one-boxing W2 always two boxing, and W3 sometimes one-boxing and sometimes two boxing. Otherwise, all of these rules are identical in every way, and identical to W in every way. Imagining that we're Omega...
I hope I'm not being redundant, but... The common argument I've seen is that it must be backward causation if one boxing predictably comes out with more money than two boxing.
Why can't it just be that Omega is really, really good at cognitive psychology, has a complete map of your brain, and is able to use that to predict your decision so well that the odds of Omega's prediction being wrong are epsilon? This just seemed... well, obvious to me. But most people arguing "backward causation!" seem to be smarter than me.
The possibilities I see are eit...
Now perhaps I am misunderstanding the problem. Are we to assume that all this is foreknowledge?
Given the information present in this article I would just choose to take only B. But that is assuming that Omega is never wrong. Logic in my own mind dictates that regardless of why I chose B, or whether I at some earlier point may have two-boxed, at this time I choose box B, and if Omega's prediction is never wrong, then if I choose B, B will contain a million dollars.
Now in an alternate iteration of this dilemma, regardless of the truth (whether Omega is indeed n...
It's comforting sometimes to read from someone else that rationality is not the loser's way, and arguably more so for the Prisoner's Dilemma than Newcomb's if you consider the current state of our planet and the tragedy of the commons.
I'm writing this because I believe I succeeded in writing a computer program (it is so simple I can't call it an AI) able to actually simulate Omega in a Newcomb game. What I describe below may look like an iterated Newcomb's problem. But I claim it is not so and will explain why.
When using my program the human player will actually b...
It seems to me that the entire discussion is confused. Many people seem to be using the claim that Omega can't predict your actions to make claims about what actions to take in the hypothetical world where it can. Accepting the assumption that Omega can predict your actions, the problem seems to be a trivial calculation of expected utility:
If the opaque box contains b1 utility and the transparent one b2 utility, and Omega has probability e1 of falsely predicting you'll one-box and probability e2 of falsely predicting you'll two-box, the expected utilities are
one-box: (1 - e2) × b1
two-box: e1 × b1 + b2
And you should one-box unless b2 is bigger than (1 - e2 - e1) × b1.
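A small numerical check of that rule, with illustrative values for b1, b2, e1 and e2:

```python
def ev_one_box(b1, e2):
    # You get b1 unless Omega falsely predicted you'd two-box.
    return (1 - e2) * b1

def ev_two_box(b1, b2, e1):
    # You always get b2, plus b1 only if Omega falsely predicted you'd one-box.
    return e1 * b1 + b2

b1, b2, e1, e2 = 1_000_000, 1_000, 0.01, 0.01
print(ev_one_box(b1, e2))           # 990000.0
print(ev_two_box(b1, b2, e1))       # 11000.0
print(b2 > (1 - e2 - e1) * b1)      # False, so one-box
```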
I choose Box B. This is because I take into account that Omega is a superintelligence with a success rate of 100% and no margin of error, and is the one offering the problem to me. The only logical reason for this is an ability to predict variables that I have no current understanding of. This is either through an ability to analyze my psyche and see my tendency to trust in things with 100% success rates, the ability to foresee my decision in time, or the ability for Omega to affect things backwards in time. Omega has not provided any reasoning for its 100%...
Seems like a simple and reasonable answer to this problem is that I would take the box with the million dollars, rather than the box with the thousand dollars and the empty box. It seems the main question is, "But why?". So here is my reasoning: Omega has shown over 99% accuracy in providing results dependent on people's choices. Box B has 100,000% better rewards than Box A, such that if there is even 0.1% chance that taking Box A will lose those rewards, it is irrational to also take Box A. As I have seen no evidence that Omega has left, it is n...
I'm not sure if anyone's noticed this, but how do you know that you're not a simulation of yourself inside Omega? If he is superintelligent, he would compute your decision by simulating you, and you and your simulation will be indistinguishable.
This is fairly obviously a PD against said simulation -- if you cooperate in PD, you should one-box.
I personally am not sure, although if I had to decide I'd probably one-box
I suspect that this is very simple. Similar to the tree in the forest problem that Eliezer wrote about, if you ask about concrete variations of this question, the right choice is obvious.
One question is what to do when the boxes are in front of you.
Thinking about this in terms of AGI, would it be reasonable to suggest that a bias must be created in favor of utilizing inductive reasoning through Bayes' Theorem rather than deductive reasoning when and if the two conflict?
Maybe I'm missing something (I'm new to Bayes), but I honestly don't see how any of this is actually a problem. I may just be repeating Yudkowsky's point, but... Omega is a superintelligence, who is right in every known prediction. This means, essentially, that he looks at you and decides what you'll do, and he's right 100 out of 100 times. So far, a perfect rate. He's probably not going to mess up on you. If you're not trying to look at this with CDT, the answer is obvious: take box B. Omega knows you'll do that and you'll get the million. It's not about the result changing after the boxes are put down, it's about predictions about a person.
Oddly, this problem seems (to my philosopher/engineer mind) to have an exceedingly non-complex solution, and it depends not upon the chooser but upon Omega.
Here's the payout schema assumed by the two-boxer, for reference:
1) Both boxes predicted, both boxes picked: +$1,000
2) Both boxes predicted, only B picked: $0
3) Only B predicted, both boxes picked: +$1,001,000
4) Only B predicted, only B picked: +$1,000,000
Omega, being an unknowable superintelligence, qualifies as a force of nature from our current level of human understanding. Since...
As I understand it, most types of decision theory (including game theory) assume that all agents have about the same intelligence and that this intelligence is effectively infinite (or at least large enough so everyone has a complete understanding of the mathematical implications of the relevant utility functions).
In Newcomb's problem, one of the players is explicitly defined as vastly more intelligent than the other.
In any situation where someone might be really good at predicting your thought processes, it's best to add some randomness to your actions. Th...
Here is my answer to Newcomb's problem:
Omega doesn't exist in reality. Therefore Newcomb's problem is irrelevant and I don't waste time thinking about it.
I wonder how many people come up with this answer. Most of them are probably smarter than me and also don't waste time commenting their opinion.
Am I missing something?
The way I see it, causal decision theory simply ignores a part of the problem: that the Predictor is able to "predict".
Evidence should get inside the equation, but not the same way as evidential decision theory: the evidence is what should fuel the hypothesis "The Predictor predicts our choices".
It does not matter if we "think" that our "choice" shouldn't change what's inside the boxes - as the main thing about a prediction is that we aren't actually making any "choice", that "choice" is already ...
It seems like the 'rational' two-boxers are falling prey to the concept of belief in belief. They think that because they believe that they are people who would choose both boxes, then it doesn't matter what they choose; box B is already empty, so they may as well take both. If you have all the information (except for what is in box B), then choosing both is the irrational option and the 'rational' people are rationalizing. You've just seen someone (or something) materialize two boxes from thin air, tell you they know which option you'll choose (and have ev...
Rational agents should WIN.
This reminds me of these great new US Army ads: https://youtu.be/jz3e2_CyOi8
It feels like decision theory is subject to the halting problem. Sketching some rough thoughts.
Consider your particular decision theory as a black-box function or set of rules F which takes the description of a situation P and outputs yes or no; one of those answers wins, the other loses.
F(P)
You want a decision theory, some set of rules F to follow, which wins in all situations.
But for any F it's possible to construct a situation P, "the winning answer is !F(P)", feeding F into itself (or a simplified equivalent).
No matter what set of rules you include in your decision theory it cannot win in all cases. Ever.
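A rough sketch of that diagonalization, under the simplifying assumption that a decision theory is just a function from a situation to a yes/no answer (everything named here is illustrative, not anyone's actual formalism):

```python
def make_adversarial_situation(F):
    """Build a situation whose winning answer is defined to be the opposite of
    whatever the decision procedure F outputs when handed this very situation."""
    def situation():
        return not F(situation)  # the "winning" answer for this situation
    return situation

# Example: a decision theory that always answers True loses here by construction.
always_true = lambda situation: True
P = make_adversarial_situation(always_true)
print(always_true(P) == P())  # False: F's answer never matches the winning answer
```

If F instead tried to simulate the situation that is simulating it, the call would never terminate, which is the halting-problem flavour the comment seems to be gesturing at.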
I two-box.
Three days later, "Omega" appears in the sky and makes an announcement. "Greeting earthlings. I am sorry to say that I have lied to you. I am actually Alpha, a galactic superintelligence who hates that Omega asshole. I came to predict your species' reaction to my arch-nemesis Omega and I must say that I am disappointed. So many of you chose the obviously-irrational single-box strategy that I must decree your species unworthy of this universe. Goodbye."
Giant laser beam then obliterates earth. I die wishing I'd done more ...
I would play the lotto: if I win more than $10M, I take the black box and leave. Otherwise I'd look in the black box: if it is full, I also take the small one. If not, I leave with just the empty black box. As this should be inconsistent, assuming a time-travelling Omega, it would either make him not choose me for his experiment or let me win for sure (assuming time works in similar ways as in HPMOR). If I get nothing, it would prove Omega wrong (and tell me quite a bit about how Omega (and time) works). If his prediction was correct though, I win $11,000,000, which is way better than either 'standard' variant.
What if there was an asteroid rushing toward Earth, and box A contained an asteroid deflector that worked 10% of the time, and box B might contain an asteroid deflector that worked 100% of the time?
I'd change that to 95%, because if B contains a 100% deflector, A adds nothing and there's no dilemma.
Do write the PhD thesis and get the PhD whose lack makes you complain a bit too often)))
On a more serious note - the same thing Musashi says is all too often said about chess (always think about how to deliver checkmate). And in both cases it seems to be a heuristic at best. We do not have the chess programming the best chess-playing computers have (nor the fencing programming). And we do seem to be able to think about next steps better than the steps after them. So it seems plausible that sometimes we are to forget the enemy king/body and defend our own, for we, being imperfect, will lose ours otherwise well before getting to the enemy.
This post should be updated to link to Functional Decision Theory, now that it has been written up.
I think a major determinant of the choice here depends on whether or not the "chooser" knows about the previous results. If you know that in the previous scenarios, people who choose only one box win, then by all means, choosing only one box is the rational decision.
If you don't have this prior information, then choosing both boxes seems more rational.
Slightly modified version:
Instead of choosing at once whether you want to take one box or both boxes, you first take box 1 (and see whether it contains $0 or $1,000,000), and then you decide whether you want to also take box 2.
Assume that you only care about the money, you don't care about doing the opposite of what Omega predicted.
I can't claim to be particularly versed in the debates about Newcomb’s paradox, so I might be wrong here, but it seems to me like you got Joyce’s argument precisely backwards. His entire point seems to be that Rachel and Irene are in fact not facing the same options.
Irene has the options
Rachel has the options
This is the way I think about it:
Given how good Omega is at predicting people's decisions, I should assume that a world where I choose to take both boxes cannot coincide with a world where Omega predicted I would only take one box. In other words, the payoff matrix that creates this paradox in the first place is an illusion, because the scenario in which you two-box and get $1,001,000 simply doesn't exist. Or at the very least, it is so unlikely to exist that you should behave as though it doesn't.
Apologies if this argument has been made before - I’ve had a quick scan through the comments and can’t see it so here goes: The rational choice is to one-box. The two-boxers are throwing away a critical piece of evidence: in 100 cases out of 100 so far, one-boxing is the right strategy. Therefore, based upon the observable evidence, there’s a less than 1% chance of two-boxing being the correct strategy. It’s irrational to argue that you should two-box. This argument maps on to the real world. In the real world you are never certain about the mechanism behi...
The following may well be the most controversial dilemma in the history of decision theory:
And the standard philosophical conversation runs thusly:
There is a large literature on the topic of Newcomblike problems - especially if you consider the Prisoner's Dilemma as a special case, which it is generally held to be. "Paradoxes of Rationality and Cooperation" is an edited volume that includes Newcomb's original essay. For those who read only online material, this PhD thesis summarizes the major standard positions.
I'm not going to go into the whole literature, but the dominant consensus in modern decision theory is that one should two-box, and Omega is just rewarding agents with irrational dispositions. This dominant view goes by the name of "causal decision theory".
As you know, the primary reason I'm blogging is that I am an incredibly slow writer when I try to work in any other format. So I'm not going to try to present my own analysis here. Way too long a story, even by my standards.
But it is agreed even among causal decision theorists that if you have the power to precommit yourself to take one box, in Newcomb's Problem, then you should do so. If you can precommit yourself before Omega examines you, then you are directly causing box B to be filled.
Now in my field - which, in case you have forgotten, is self-modifying AI - this works out to saying that if you build an AI that two-boxes on Newcomb's Problem, it will self-modify to one-box on Newcomb's Problem, if the AI considers in advance that it might face such a situation. Agents with free access to their own source code have access to a cheap method of precommitment.
What if you expect that you might, in general, face a Newcomblike problem, without knowing the exact form of the problem? Then you would have to modify yourself into a sort of agent whose disposition was such that it would generally receive high rewards on Newcomblike problems.
But what does an agent with a disposition generally-well-suited to Newcomblike problems look like? Can this be formally specified?
Yes, but when I tried to write it up, I realized that I was starting to write a small book. And it wasn't the most important book I had to write, so I shelved it. My slow writing speed really is the bane of my existence. The theory I worked out seems, to me, to have many nice properties besides being well-suited to Newcomblike problems. It would make a nice PhD thesis, if I could get someone to accept it as my PhD thesis. But that's pretty much what it would take to make me unshelve the project. Otherwise I can't justify the time expenditure, not at the speed I currently write books.
I say all this, because there's a common attitude that "Verbal arguments for one-boxing are easy to come by, what's hard is developing a good decision theory that one-boxes" - coherent math which one-boxes on Newcomb's Problem without producing absurd results elsewhere. So I do understand that, and I did set out to develop such a theory, but my writing speed on big papers is so slow that I can't publish it. Believe it or not, it's true.
Nonetheless, I would like to present some of my motivations on Newcomb's Problem - the reasons I felt impelled to seek a new theory - because they illustrate my source-attitudes toward rationality. Even if I can't present the theory that these motivations motivate...
First, foremost, fundamentally, above all else:
Rational agents should WIN.
Don't mistake me, and think that I'm talking about the Hollywood Rationality stereotype that rationalists should be selfish or shortsighted. If your utility function has a term in it for others, then win their happiness. If your utility function has a term in it for a million years hence, then win the eon.
But at any rate, WIN. Don't lose reasonably, WIN.
Now there are defenders of causal decision theory who argue that the two-boxers are doing their best to win, and cannot help it if they have been cursed by a Predictor who favors irrationalists. I will talk about this defense in a moment. But first, I want to draw a distinction between causal decision theorists who believe that two-boxers are genuinely doing their best to win; versus someone who thinks that two-boxing is the reasonable or the rational thing to do, but that the reasonable move just happens to predictably lose, in this case. There are a lot of people out there who think that rationality predictably loses on various problems - that, too, is part of the Hollywood Rationality stereotype, that Kirk is predictably superior to Spock.
Next, let's turn to the charge that Omega favors irrationalists. I can conceive of a superbeing who rewards only people born with a particular gene, regardless of their choices. I can conceive of a superbeing who rewards people whose brains inscribe the particular algorithm of "Describe your options in English and choose the last option when ordered alphabetically," but who does not reward anyone who chooses the same option for a different reason. But Omega rewards people who choose to take only box B, regardless of which algorithm they use to arrive at this decision, and this is why I don't buy the charge that Omega is rewarding the irrational. Omega doesn't care whether or not you follow some particular ritual of cognition; Omega only cares about your predicted decision.
We can choose whatever reasoning algorithm we like, and will be rewarded or punished only according to that algorithm's choices, with no other dependency - Omega just cares where we go, not how we got there.
It is precisely the notion that Nature does not care about our algorithm, which frees us up to pursue the winning Way - without attachment to any particular ritual of cognition, apart from our belief that it wins. Every rule is up for grabs, except the rule of winning.
As Miyamoto Musashi said - it's really worth repeating:
(Another example: It was argued by McGee that we must adopt bounded utility functions or be subject to "Dutch books" over infinite times. But: The utility function is not up for grabs. I love life without limit or upper bound: There is no finite amount of life lived N where I would prefer an 80.0001% probability of living N years to a 0.0001% chance of living a googolplex years and an 80% chance of living forever. This is a sufficient condition to imply that my utility function is unbounded. So I just have to figure out how to optimize for that morality. You can't tell me, first, that above all I must conform to a particular ritual of cognition, and then that, if I conform to that ritual, I must change my morality to avoid being Dutch-booked. Toss out the losing ritual; don't change the definition of winning. That's like deciding to prefer $1000 to $1,000,000 so that Newcomb's Problem doesn't make your preferred ritual of cognition look bad.)
"But," says the causal decision theorist, "to take only one box, you must somehow believe that your choice can affect whether box B is empty or full - and that's unreasonable! Omega has already left! It's physically impossible!"
Unreasonable? I am a rationalist: what do I care about being unreasonable? I don't have to conform to a particular ritual of cognition. I don't have to take only box B because I believe my choice affects the box, even though Omega has already left. I can just... take only box B.
I do have a proposed alternative ritual of cognition which computes this decision, which this margin is too small to contain; but I shouldn't need to show this to you. The point is not to have an elegant theory of winning - the point is to win; elegance is a side effect.
Or to look at it another way: Rather than starting with a concept of what is the reasonable decision, and then asking whether "reasonable" agents leave with a lot of money, start by looking at the agents who leave with a lot of money, develop a theory of which agents tend to leave with the most money, and from this theory, try to figure out what is "reasonable". "Reasonable" may just refer to decisions in conformance with our current ritual of cognition - what else would determine whether something seems "reasonable" or not?
From James Joyce (no relation), Foundations of Causal Decision Theory:
It is, I would say, a general principle of rationality - indeed, part of how I define rationality - that you never end up envying someone else's mere choices. You might envy someone their genes, if Omega rewards genes, or if the genes give you a generally happier disposition. But Rachel, above, envies Irene her choice, and only her choice, irrespective of what algorithm Irene used to make it. Rachel wishes just that she had a disposition to choose differently.
You shouldn't claim to be more rational than someone and simultaneously envy them their choice - only their choice. Just do the act you envy.
I keep trying to say that rationality is the winning-Way, but causal decision theorists insist that taking both boxes is what really wins, because you can't possibly do better by leaving $1000 on the table... even though the single-boxers leave the experiment with more money. Be careful of this sort of argument, any time you find yourself defining the "winner" as someone other than the agent who is currently smiling from on top of a giant heap of utility.
Yes, there are various thought experiments in which some agents start out with an advantage - but if the task is to, say, decide whether to jump off a cliff, you want to be careful not to define cliff-refraining agents as having an unfair prior advantage over cliff-jumping agents, by virtue of their unfair refusal to jump off cliffs. At this point you have covertly redefined "winning" as conformance to a particular ritual of cognition. Pay attention to the money!
Or here's another way of looking at it: Faced with Newcomb's Problem, would you want to look really hard for a reason to believe that it was perfectly reasonable and rational to take only box B; because, if such a line of argument existed, you would take only box B and find it full of money? Would you spend an extra hour thinking it through, if you were confident that, at the end of the hour, you would be able to convince yourself that box B was the rational choice? This too is a rather odd position to be in. Ordinarily, the work of rationality goes into figuring out which choice is the best - not finding a reason to believe that a particular choice is the best.
Maybe it's too easy to say that you "ought to" two-box on Newcomb's Problem, that this is the "reasonable" thing to do, so long as the money isn't actually in front of you. Maybe you're just numb to philosophical dilemmas, at this point. What if your daughter had a 90% fatal disease, and box A contained a serum with a 20% chance of curing her, and box B might contain a serum with a 95% chance of curing her? What if there was an asteroid rushing toward Earth, and box A contained an asteroid deflector that worked 10% of the time, and box B might contain an asteroid deflector that worked 100% of the time?
Would you, at that point, find yourself tempted to make an unreasonable choice?
If the stake in box B was something you could not leave behind? Something overwhelmingly more important to you than being reasonable? If you absolutely had to win - really win, not just be defined as winning?
Would you wish with all your power that the "reasonable" decision was to take only box B?
Then maybe it's time to update your definition of reasonableness.
Alleged rationalists should not find themselves envying the mere decisions of alleged nonrationalists, because your decision can be whatever you like. When you find yourself in a position like this, you shouldn't chide the other person for failing to conform to your concepts of reasonableness. You should realize you got the Way wrong.
So, too, if you ever find yourself keeping separate track of the "reasonable" belief, versus the belief that seems likely to be actually true. Either you have misunderstood reasonableness, or your second intuition is just wrong.
Now one can't simultaneously define "rationality" as the winning Way, and define "rationality" as Bayesian probability theory and decision theory. But it is the argument that I am putting forth, and the moral of my advice to Trust In Bayes, that the laws governing winning have indeed proven to be math. If it ever turns out that Bayes fails - receives systematically lower rewards on some problem, relative to a superior alternative, in virtue of its mere decisions - then Bayes has to go out the window. "Rationality" is just the label I use for my beliefs about the winning Way - the Way of the agent smiling from on top of the giant heap of utility. Currently, that label refers to Bayescraft.
I realize that this is not a knockdown criticism of causal decision theory - that would take the actual book and/or PhD thesis - but I hope it illustrates some of my underlying attitude toward this notion of "rationality".
You shouldn't find yourself distinguishing the winning choice from the reasonable choice. Nor should you find yourself distinguishing the reasonable belief from the belief that is most likely to be true.
That is why I use the word "rational" to denote my beliefs about accuracy and winning - not to denote verbal reasoning, or strategies which yield certain success, or that which is logically provable, or that which is publicly demonstrable, or that which is reasonable.
As Miyamoto Musashi said: