Alicorn comments on Newcomb's Problem and Regret of Rationality - Less Wrong

64 Post author: Eliezer_Yudkowsky 31 January 2008 07:36PM

Comments (588)

Comment author: TobyBartels 22 July 2010 06:00:21AM *  1 point [-]

I'm not reading 127 comments, but as a newcomer who's been invited to read this page, along with barely a dozen others, as an introduction, I don't want to leave this unanswered, even though what I have to say has probably already been said.

First of all, the answer to Newcomb's Problem depends a lot on precisely what the problem is. I have seen versions that posit time travel, and therefore backwards causality. In that case, it's quite reasonable to take only one box, because your decision to do so does have a causal effect on the amount in Box B. Presumably causal decision theorists would agree.

However, in any version of the problem where there is no clear evidence of violations of currently known physics and where the money has been placed by Omega before my decision, I am a two-boxer. Yet I think that your post above must not be talking about the same problem that I am thinking of, especially at the end. Although you never said so, it seems to me that you must be talking about a problem which says "If you choose Box B, then it will have a million dollars; if you choose both boxes, then Box B will be empty.". But that is simply not what the facts will be if Omega has made the decision in the past and currently understood physics applies. In the problem as stated, Omega may make mistakes in the future, and that makes all the difference.

It's presumptuous of me to assume that you're talking about a different problem from the one that you stated, I know. But as I read the psychological states that you suggest that I might have —that I might wish that I considered one-boxing rational, for example—, they seem utterly insane. Why would I wish such a thing? What does it have to do with anything? The only thing that I can wish for is that Omega has predicted that I will be a one-boxer, which has nothing to do with what I consider rational now.

The quotation from Joyce explains it well, up until the end, where poor phrasing may have confused you. The last sentence should read:

When Rachel wishes she was Irene's type she is wishing for Irene's circumstances, not wishing to make Irene's choice.

It is simply not true that Rachel envies Irene's choice. Rachel envies Irene's situation, the situation where there is a million dollars in Box B. And if Rachel were in that situation, then she would still take both boxes! (At least if I understand Joyce correctly.)

Possibly one thing that distinguishes me from one-boxers, and maybe even most two-boxers, is that I understand fundamental physics rather thoroughly and my prior has a very strong presumption against backwards causality. The mere fact that Omega has made successful predictions about Newcomb's Paradox will never be enough to overrule that. Even being superintelligent and coming from another galaxy is not enough, although things change if Omega (known to be superintelligent and honest) claims to be a time-traveller. Perhaps for some one-boxers, and even for some irrational two-boxers, Omega's past success at prediction is good evidence for backwards causality, but not for me.

So suppose that somebody puts two boxes down before me, presents convincing evidence for the situation as you stated it above (but no more), and goes away. Then I will simply take all of the money that this person has given me: both boxes. Before I open them, I will hope that they predicted that I will choose only one. After I open them, if I find Box B empty, then I will wish that they had predicted that I would choose only one. But I will not wish that I had chosen only one. And I certainly will not hope, beforehand, that I will choose only one and yet nevertheless choose two; that would indeed be irrational!

Comment author: Alicorn 22 July 2010 06:08:14AM 10 points [-]

You are disposed to take two boxes. Omega can tell. (Perhaps by reading your comment. Heck, I can tell by reading your comment, and I'm not even a superintelligence.) Omega will therefore not put a million dollars in Box B if it sets you a Newcomb's problem, because its decision to do so depends on whether you are disposed to take both boxes or not, and you are.

I am disposed to take one box. Omega can tell. (Perhaps by reading this comment. I bet you can tell by reading my comment, and I also bet that you're not a superintelligence.) Omega will therefore put a million dollars in Box B if it sets me a Newcomb's problem, because its decision to do so depends on whether I am disposed to take both boxes or not, and I'm not.

If we both get pairs of boxes to choose from, I will get a million dollars. You will get a thousand dollars. I will be monetarily better off than you.

But wait! You can fix this. All you have to do is be disposed to take just Box B. You can do this right now; there's no reason to wait until Omega turns up. Omega does not care why you are so disposed, only that you are so disposed. You can mutter to yourself all you like about how silly the problem is; as long as you wander off with just B under your arm, it will tend to be the case that you end the day a millionaire.
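
A minimal sketch of the payoff comparison above, assuming Omega reads dispositions perfectly and that the other box (Box A) always holds $1,000. The names `payoff`, `BOX_A` and `BOX_B` are illustrative, not from the thread.

```python
# Standard Newcomb payoffs as described in the thread.
# Assumption: Omega's prediction always matches the agent's disposition.
BOX_A = 1_000        # always present
BOX_B = 1_000_000    # present only if Omega predicted one-boxing


def payoff(disposition: str) -> int:
    """Winnings for an agent whose disposition Omega has read correctly."""
    if disposition not in ("one-box", "two-box"):
        raise ValueError(f"unknown disposition: {disposition!r}")
    # Omega fills Box B only for agents disposed to one-box.
    box_b_contents = BOX_B if disposition == "one-box" else 0
    if disposition == "one-box":
        return box_b_contents
    return BOX_A + box_b_contents


if __name__ == "__main__":
    for d in ("one-box", "two-box"):
        print(d, payoff(d))
```

Under these assumptions the disposition alone fixes the outcome, which is the point of the comment: the reasons behind the disposition never enter the calculation.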

Comment author: cousin_it 22 July 2010 06:58:56AM *  6 points [-]

Some time ago I figured out a refutation of this kind of reasoning in Counterfactual Mugging, and it seems to apply in Newcomb's Problem too. It goes as follows:

Imagine another god, Upsilon, that offers you a similar two-box setup - except to get the $2M in the box B, you must be a one-boxer with regard to Upsilon and a two-boxer with regard to Omega. (Upsilon predicts your counterfactual behavior if you'd met Omega instead.) Now you must choose your dispositions wisely because you can't win money from both gods. The right disposition depends on your priors for encountering Omega or Upsilon, which is a "bead jar guess" because both gods are very improbable. In other words, to win in such problems, you can't just look at each problem individually as it arises - you need to have the correct prior/predisposition over all possible predictors of your actions, before you actually meet any of them. Obtaining such a prior is difficult, so I don't really know what I'm predisposed to do in Newcomb's Problem if I'm faced with it someday.
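
The dependence on priors can be sketched numerically. This is only an illustration under stated assumptions: both gods predict perfectly, Upsilon's second box holds $1,000 like Omega's (the comment says only "similar"), and the probabilities passed in are made-up inputs rather than estimates anyone in the thread proposed. All function names are hypothetical.

```python
from itertools import product

ACTIONS = ("one-box", "two-box")


def omega_payoff(vs_omega: str, vs_upsilon: str) -> int:
    # Omega fills Box B ($1M) only for agents disposed to one-box with Omega.
    return 1_000_000 if vs_omega == "one-box" else 1_000


def upsilon_payoff(vs_omega: str, vs_upsilon: str) -> int:
    # Upsilon fills Box B ($2M) only for agents who one-box with Upsilon
    # AND would have two-boxed with Omega.
    filled = vs_upsilon == "one-box" and vs_omega == "two-box"
    box_b = 2_000_000 if filled else 0
    # Assumption: Upsilon's other box holds $1,000, as in Omega's setup.
    return box_b if vs_upsilon == "one-box" else 1_000 + box_b


def expected_payoff(disposition, p_omega: float, p_upsilon: float) -> float:
    return (p_omega * omega_payoff(*disposition)
            + p_upsilon * upsilon_payoff(*disposition))


def best_disposition(p_omega: float, p_upsilon: float):
    """Return the (vs_omega, vs_upsilon) pair with the highest expected payoff."""
    return max(product(ACTIONS, ACTIONS),
               key=lambda d: expected_payoff(d, p_omega, p_upsilon))


if __name__ == "__main__":
    print(best_disposition(0.5, 0.5))
    print(best_disposition(0.9, 0.1))
```

With equal weight on both gods the sketch favours two-boxing against Omega; once Omega is much more likely than Upsilon, one-boxing against Omega wins again. Which regime applies is exactly the "bead jar guess" the comment describes.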

Comment author: Alicorn 22 July 2010 07:08:04AM 0 points [-]

Something seems off about this, but I'm not sure what.

Comment author: cousin_it 22 July 2010 07:10:30AM *  0 points [-]

I'm pretty sure the logic is correct. I do make silly math mistakes sometimes, but I've tested this one on Vladimir Nesov and he agrees. No comment from Eliezer yet (this scenario was first posted to decision-theory-workshop).

Comment author: Alicorn 22 July 2010 07:11:52AM 1 point [-]

It reminds me vaguely of Pascal's Wager, but my cached responses thereunto are not translating informatively.

Comment author: cousin_it 22 July 2010 07:14:45AM *  1 point [-]

Then I think the original Newcomb's Problem should remind you of Pascal's Wager just as much, and my scenario should be analogous to the refutation thereof. (Thereunto? :-)

Comment author: Vladimir_Nesov 22 July 2010 07:17:55AM *  4 points [-]

This is not a refutation, because what you describe is not about the thought experiment. In the thought experiment, there are no Upsilons, and so nothing to worry about. It is if you face this scenario in real life, where you can't be given guarantees about the absence of Upsilons, that your reasoning becomes valid. But it doesn't refute the reasoning about the thought experiment where it's postulated that there are no Upsilons.

(Original thread, my discussion.)

Comment author: cousin_it 22 July 2010 07:35:46AM *  0 points [-]

Thanks for dropping the links here. FWIW, I agree with your objection. But at the very least, the people claiming they're "one-boxers" should also make the distinction you make.

Also, user Nisan tried to argue that various Upsilons and other fauna must balance themselves out if we use the universal prior. We eventually took this argument to email, but failed to move each other's positions.

Comment author: Vladimir_Nesov 22 July 2010 07:39:07AM *  0 points [-]

Just didn't want you confusing people or misrepresenting my opinion, so made everything clear. :-)

Comment author: toto 22 July 2010 09:16:49AM 0 points [-]

OK. I assume the usual (Omega and Upsilon are both reliable and sincere, I can reliably distinguish one from the other, etc.)

Then I can't see how the game doesn't reduce to standard Newcomb, modulo a simple probability calculation, mostly based on "when I encounter one of them, what's my probability of meeting the other during my lifetime?" (plus various "actuarial" calculations).

If I have no information about the probability of encountering either, then my decision may be incorrect - but there's nothing paradoxical or surprising about this, it's just a normal, "boring" example of an incomplete information problem.

you need to have the correct prior/predisposition over all possible predictors of your actions, before you actually meet any of them.

I can't see why that is - again, assuming that the full problem is explained to you on encountering either Upsilon or Omega, both are truthful, etc. Why can I not perform the appropriate calculations and make an expectation-maximising decision even after Upsilon-Omega has left? Surely Omega-Upsilon can predict that I'm going to do just that and act accordingly, right?

Comment author: cousin_it 22 July 2010 09:22:20AM *  0 points [-]

Yes, this is a standard incomplete information problem. Yes, you can do the calculations at any convenient time, not necessarily before meeting Omega. (These calculations can't use the information that Omega exists, though.) No, it isn't quite as simple as you state: when you meet Omega, you have to calculate the counterfactual probability of you having met Upsilon instead, and so on.

Comment author: Eliezer_Yudkowsky 23 July 2010 12:16:06AM 9 points [-]

Omega lets me decide to take only one box after meeting Omega, when I have already updated on the fact that Omega exists, and so I have much better knowledge about which sort of god I'm likely to encounter. Upsilon treats me on the basis of a guess I would subjunctively make without knowledge of Upsilon. It is therefore not surprising that I tend to do much better with Omega than with Upsilon, because the relevant choices being made by me are being made with much better knowledge. To put it another way, when Omega offers me a Newcomb's Problem, I will condition my choice on the known existence of Omega, and all the Upsilon-like gods will tend to cancel out into Pascal's Wagers. If I run into an Upsilon-like god, then, I am not overly worried about my poor performance - it's like running into the Christian God, you're screwed, but so what, you won't actually run into one. Even the best rational agents cannot perform well on this sort of subjunctive hypothesis without much better knowledge while making the relevant choices than you are offering them. For every rational agent who performs well with respect to Upsilon there is one who performs poorly with respect to anti-Upsilon.

On the other hand, beating Newcomb's Problem is easy, once you let go of the idea that to be "rational" means performing a strange ritual cognition in which you must only choose on the basis of physical consequences and not on the basis of correct predictions that other agents reliably make about you, so that (if you choose using this bizarre ritual) you go around regretting how terribly "rational" you are because of the correct predictions that others make about you. I simply choose on the basis of the correct predictions that others make about me, and so I do not regret being rational.

And these questions are highly relevant and realistic, unlike Upsilon; in the future we can expect there to be lots of rational agents that make good predictions about each other.

Comment author: cousin_it 23 July 2010 08:49:02AM *  0 points [-]

Pascal's Wagers, huh. So your decision theory requires a specific prior?

Comment author: Vladimir_Nesov 23 July 2010 10:35:24AM 0 points [-]

Omega lets me decide to take only one box after meeting Omega, when I have already updated on the fact that Omega exists, and so I have much better knowledge about which sort of god I'm likely to encounter.

In what sense can you update? Updating is about following a plan, not about deciding on a plan. You already know that it's possible to observe anything; you don't learn anything new about the environment by observing any given thing. There could be a deep connection between updating and logical uncertainty that makes it a good plan to update, but it's not obvious what it is.

Comment author: EStokes 26 July 2010 11:12:37PM 1 point [-]

Huh? Updating is just about updating your map. (?) The next sentence I didn't understand the reasoning of, could you expand?

Comment author: andreas 27 July 2010 02:03:22AM 0 points [-]

Intuitively, the notion of updating a map of fixed reality makes sense, but in the context of decision-making, formalization in full generality proves elusive, even unnecessary, so far.

By making a choice, you control the truth value of certain statements—statements about your decision-making algorithm and about mathematical objects depending on your algorithm. Only some of these mathematical objects are part of the "real world". Observations affect what choices you make ("updating is about following a plan"), but you must have decided beforehand what consequences you want to establish ("[updating is] not about deciding on a plan"). You could have decided beforehand to care only about mathematical structures that are "real", but what characterizes those structures apart from the fact that you care about them?

Vladimir talks more about his crazy idea in this comment.

Comment author: TobyBartels 22 July 2010 08:09:12AM 1 point [-]

But wait! You can fix this. All you have to do is be disposed to take just Box B.

No, that's not what I should do. What I should do is make Omega think that I am disposed to take just Box B. If I can successfully make Omega think that I'll take only Box B but still take both boxes, then I should. But since Omega is superintelligent, let's take it as understood that the only way to make Omega think that I'll take only Box B is to make it so that I'll actually take Box B. Then that is what I should do.

But I have to do it now! (I don't do it now only because I don't believe that this situation will ever happen.) Once Omega has placed the boxes and left, if the known laws of physics apply, then it's too late!

If you take only Box B and get a million dollars, wouldn't you regret having not also taken Box A? Not only would you have gotten a thousand dollars more, you'd also have shown up that know-it-all superintelligent intergalactic traveller too! That's a chance that I'll never have, since Omega will read my comment here and leave my Box B empty, but you might have that chance, and if so then I hope you'll take it.

Comment author: Alicorn 22 July 2010 08:14:43AM *  2 points [-]

It's not really too late then. Omega can predict what you'll do between seeing the boxes, and choosing which to take. If this is going to include a decision to take one box, then Omega will put a million dollars in that box.

I will not regret taking only one box. It strikes me as inconsistent to regret acting as the person I most wish to be, and it seems clear that the person I most wish to be will take only one box; there is no room for approved regret.

Comment author: TobyBartels 22 July 2010 08:29:41AM *  0 points [-]

It's not really too late then.

If you say this, then you believe in backwards causality (or a breakdown of the very notion of causality, as in Kevin's comment below). I agree that if causality doesn't work, then I should take only Box B, but nothing in the problem as I understand it from the original post implies any violation of the known laws of physics.

If known physics applies, then Omega can predict all it likes, but my actions after it has placed the boxes cannot affect that prediction. There is always the chance that it predicts that I will take both boxes but I take only Box B. There is even the chance that it will predict that I will take only Box B but I take both boxes. Nothing in the problem statement rules that out. It would be different if that were actually impossible for some reason.

I will not regret taking only one box.

I knew that you wouldn't, of course, since you're a one-boxer. And we two-boxers will not regret taking both boxes, even if we find Box B empty. Better $1000 than nothing, we will think!

Comment author: Vladimir_Nesov 22 July 2010 08:39:44AM *  3 points [-]

If you say this, then you believe in backwards causality (or a breakdown of the very notion of causality, as in Kevin's comment below). I agree that if causality doesn't work, then I should take only Box B, but nothing in the problem as I understand it from the original post implies any violation of the known laws of physics.

Beware hidden inferences. Taboo causality.

Comment author: TobyBartels 22 July 2010 09:55:16AM *  1 point [-]

I don't see what that link has to do with anything in my comment thread. (I haven't read most of the other threads in reply to this post.)

I should explain what I mean by ‘causality’. I do not mean some metaphysical necessity, whereby every event (called an ‘effect’) is determined (or at least influenced in some asymmetric way) by other events (called its ‘causes’), which must be (or at least so far seem to be) prior to the effect in time, leading to infinite regress (apparently back to the Big Bang, which is somehow an exception). I do not mean anything that Aristotle knew enough physics to understand in any but the vaguest way.

I mean the flow of macroscopic entropy in a physical system.

The best reference that I know on the arrow of time is Huw Price's 1996 book Time's Arrow and Archimedes' Point. But actually I didn't understand how entropy flow leads to a physical concept of causality until several years after I read that, so that might not actually help, and I'm having no luck finding the Internet conversation that made it click for me.

But basically, I'm saying that, if known physics applies, then P(there is money in Box B|all information available on a macroscopic level when Omega placed the boxes) = P(there is money in Box B|all information … placed the boxes & I pick both boxes), even though P(I pick both boxes|all information … placed the boxes) < 1, because macroscopic entropy strictly increases between the placing of the boxes and the time that I finally pick a box.

So I need to be given evidence that known physics does not apply before I pick only Box B, and a successful record of predictions by Omega will not do that for me.

Comment author: FAWS 22 July 2010 12:04:02PM *  6 points [-]

If known physics applies, then Omega can predict all it likes, but my actions after it has placed the boxes cannot affect that prediction. There is always the chance that it predicts that I will take both boxes but I take only Box B. There is even the chance that it will predict that I will take only Box B but I take both boxes. Nothing in the problem statement rules that out. It would be different if that were actually impossible for some reason.

Ah, I see what the problem is. You have a confused notion of free will and what it means to make a choice.

Making a choice between two options doesn't mean there is a real chance that you might take either option (there always is at least an infinitesimal chance, but that is always true even for things that are not usefully described as a choice). It just means that the reason for your taking whatever option you take is most usefully attributed to you (and not e.g. gravity, government, the person holding a gun to your head etc.). In the end, though, it is (unless the choice is so close that random noise makes the difference) a fact about you that you will make the choice you will make. And it is in principle possible for others to discover this fact about you.

If it is a fact about you that you will one-box it is not possible that you will two-box. If it is a fact about you that you will two-box it is not possible that you will one-box. If it is a fact about you that you will leave the choice up to chance then Omega probably doesn't offer you the chance to take part in the first place.

Now, when deciding what choice to make it is usually most useful to pretend there is a real possibility of taking either option, since that generally causes facts about you that are more beneficial to you. And that you do that is just another fact about you, and influences the fact about which choice you make. Usually the fact about which choice you will make has no consequences before you make your choice, and so you can model the rest of the world as being the same in either case up to that point when counterfactually considering the consequences of either choice. But the fact about which choice you will make is just another fact like any other, and is allowed, even if it usually doesn't, to have consequences before that point in time. If it does it is best, for the very same reason you pretend that either choice is a real possibility in the first place, to also model the rest of the world as different contingent on your choice. That doesn't mean backwards causality. Modeling the world in this way is just another fact about you that generates good outcomes.

Comment author: RobinZ 22 July 2010 11:52:45PM 4 points [-]

Alicorn:

It's not really too late then. Omega can predict what you'll do between seeing the boxes, and choosing which to take. If this is going to include a decision to take one box, then Omega will put a million dollars in that box.

TobyBartels:

If you say this, then you believe in backwards causality (or a breakdown of the very notion of causality, as in Kevin's comment below). I agree that if causality doesn't work, then I should take only Box B, but nothing in the problem as I understand it from the original post implies any violation of the known laws of physics.

I remember reading an article about someone who sincerely lacked respect for people who were 'soft' (not exact quote) on the death penalty ... before ending up on the jury of a death penalty case, and ultimately supporting life in prison instead. It is not inconceivable that a sufficiently canny analyst (e.g. Omega) could deduce that the process of being picked would motivate you to reconsider your stance. (Or, perhaps more likely, motivate a professed one-boxer like me to reconsider mine.)

Comment author: CarlShulman 23 July 2010 12:22:59AM *  2 points [-]

From Andy Egan.

The Psychopath Button: Paul is debating whether to press the ‘kill all psychopaths’ button. It would, he thinks, be much better to live in a world with no psychopaths. Unfortunately, Paul is quite confident that only a psychopath would press such a button. Paul very strongly prefers living in a world with psychopaths to dying. Should Paul press the button? (Set aside your theoretical commitments and put yourself in Paul’s situation. Would you press the button? Would you take yourself to be irrational for not doing so?)

Newcomb’s Firebomb: There are two boxes before you. Box A definitely contains $1,000,000. Box B definitely contains $1,000. You have two choices: take only box A (call this one-boxing), or take both boxes (call this two-boxing). You will signal your choice by pressing one of two buttons. There is, as usual, an uncannily reliable predictor on the scene. If the predictor has predicted that you will two-box, he has planted an incendiary bomb in box A, wired to the two-box button, so that pressing the two-box button will cause the bomb to detonate, burning up the $1,000,000. If the predictor has predicted that you will one-box, no bomb has been planted – nothing untoward will happen, whichever button you press. The predictor, again, is uncannily accurate.

I would suggest looking at your implicit choice of counterfactuals and their role in your decision theory. Standard causal decision theory involves local violations of the laws of physics (you assign probabilities to the world being such that you'll one-box, or such that you'll two-box, and then ask what miracle magically altering your decision, without any connection to your psychological dispositions, etc., would deliver the highest utility). Standard causal decision theory is a normative principle for action, that says to do the action that would deliver the most utility if a certain kind of miracle happened. But you can get different versions of causal decision theory by substituting different sorts of miracles, e.g. you can say: "if I one-box, then I have a psychology that one-boxes, and likewise for two-boxing" so you select the action such that a miracle giving you the disposition to do so earlier on would have been better. Yet another sort of counterfactual that can be hooked up to the causal decision theory framework would go "there's some mathematical fact about what decision (decisions given Everett) my brain structure leads to in standard physics, and the predictor has access to this mathematical info, so I'll select the action that would be best brought about by a miracle changing that mathematical fact".
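
The difference between the first two kinds of counterfactual can be sketched as two expected-utility calculations. This is one reading of the paragraph above, not a formalization from the comment: `p_full` (the agent's credence that Box B is already full) and `accuracy` (how often the predictor is right) are assumed parameters, and both function names are hypothetical.

```python
BOX_A = 1_000
BOX_B = 1_000_000


def standard_cdt_value(action: str, p_full: float) -> float:
    """Miracle alters only the act: Box B's contents are held fixed.

    p_full is the credence that Box B is full, the same for both actions.
    """
    bonus = BOX_A if action == "two-box" else 0
    return p_full * BOX_B + bonus


def disposition_cdt_value(action: str, accuracy: float) -> float:
    """Miracle alters the earlier disposition: the prediction moves with it.

    accuracy is the assumed probability that the predictor reads the
    disposition correctly.
    """
    p_full = accuracy if action == "one-box" else 1.0 - accuracy
    bonus = BOX_A if action == "two-box" else 0
    return p_full * BOX_B + bonus


if __name__ == "__main__":
    for action in ("one-box", "two-box"):
        print(action,
              standard_cdt_value(action, p_full=0.5),
              disposition_cdt_value(action, accuracy=0.99))
```

In the first function two-boxing comes out ahead by exactly the contents of Box A whatever `p_full` is, because the act cannot move the credence. In the second, one-boxing comes out ahead whenever the assumed accuracy is high enough, because the counterfactual changes what the predictor saw.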