Desrtopa comments on Welcome to Less Wrong! (July 2012) - Less Wrong
To be properly isomorphic to Newcomb's problem, the predictor's chance of being wrong should be approximately zero.
If I thought that the chance of my friend's mother being wrong approximated to zero, I would of course choose to one-box. If I expected her to be an imperfect predictor who assumed I would behave as if I were in the real Newcomb's problem with a perfect predictor, then I would choose to two-box.
In Newcomb's Problem, if you choose on the basis of which choice is consistent with a higher expected return, then you would choose to one-box. You know that your choice doesn't cause the box to be filled, but given the knowledge that whether the money is in the box or not is contingent on a perfect predictor's assessment of whether or not you were likely to one-box, you should assign different probabilities to the box containing the money depending on whether you one-box or two-box. Since your own mental disposition is evidence of whether the money is in the box or not, you can behave as if the contents were determined by your choice.
Hm, I think I still don't understand the one-box perspective, then. Are you saying that if the predictor is wrong with probability p, you would take two-boxes for high p and one box for a sufficiently small p (or just for p=0)? What changes as p shrinks?
Or what if Omega/Ann's mom is a perfect predictor, but for a random 1% of the time decides to fill the boxes as if it made the opposite prediction, just to mess with you? If you one-box for p=0, you should believe that taking one box is correct (and generates $1 million more) in 99% of cases and that two boxes is correct (and generates $1000 more) in 1% of cases. So taking one box should still have a far higher expected value. But the perfect predictor who sometimes pretends to be wrong behaves exactly the same as an imperfect predictor who is wrong 1% of the time.
You choose the boxes according to the expected value of each box choice. For a 99% accurate predictor, the expected value of one-boxing is $990,000,000 (you get a billion 99% of the time and nothing 1% of the time), while the expected value of two-boxing is $10,001,000 (you get a thousand 99% of the time, and one billion and one thousand 1% of the time).
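The expected-value arithmetic above can be checked mechanically. A minimal sketch, using exact rational arithmetic to avoid floating-point noise; the 99% accuracy and the $1 billion / $1,000 payoffs are the figures from this comment:

```python
from fractions import Fraction

# EV of one-boxing vs. two-boxing against a 99%-accurate predictor,
# using the $1,000,000,000 / $1,000 payoffs from this comment.
p_correct = Fraction(99, 100)
BIG, SMALL = 1_000_000_000, 1_000

# One-box: you get BIG only on the 99% of runs where the predictor
# correctly foresaw your choice.
ev_one_box = p_correct * BIG

# Two-box: you always get SMALL, plus BIG on the 1% of runs where the
# predictor wrongly expected you to one-box.
ev_two_box = p_correct * SMALL + (1 - p_correct) * (BIG + SMALL)

print(ev_one_box)  # 990000000
print(ev_two_box)  # 10001000
```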
The difference between this scenario and the one you posited before, where Ann's mom makes her prediction by reading your philosophy essays, is that she's presumably predicting on the basis of how she would expect you to choose if you were playing Omega. If you're playing against an agent who you know will fill the boxes according to how you would choose if you were playing Omega (we'll call it Omega-1), then you should always two-box (if you would one-box against Omega, both boxes will contain money, so you get the contents of both; if you would two-box against Omega, only one box would contain money, and if you one-box you'll get the empty one).
An imperfect predictor with random error is a different proposition from an imperfect predictor with nonrandom error.
Of course, if I were dealing with this dilemma in real life, my choice would be heavily influenced by considerations such as how likely it is that Ann's mom really has billions of dollars to give away.
Ok, but what if Ann's mom is right 99% of the time about how you would choose when playing her?
I agree that one-boxers make more money, with the numbers you used, but I don't think that those are the appropriate expected values to consider. Conditional on the fact that the boxes have already been filled, two-boxing has a $1000 higher expected value. If I know only one box is filled, I should take both. If I know both boxes are filled, I should take both. If I know I'm in one of those situations but not sure of which it is, I should still take both.
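The dominance claim in this comment can be sketched in a few lines, reusing the $1 billion / $1,000 figures from upthread: whichever state the opaque box is already in, taking both boxes nets exactly $1,000 more.

```python
# Dominance sketch: with the box contents already fixed, two-boxing nets
# exactly SMALL more than one-boxing in either state of the opaque box.
# BIG/SMALL match the $1 billion / $1,000 figures used upthread.
BIG, SMALL = 1_000_000_000, 1_000

def payoff(take_both: bool, opaque_filled: bool) -> int:
    opaque = BIG if opaque_filled else 0
    return opaque + (SMALL if take_both else 0)

# In both possible states, two-boxing beats one-boxing by SMALL.
diffs = [payoff(True, f) - payoff(False, f) for f in (True, False)]
print(diffs)  # [1000, 1000]
```

The one-boxer's reply, of course, is that the two states are not equally likely for both choices, which is exactly what the conditional expected values in the previous comment capture.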
Another analogous situation would be that you walk into an exam, and the professor (who is a perfect or near-perfect predictor) announces that he has written down a list of people whom he has predicted will get fewer than half the questions right. If you are on that list, he will add 100 points to your score at the end. The people who get fewer than half of the questions right get higher scores, but you should still try to get questions right on the test... right? If not, does the answer change if the professor posts the list on the board?
I still think I'm missing something, since a lot of people have thought carefully about this and come to a different conclusion from me, but I'm still not sure what it is. :/
You are focusing too much on the "already have been filled", as if the particular time of your particular decision is relevant. But if your decision isn't random (and yours isn't), then any individual decision is dependent on the decision algorithm you follow -- and can be calculated in exactly the same manner, regardless of time. Therefore in a sense your decision has been made BEFORE the filling of the boxes, and can affect their contents.
You may consider it easier to wrap your head around this if you think of the boxes being filled according to what result the decision theory you currently have would return in the situation, instead of what decision you'll make in the future. That helps keep in mind that causality still travels only one direction, but that a good predictor simply knows the decision you'll make before you make it and can act accordingly.
I would one-box. I gave the relevant numbers on this in my previous comment; one-boxing has an expected value of $990,000,000 to the expected $10,001,000 if you two-box.
When you're dealing with a problem involving an effective predictor of your own mental processes (it's not necessary for such a predictor to be perfect for this reasoning to become salient; it just makes the problems simpler), your expectation of what the predictor will do or has already done will be at least partly dependent on what you intend to do yourself. You know that either the opaque box is filled or it is not, but the probability you assign to the box being filled depends on whether you intend to open it or not.
Let's try a somewhat different scenario. Suppose I have a time machine that allows me to travel back a day in the past. Doing so creates a stable time loop, like the time turners in Harry Potter or HPMoR (on a side note, our current models of relativity suggest that such loops are possible, if very difficult to contrive). You're angry at me because I've insulted your hypothetical scenario, and are considering hitting me in retaliation. But you happen to know that I retaliate against people who hit me by going back in time and stealing from them, which I always get away with due to having perfect alibis (the police don't believe in my time machine). You do not know whether I've stolen from you or not, but if I have, it's already happened. You would feel satisfied by hitting me, but it's not worth being stolen from. Do you choose to hit me or not?
If the professor is a perfect predictor, then I would deliberately get most of the problems wrong, thereby all but guaranteeing a score of over 100 points. I would have to be very confident that I would get a score below fifty even if I weren't trying to on purpose before trying to get all the questions right would give me a higher expected score than trying to get most of the questions wrong.
If the professor posts the list on the board, then of course it should affect the answer. If my name isn't on the list, then he's not going to add the 100 points to my test in any case, so my only recourse to maximizing my grade is to try my best on the test. If my name is on the list, then he's already predicted that I'm going to score below 50, so whether he's a perfect predictor or not, I should try to do well so that he's adding 100 points to as high a score as I can manage.
The difference between the scenario where he writes the names on the board and the scenario where he doesn't is that in the former, my expectations of his actions don't vary according to my own, whereas in the latter, they do.
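The throw-the-test comparison above can be sketched numerically. The 50-point threshold and 100-point bonus come from the scenario; the 100-point scale and the specific raw scores are illustrative assumptions:

```python
# Hypothetical 100-point exam; the (perfect-predictor) professor adds
# 100 points to anyone he predicted would score under 50.
THRESHOLD, BONUS = 50, 100

def final_score(raw: int) -> int:
    # With a perfect predictor, being "on the list" is simply
    # equivalent to raw < THRESHOLD.
    return raw + (BONUS if raw < THRESHOLD else 0)

print(final_score(45))   # deliberately throwing the test: 145
print(final_score(100))  # a flawless honest attempt tops out at 100
```

So unless an honest attempt would land below 50 anyway, throwing the test dominates when the list is unseen; once the list is posted, the prediction is fixed and maximizing the raw score is all that's left.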
I believe you are making a mistake. Specifically, you are implementing a decision algorithm that ensures that "you lose" is a correct self-fulfilling prophecy (in fact, you ensure that it is the only valid prediction he could make). I would throw the test (score in the 40s) even when my name is not on the list.
Do you also two-box on Transparent Newcomb's?
If I were in a position to predict that this were the sort of thing the professor might do, then I would precommit to throwing the test should he implement such a procedure. But you could just as easily end up with the perfect predictor professor saying that in the scoring for this test, he will automatically fail anyone he predicts would throw the test in the previously described scenario. I don't think there's any point in time where making such a precommitment would have positive expected value. By the time I know it would have been useful, it's already too late.
Edit: I think I was mistaken about what problem you were referring to. If I'm understanding the question correctly, yes I would, because until the scenario actually occurs I have no reason to suspect any precommitment I make is likely to bring about more favorable results. For any precommitment I could make, the scenario could always be inverted to punish that precommitment, so I'd just do what has the highest expected utility at the time at which I'm presented with the scenario. It would be different if my probability distribution on what precommitments would be useful weren't totally flat.
As an aside, I'll note that a lot of the solutions bandied around here to decision theory problems remind me of something from Magic: The Gathering which I took notice of back when I still followed it.
When I watched my friends play, one would frequently respond to another's play with "Before you do that, I-" and use some card or ability to counter their opponent's move. The rules of MTG let you do that sort of thing, but I always thought it was pretty silly, because they did not, in fact, have any idea that it would make sense to make that play until after seeing their opponent's move. Once they see their opponent's play, they get to retroactively decide what to do "before" their opponent can do it.
In real life, we don't have that sort of privilege. If you're in a Counterfactual Mugging scenario, for instance, you might be inclined to say "I ought to be the sort of person who would pay Omega, because if the coin had come up the other way, I would be making a lot of money now, so being that sort of person would have positive expected utility for this scenario." But this is "Before you do that-" type reasoning. You could just as easily have ended up in a situation where Omega comes and tells you "I decided that if you were the sort of person who would not pay up in a Counterfactual Mugging scenario, I would give you a million dollars, but I've predicted that you would, so you get nothing."
When you come up with a solution to an Omega-type problem involving some type of precommitment, it's worth asking "would this precommitment have made sense when I was in a position of not knowing Omega existed, or having any idea what it would do even if it did exist?"
In real life, we sometimes have to make decisions dealing with agents who have some degree of predictive power with respect to our thought processes, but their motivations are generally not as arbitrary as those attributed to Omega in most hypotheticals.
Can you give a specific example of a bandied-around solution to a decision-theory problem where predictive power is necessary in order to implement that solution?
I suspect I disagree with you here -- or, rather, I agree with the general principle you've articulated, but I suspect I disagree that it's especially relevant to anything local -- but it's difficult to be sure without specifics.
With respect to the Counterfactual Mugging you reference in passing, for example, it seems enough to say "I ought to be the sort of person who would do whatever gets me positive expected utility"; I don't have to specifically commit to pay or not pay. Isn't it? But perhaps I've misunderstood the solution you're rejecting.
Well, if your decision theory tells you you ought to be the sort of person who would pay up in a Counterfactual Mugging, because that gets you positive utility, then you could end up with Omega coming and saying "I would have given you a million dollars if your decision theory said not to pay out in a Counterfactual Mugging, but since you would, you don't get anything."
When you know nothing about Omega, I don't think there's any positive expected utility in choosing to be the sort of person who would have positive expected utility in a Counterfactual Mugging scenario, because you have no reason to suspect it's more likely than the inverted scenario where being that sort of person will get you negative utility. The probability distribution is flat, so the utilities cancel out.
Say Omega comes to you with a Counterfactual Mugging on Day 1. On Day 0, would you want to be the sort of person who pays out in a Counterfactual Mugging? No, because the probabilities of it being useful or harmful cancel out. On Day 1, when given the dilemma, do you want to be the sort of person who pays out in a Counterfactual Mugging? No, because now it only costs you money and you get nothing out of it.
So there's no point in time where deciding "I should be the sort of person who pays out in a Counterfactual Mugging" has positive expected utility.
Reasoning this way means, of course, that you don't get the money in a situation where Omega would only pay you if it predicted you would pay up, but you do get the money in situations where Omega pays out only if you wouldn't pay out. The latter possibility seems less salient from the "before you do that-" standpoint of a person contemplating a Counterfactual Mugging, but there's no reason to assign it a lower probability before the fact. The best you can do is choose according to whatever has the highest expected utility at any given time.
Omega could also come and tell me "I decided that I would steal all your money if you hit the S key on your keyboard between 10:00 and 11:00 am on a Sunday, and you just did," but I don't let this influence my typing habits. You don't want to alter your decision theories or general behavior in advance of specific events that are no more probable than their inversions.
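The cancellation argument in the preceding comments can be made concrete. A sketch under stated assumptions: a $100 payment and $10,000 prize (illustrative figures, not from the thread), a fair coin, equal prior weight on the standard mugging and its inversion, and the inverted game modeled as still asking payers for the $100 on heads:

```python
from fractions import Fraction

HALF = Fraction(1, 2)
PAY, PRIZE = 100, 10_000  # illustrative stake and prize (assumptions)

def ev(pays_up: bool, inverted: bool) -> Fraction:
    # Heads: Omega asks for PAY; a payer hands it over, a refuser doesn't.
    heads = -PAY if pays_up else 0
    # Tails: the standard game rewards payers, the inverted one refusers.
    tails = PRIZE if (pays_up != inverted) else 0
    return HALF * heads + HALF * tails

# Flat prior: equal weight on the standard game and its inversion.
ev_payer   = HALF * (ev(True, False) + ev(True, True))
ev_refuser = HALF * (ev(False, False) + ev(False, True))
print(ev_payer, ev_refuser)  # 2450 2500
```

Under this symmetric prior the prize terms wash out between the two games, and the committed payer ends up strictly behind by the expected payment, which is the sense in which the utilities "cancel out" and the precommitment buys nothing.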
You don't need a precommitment to make the correct choice. You just make it. That does happen to include one-boxing on Transparent Newcomb's (and conventional Newcomb's, for the same reason). The "but what if someone punishes me for being the kind of person who makes this choice" objection is a fully general excuse not to make rational choices. It is an invalid excuse because every scenario that can be contrived to result in "bad for you" is one in which your rewards are determined by your behavior in an entirely different game from the one in question.
For example, your "inverted Transparent Newcomb's" gives you a bad outcome, but not because of your choice. It has nothing to do with a decision, because you don't get to make one. It is punishing you for your behavior in a completely different game.
Could you describe the Transparent Newcomb's problem to me so I'm sure we're on the same page?
"What if I face a scenario that punishes me for being the sort of person who makes this choice?" is not a fully general counterargument, it only applies in cases where the expected utilities of the scenarios cancel out.
If you're the sort of person who won't honor promises made under duress, and other people are sufficiently effective judges to recognize this, then you avoid people placing you under duress to extract promises from you. But supposing you're captured by enemies in a war, and they say "We could let you go if you made some promises to help out our cause when you were free, but since we can't trust you to keep them, we're going to keep you locked up and torture you to make your country want to ransom you more."
This doesn't make the expected utilities of "Keep promises made under duress" vs. "Do not keep promises made under duress" cancel out, because you have an abundance of information with respect to how relatively likely these situations are.
Take a suitable description of Newcomb's problem (you know, with Omega and boxes). Then make the boxes transparent. That is the extent of the difference. I assert that being able to see the money makes no difference to whether one should one box or two box (and also that one should one box).
No, I don't, since you have a time-turner. (To be clear, non-hypothetical-me wouldn't hit non-hypothetical-you either.) I would also one-box if I thought that Omega's predictive power was evidence that it might have a time turner or some other way of affecting the past. I still don't think that's relevant when there's no reverse causality.
Back to Newcomb's problem: Say that brown-haired people almost always one-box, and people with other hair colors almost always two-box. Omega predicts on the basis of hair color: both boxes are filled iff you have brown hair. I'd two-box, even though I have brown hair. It would be logically inconsistent for me to find that one of the boxes is empty, since everyone with brown hair has both boxes filled. But this could be true of any attribute Omega uses to predict.
I agree that changing my decision conveys information about what is in the boxes and changes my guess of what is in the boxes... but doesn't change the boxes.
If the agent filling the boxes follows a consistent, predictable pattern you're outside of, you can certainly use that information to do this. In Newcomb's Problem, though, Omega follows a consistent, predictable pattern you're inside of. It's logically inconsistent for you to two-box and find that both boxes contain money, or to one-box and find that the box is empty.
Why is whether your decision actually changes the boxes important to you? If you know that picking one box will result in your receiving a million dollars, and picking two boxes will result in getting a thousand dollars, do you have any concern that overrides making the choice that you expect to make you more money?
A decision process of "at all times, do whatever I expect to have the best results" will, at worst, reduce to exactly the same behavior as "at all times, do whatever I think will have a causal relationship with the best results." In some cases, such as Newcomb's problem, it has better results. What do you think the concern with causality actually does for you?
We don't always agree here on what decision theories get the best results (as you can see by observing the offshoot of this conversation between Wedrifid and myself,) but what we do generally agree on here is that the quality of decision theories is determined by their results. If you argue yourself into a decision theory that doesn't serve you well, you've only managed to shoot yourself in the foot.
In the absence of my decision affecting the boxes, taking one box and leaving $1000 on the table still looks like shooting myself in the foot. (Of course if I had the ability to precommit to one-box I would -- so, okay, if Omega ever asks me this I will take one box. But if Omega asked me to make a decision after filling the boxes and before I'd made a precommitment... still two boxes.)
I think I'm going to back out of this discussion until I understand decision theory a bit better.
Feel free. You can revisit this conversation any time you feel like it. Discussion threads never really die here, there's no community norm against replying to comments long after they're posted.