Omega's subcontracting to Alpha
This is a variant built on Gary Drescher's xor problem for timeless decision theory.
You get an envelope from your good friend Alpha, and are about to open it, when Omega appears in a puff of logic.
Being completely trustworthy as usual (don't you just hate that?), he explains that Alpha flipped a coin (or looked at the parity of a sufficiently high digit of pi), to decide whether to put £1000 000 in your envelope, or put nothing.
He, Omega, knows what Alpha decided, has also predicted your own actions, and you know these facts. He hands you a £10 note and says:
"(I predicted that you will refuse this £10) if and only if (there is £1000 000 in Alpha's envelope)."
What to do?
EDIT: to clarify, Alpha will send you the envelope anyway, and Omega may choose to appear or not appear as he and his logic deem fit. Nor is Omega stating a mathematical theorem: that one can deduce from the first premise the truth of the second. He is using XNOR, but using 'if and only if' seems a more understandable formulation. You get to keep the envelope whatever happens, in case that wasn't clear.
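For readers who want the XNOR reading spelled out, here is a minimal Python sketch of the truth table for Omega's statement. The function name `omega_statement` is purely illustrative and not part of the original problem; the sketch only assumes that "if and only if" means the two claims have the same truth value.

```python
from itertools import product

def omega_statement(predicted_refuse: bool, envelope_full: bool) -> bool:
    """'I predicted you will refuse' if and only if 'the envelope is full' (XNOR)."""
    return predicted_refuse == envelope_full

# List all four combinations and whether Omega's statement holds in each.
for predicted_refuse, envelope_full in product([True, False], repeat=2):
    print(predicted_refuse, envelope_full, omega_statement(predicted_refuse, envelope_full))
```

Only two of the four combinations make the statement true: refuse with a full envelope, and accept with an empty one.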
Comments (90)
I would translate this scenario into the following world-program:
Based on this world-program, it is obvious that you should refuse the note.
Heh. I first read "1e6" above as a function determining whether the user-agent is internet explorer 6.
What is "CONTRADICTION" supposed to do in this "program"?
This program must be embedded in a larger one, since the original problem description didn't say what Omega would do if it couldn't truthfully make the prediction it did. Call that larger program U2(S). The only thing we are told about U2 is that it only calls U if it can do so in a way which guarantees that U won't reach a contradiction. Suppose, for example, that if Omega's prediction couldn't be made truthfully then you wouldn't get any money at all. This corresponds to the world program:
Note that there are plenty of mathematically equivalent ways to write this - for example, using a would_throw(U,S,RNG) function.
You didn't take into account that Omega appears conditionally on contents of the envelope and your decision.
Isn't this just a reformulation of Newcomb's problem?
Mechanically, "Omega + alpha + the random generator" is equivalent to Newcomb's Omega.
[Edit: OK, it isn't :)]
The difference is that Alpha is generating the contents of the envelope independently of your decision, while in Newcomb's problem Omega is placing money in the box under the direct (acausal) control of your decision.
I guess you're right - especially considering the level of disagreement in the other comments.
The fact that I can't exactly pin down at what point I disagree with those who say they take the £10 indicates that I don't understand the problem enough (I may not understand enough about Newcomb's problem either).
I don't understand the theory, but the one-boxing solution seems obvious: given that Omega is correct, if I am such that I would refuse the £10, I would not be offered the choice unless the £1 000 000 is in the envelope, therefore I should refuse the £10 ...
... unless I believe Omega is over u(£1 000 000)/u(£10) times more likely to offer the deal to agents who take the £10 than to agents who refuse. In that case, being willing to take the £10 is expected to pay off.
Edit (after timtyler's reply): Vladimir Nesov's analysis has caused me to reconsider - I would now take the £10.
That seems like a reasonable analysis to me - assuming that you get to keep the contents of the envelope.
So: the solution depends on information about Omega's motivation not included in the problem description. Time to consult those mythology textbooks, methinks - so we have appropriate priors :-/
[edit: scratch this - I get it now!]
How are we to read Omega's statement?
Or:
The former interpretation leaves open the possibility that, if there is £1000 000 in the envelope, Omega made no prediction one way or the other.
Let's see...
This seems the natural way to do it. However, if you're the type that refuses, Omega can't be making this deal when you didn't receive £M. Also, if you accept, Omega can't be making this deal if you really won. However, nothing prevents either a) <You'd accept> and <Letter is full> or b) <You'd refuse> and <Letter is empty> from being true, because your choice cannot determine the outcome of the coin toss Alpha made. Thus, you should accept.
This would be weird. For Omega to make a claim like this, your choice has to be somehow connected to the outcome of the coin flip Alpha made before sending you the envelope. This is because Omega is making the prediction conditional only on the outcome of the coin toss. Your choice is simply assumed to be entangled with that.
I one-box on Newcomb's. I two-envelope on this. This situation, however, is absurd. [ETA: Now that I think about it more, I'm now inclined to one-envelope and also more irritated by the hidden assumptions in this whole hypothetical.]
Omega's prediction is bizarre, because there's no apparent way that the contents of the envelope are entangled with my decision to accept the money - whether I am the kind of person who two-boxes or one-boxes, the contents of the envelope were decided by a coin toss. It seems the only way for Omega to make a reliable prediction would be to predict my response to Omega's deal, and then phrase the deal such that my predicted response is tied to the actual contents of the envelope. That is, Omega knows the envelope is empty, and he knows I will accept the offer, so he says "reject offer iff envelope is full."
In other words, I don't actually believe Omega can reliably make this same prediction, as he could theoretically do in Newcomb's; this hypothetical is absurd. If you had a thousand people, roughly half of them would have opened envelopes full of money (or whatever percentage would emerge from Alpha's random generator). It seems inconceivable that those half must also have rejected the ten pound note, and that the other half must have accepted it. If I saw a large enough sample illustrating this effect occurring consistently, I'd have to throw up my hands in confusion and reject the ten-pound note, but this outcome is outlandishly unlikely.
Indeed, but it could be different enough to count as an "exercise" to those interested in doing the causal analysis for themselves.
ETA: responded to an earlier version of the comment in which Psychohistorian claimed this was the same as Newcomb
On further analysis, I actually think it is totally different; I believe you responded to an earlier draft of my previous comment in which I said they were basically the same. Lest people get confused.
(replying to new version of comment) Yes, Omega could easily only offer the deal to those for whom his prediction is true.
Am I right that if the money is in the envelope Omega only offers to one-boxers and if the money is not in the envelope Omega only offers to two-boxers?
That's the way I read it. (after some analysis)
I'd refuse the £10 unless I was extremely confident (>99.999%) that if I took the £10 I couldn't actually exist because the scenario as given was inconsistent and that the real me would end up with £10 more if this was true.
(i. e. the offer was independent of my decision but the prediction not, Omega would take exactly the same action regardless of whether the envelope was filled, the prediction would be false if I took the £10 in either case, and I would take the £10 in either case)
I humbly request that future thought experiments not be done in £, since there is no "£" key on my keyboard.
Option-3 types £ for me.
You and your made-up keys
Opt+3 for me as well (Mac); Alt+0163 for Windows.
If I was going to do that much, I'd just copy-and-paste from the article. It's still terribly inconvenient.
I'd suggest using the common three letter abbreviations for currencies (as found here for example), it avoids confusion between the various currencies known as dollars and avoids problems with symbol unavailability. For £s that would be GBP.
All I have to say is: ¥, €, ৳, ₪, ریال and zł
Right, those neither. How about utilon sandwiches?
I assume the problem is to be interpreted as Omega saying, "Either (1) (I have predicted you will refuse the $10, and there is $1000,000 in the envelope) xor (2) (I have predicted you will take the $10, and there is $0 in the envelope)", rather than asserting some sort of entanglement above and beyond this.
If so, I take the $10 and formulate the counterfactual, "If I were the sort of person who rejected the $10, Omega would have told me something else to begin with, like 'if you refuse the $10 then the envelope will be empty', but the digit of pi would have been the same".
As previously noted, though, I can't quite say how to compute this formally.
I assume you would consider "You will take this $10 if and only if Barack Obama is president of the United States." true even if you were completely certain you would take the $10 if John McCain was President. If and only if this was the intended meaning I would agree with your conclusion.
Re: "If I were the sort of person who rejected the $10, Omega would have told me something else to begin with"
...but why would he do that? Is there some assumption about Omega's motivation here?
It's correct if we expand it to "Omega would have told me something else or not shown up to begin with", or if we're assuming that Omega will show up and say something. It would have to say something like "if you refuse the $10 then the envelope will be empty" — or some other true thing, not the statement given in the original post — since we're assuming it's a perfect predictor and is being honest.
Omega can say:
"I have predicted you will refuse the $10, and there is $1000,000 in the envelope".
There is absolutely no problem with that - if you are a refuser (as specified in the hypothetical) and if the envelope does indeed contain $1000,000.
True, he would have to say something else, in the case where the envelope is empty.
Ah, yes, you're right.
Now I'm not sure if I was correctly interpreting Eliezer's point or just restating my own.
Take the £10, and don't bother opening the envelope. You are not (acausally) controlling whether £1'000'000 are in the envelope, but are controlling whether to take the £10, so you'll take the £10 (since you are money-maximizing), and if Omega is correct, the envelope is going to be empty.
The agents that refuse the £10 in this situation will only be visited by Omega when the envelope contains the £1'000'000, while the money-maximizing agents will only be visited by Omega when the envelope is empty. By your decision, you don't control whether the envelope contains money, but you do control whether Omega appears (since the statement asserted by Omega is about you). Thus, by deciding to take the money in this situation, you add expected £5 (or however often Omega appears) to your balance, by acausally summoning Omega.
By refusing the £10, you maximize the amount of money that the agents who see Omega get, by moving Omega around. It's similar to trying to become a lottery winner by selling to existing lottery winners the same dietary supplement you take, since this makes the takers of this dietary supplement more likely to be lottery winners.
I give a formalization of this solution in another comment.
Re: "The agents that refuse the £10 in this situation will only be visited by Omega when the envelope contains the £1'000'000"
That sounds good to me!
Not good. All you've achieved is redirected Omega to situations in which you don't take its money. It's better to have Omega where you do take its money, it's free money.
If there was a 50% chance Omega in the future visits someone who would refuse to take the £10 and gives them £1'000'000, and a 50% chance Omega visits someone who would accept the £10 and gives them £10 and an empty envelope, what would you prefer? Depending how you would behave if Omega visited you the probability of either the first or the second person being you is zero.
Refuse obviously. You've described how my choice controls the payoff, which is not the case with Alpha.
Would you still get the envelope if Omega wasn't going to visit you? I had automatically assumed that Omega initiated the whole situation because the title said that Omega was subcontracting, but I see that the body doesn't actually state that.
Yes, this seems to be assumed, though it didn't actually happen this way, Omega did visit you.
If that's the scenario, and if the only method Omega uses to ensure its prediction is accurate is selective visits, your conclusion is obviously correct. I doubt there is anyone here who (correctly?) understood it that way and disagrees.
The main problem with such thought experiments is understanding them correctly (or better, having your formal decision theory represent them correctly), from where the conclusion usually follows trivially. Just try convincing a game theorist to cooperate in Prisoner's dilemma, even experimental observations contradicting the theory of rational defection won't help.
Edited to make this clear
For some reason, this reply specifically cemented the argument for me. Thank you - I now agree with you.
Edit: If it helps, my confusion was the appearance of causation from I-refuse-the-£10 to I-receive-the-£1e6. When you made this comment, I mentally went back and saw that the fraction of possible worlds in which Alpha gives me the million is unchanged by Omega's prediction, and therefore that I can take the tenner without affecting it.
Right - and finally I am there as well :-)
Huh? Omega is there and says that if and only if you refuse will there be £1000 000 in the envelope. Aren't you turning down £1000 000 for £10?
Nope. I find my explanation pretty clear, can you point to what in particular you don't follow?
I haven't worked through your formalization, but I do know that if I refuse, I get the £1000000! So I think something must be wrong with your implementation of the concept "money-maximizing".
This doesn't clarify the problem you are having.
But you're the one having the problem! :-) ... I think. Omega, always right, says: "I predicted that you will refuse this £10 if and only if there is £1000 000 in Alpha's envelope." So refusing the £10 is my only chance at the £1000000, and I actually have the envelope where the £1000000 may be. Unless it spontaneously combusts, or someone snatches it away, the larger sum should be mine.
Omega only appears conditionally on at least the statement it asserts being correct. By taking/not taking its offer, you are only controlling the conditions under which Omega appears, and not contents of the envelope. By refusing the £10, you make sure that Omega appears only when the envelope is full (but you don't make the envelope full, though it's going to be full given that you've made this decision), and by accepting the £10, you make sure that Omega appears only when the envelope is empty.
It's admittedly confusing that you can (acausally) control the conditions under which Omega appears (when the envelope is full/empty), when Omega remains right in front of you during the decision-making (this is analogous to controlling the contents of the big box in Newcomb's problem) but at the same time, you don't control the contents of the envelope.
So you refuse the £10?
No, I don't. Why?
I'm sorry - I was confused when I wrote that comment.
And by assuming you are a certain sort of agent (which you incorrectly call money-maximizing), you set those conditions to your own disadvantage! An agent which just flips a coin to decide whether to accept or refuse the £10 will have a bigger expected payoff than you. So surely a rational entity can do better.
You are setting the conditions for appearance of Omega. The best conditions for Omega to appear are those where you take its money, since it's good for nothing else.
By refusing the £10, you maximize the amount of money that the agents who see Omega get, by moving Omega around. It's similar to trying to become a lottery winner by selling to existing lottery winners the same dietary supplement you take, since this makes the takers of this dietary supplement more likely to be lottery winners.
(Added this paragraph to the top-level comment.)
I'm not 100% sure but it seems like you and Jonii are calculating correctly. It's just ironic that if the situation as described happens to you, it means you were unlucky and there's no money in Alpha's envelope, whereas if it happens to someone like me, it means I was lucky and the £1000000 is there.
Your choice doesn't change what's inside the envelope. Not even acausally. Your choice only affects whether or not Omega comes and offers you £10, and you maximize your expected value there by being the kinda guy who takes the £10 that's offered. That way, the 50% of the time Alpha doesn't send you £1 000 000, you get £10. Otherwise, that 50% of the time you wouldn't get anything.
But with Vladimir's assumptions, he gets the £1000000 zero percent of the time! I quote:
The description "money-maximizing" is wrong, but he is talking about a type of agent which does indeed make it impossible for Omega to ever show up while the £1000000 is there.
To return to your own comment,
correct
wrong!
You're missing the fact that Alpha sending a letter happened regardless of Omega, and thus regardless of what you choose, you'd get £1 000 000 from Alpha 50% of time. You can't choose so that you'd get £1 000 000 zero percent of the time simply because your choice doesn't affect that.
I repeat that, since that seems to be the key problem here. Alpha flipped a coin to decide whether or not to send you £1 000 000. Your past or future actions don't have any control over Alpha doing this, and sending you £1 000 000. In particular, your actions upon receiving the envelope don't have any entanglement, direct or indirect, with what the envelope contains.
Your actions, however, are entangled with whether or not Omega comes along to offer you £10. If you're the kinda guy to accept the £10, Omega makes this deal only when Alpha didn't send you £1 000 000. If you're the kinda guy that refuses £10, Omega comes only when Alpha sent you £1 000 000.
So to maximize the expected value, you should accept the £10. That way, you get £1 000 000 50% of the time and £10 the other 50%. Otherwise you get £1 000 000 50% of the time and £0 the other 50%.
Vladimir (and you!) get £1000000 zero percent of the time on those occasions when Omega appears, and by hypothesis this is one of those occasions! You are committing a higher-order version of the two-box mistake.
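The two sides of this exchange are averaging over different sets of cases, and a small calculation can make that visible. The following is a minimal Python sketch, assuming a fair coin and assuming (as several commenters do, though the post does not say so) that Omega shows up exactly when his statement would be true. All function names are hypothetical.

```python
from fractions import Fraction

P_FULL = Fraction(1, 2)  # assumption: Alpha's coin is fair

def omega_appears(accepts_note: bool, envelope_full: bool) -> bool:
    # Assumption: Omega visits exactly when "refuse iff full" would be true.
    return (not accepts_note) == envelope_full

def p_full_overall(accepts_note: bool) -> Fraction:
    # Alpha's coin ignores the player's disposition entirely.
    return P_FULL

def p_full_given_visit(accepts_note: bool) -> Fraction:
    # Probability the envelope is full, restricted to cases where Omega shows up.
    weights = {True: P_FULL, False: 1 - P_FULL}
    visit = sum(w for full, w in weights.items()
                if omega_appears(accepts_note, full))
    visit_and_full = sum(w for full, w in weights.items()
                         if full and omega_appears(accepts_note, full))
    return visit_and_full / visit

for accepts in (True, False):
    print(accepts, p_full_overall(accepts), p_full_given_visit(accepts))
```

Under these assumptions the overall chance of a full envelope is 1/2 for both kinds of agent, while the chance conditional on a visit is 0 for the accepter and 1 for the refuser. Which of the two numbers should drive the decision is exactly what the commenters are arguing about.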
What Eliezer and Vladimir said (though if anyone's counting, I decided this before looking at the comments). My choice controls whether or not Omega made its prediction, not the contents of the envelope. (How would one express this using a world-program?)
Yep, this seems correct.
For some reason, Vladimir's formulation seems clearer to me. Must be my math background.
This is a formalization of the decision procedure corresponding to the informal solution I gave in another comment (obviously, it includes a lot of detail unnecessary for this problem, but for the purpose of demonstrating the method, details are not omitted):
Programs for the participants:
P - player
O - Omega deciding whether to make the offer
A - Alpha
Notation: [[X]] is the output of program X, X(Y) is a program that is composition of X and Y, where X expects program Y as argument. Thus, [[X(Y)]] is the output of X given argument Y, and X([[Y]]) is the output of X given output of Y (but not Y).
[[A]] is the contents of the envelope (true/false, or 1/0), [[O(P,A)]] is Omega's decision to make the appearance, [[P(O)]] is player's decision.
Problem statement: Omega appeared ([[O(P,A)]] is true), it asserts that either envelope is full, or player takes its offer ([[A]] xor [[P(O)]] is true), P is money-maximizing, what is [[P(O)]]?
First, P(O) (the player that has observed Omega appearing, as distinguished from P that might or might not see Omega; note that Omega is parameterized with whole P, not just P(O)) constructs the expression for its payoff that depends on P(O). It's going to get the contents of the envelope if [[A]], and Omega's money if [[O(P,A)]], where P can be represented as {X,[[P(O)]]}, where X stands for the rest of P, while [[P(O)]] is specifically P's decision in this situation where P observed Omega.
The payoff is
V([[P(O)]])=10^6*[[A]]+10*[[P(O)]]*[[O({X,[[P(O)]]},A)]],
the payoff function is
V(t)=10^6*[[A]]+10*t*[[O({X,t},A)]].
This may be taken as part of problem statement, additionally specifying P(O) via a statement that P(O) has a property of maximizing V([[P(O)]]).
P(O) considers counterfactual actions for the role of [[P(O)]] (the counterfactuals are not assumed to be equal to [[P(O)]]; I use constants T=true and F=false to refer to them). That is, it's computing
[[P(O)]]:=arg max V(t)
First, consider T=true (take Omega's offer). The payoff is
V(T)=10^6*[[A]]+10*T*[[O({X,T},A)]]=10^6*[[A]]+10*[[O({X,T},A)]].
Second, consider F=false (refuse Omega's offer). The payoff is
V(F)=10^6*[[A]]+10*F*[[O({X,F},A)]]=10^6*[[A]].
Clearly, even given that we don't know what O(P,A) is, V(T)>=V(F). Therefore, [[P(O)]]=T (the player takes Omega's offer). Since [[A]] xor [[P(O)]], it follows that [[A]]=false, so there is no point in opening the envelope.
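The claim that V(T)>=V(F) holds whatever O is can be checked by brute force. The sketch below is only an illustration under a simplifying assumption: the rest of the player, X, is dropped, so Omega's decision is modelled as an arbitrary 0/1 function of the player's action t and the envelope contents a. The names `payoff`, `all_omega_policies` and `taking_is_never_worse` are made up for this example.

```python
from itertools import product

MILLION = 10**6
NOTE = 10

def payoff(t: int, a: int, omega) -> int:
    """V(t) = 10^6*[[A]] + 10*t*[[O({X,t},A)]], with X dropped (assumption)."""
    return MILLION * a + NOTE * t * omega(t, a)

def all_omega_policies():
    """Yield every 0/1-valued function omega(t, a) as a lookup-table closure."""
    inputs = list(product((0, 1), repeat=2))
    for outputs in product((0, 1), repeat=len(inputs)):
        table = dict(zip(inputs, outputs))
        yield lambda t, a, table=table: table[(t, a)]

def taking_is_never_worse() -> bool:
    # Compare V(T) with V(F) for every policy and both envelope contents.
    return all(
        payoff(1, a, omega) >= payoff(0, a, omega)
        for omega in all_omega_policies()
        for a in (0, 1)
    )

print(taking_is_never_worse())
```

The check passes for all 16 policies because refusing zeroes out the second term, while taking can only add a non-negative amount to it.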
I really like this formulation.
I think this problem would be clearer with a smaller ratio between the two payments. As it is, the risk that you might have misunderstood the problem or made an unwarranted assumption dominates, and you should not take the £10, just to be safe that you aren't making a big mistake, even if you think that's a losing move.
The large ratio is deliberate (and it's not so huge that 'all my theories are wrong!' is going to dominate).
The problem as stated is easy to misunderstand. I personally misunderstood (or "under-understood") it in at least three separate ways: 1. I considered the causal relation between Omega visiting me and making that particular prediction and Alpha choosing me as potential recipient an unknown. 2. I considered what sort of predictions Omega would make in various counterfactuals an unknown. 3. I considered the truth value of "I predicted that you will refuse this £10 if and only if there is £1000 000 in Alpha's envelope." conditional on me always accepting the money if given a chance and the envelope being empty an unknown.
Even now that my current understanding seems to have been indirectly confirmed by you, my confidence that this understanding is correct is only about 0.95. Even if you were to confirm that I currently understand it correctly in a more direct way, I doubt it would raise my confidence above 0.999. Unless the scenario was presented in a way that raised my confidence significantly higher (for example, Omega stating: "this situation is in all relevant ways identical to how you eventually came to understand the 'Omega's subcontracting to Alpha' scenario presented by Stuart_Armstrong"), I'd still refuse the £10.
Alpha has sent me the envelope, and would do so whatever Omega decided to do. The causal decision as to why Omega visited me is irrelevant.
This is irrelevant.
"I predicted that you will refuse this £10 if and only if there is £1000 000 in Alpha's envelope." is true. To avoid ambiguity, recast it as: XNOR("I predicted you will refuse this £10", "there is £1000 000 in Alpha's envelope") is true.
As for the large ratio:
Omega snatches the £10 away from you, swallows his words, runs out and returns a bit later with a check for £100 000. "Out of deference to your uncertainties", he says, sighing, "I've decided to renew the experiment with a lesser ratio. But just this once!"
No, it's not. If, conditional on me always rejecting the £10 when Omega makes this specific prediction, Omega would visit when the envelope was empty, offer £10 and make the different prediction that I'd take it (the assumption being that I wouldn't refuse it without reason so Omega can't make the true prediction that I'd do so), or if, conditional on me always taking the £10 when Omega makes this specific prediction, Omega would visit when the envelope was full, offer £10 and make the different prediction that I'd take it that would change the payoff. If only the first was true that would make the scenarios equivalent.
I take it of course.
Hmm. Some commentators appear to be assuming that you don't get to keep the contents of the envelope which Alpha sent you. The problem is not 100% clear on this issue - and it makes a difference to the answer!
As it says "You get an envelope from your good friend Alpha," I'd assume by default that you get to keep it, unless there were an explicit statement that Omega might steal it from you under some circumstance.
I'll disregard my earlier comment and assume the latter interpretation for now.
So here are the things that can (and can't) happen:
So, starting with Alpha's coin flip, here are the only possible paths:
Unlike in Newcomb's problem, in this case Omega's prediction is irrelevant to what Alpha actually did. The envelope either contains £1 000 000 or nothing, and you're going to receive it as-is, no matter what Omega says about your future actions. If the envelope has £1 000 000 in it, and you're the sort of person who would accept the £10, then Omega will not offer you this conundrum, because it couldn't honestly state the prediction as given — it would just leave you alone with your new riches. Same if the envelope is empty and you are the sort of person who would reject the £10. Your strategy affects nothing other than whether Omega will show up in the first place.
Conclusion: be the sort of person who would accept the £10. It won't affect whether you'll receive the envelope or what you find in it, and if it is empty, at least you'll get that £10 as a consolation prize.
(And now I'll read the rest of the thread to see if smarter people agree with me.)
This is a really cool puzzle. By accepting the £10, you're in the conditional "Alpha never sent you the money", but by refusing you're in the conditional "Alpha sent you the money". However, that choice doesn't actually affect Alpha sending or not sending you the money. This is unlike Newcomb's problem, where you can truly choose, acausally, what the opaque box will contain.
What gets me is the peculiarly elaborate pitfall into which I, at least, fell.
Suppose you said: "Invent a thought-experiment which could trick people who know to one-box in the classic Newcomb's paradox, into thinking that here was a higher-order analogue; the source of the error to be, that people who reason wrongly do experience a higher payoff in this case."
Perhaps it should be called Armstrong's trap. But did he make it by design, or did he just fall into it first?
It's all built on Drescher's version, just stripped down.
And I didn't fall into Drescher's trap: I incorrectly stated the correct answer, then thought about it really hard and really long, and correctly stated the correct answer.
Refuse the 10 pounds.
The assumption that you'll move Omega around or otherwise alter Omega's pattern of behavior seems speculative. Maybe Omega's going fishing for a few hundred years. Maybe she's feeling frisky and generous. Maybe I got the problem wrong.
It appears there's some chance that I'm improving my chance at a million pounds by some amount. Those "somes" may not be high, but my problem-uncertainty makes it an easy call. I see no reason to expect a lower or higher number of Omega appearances based on my decision. To the extent this might be true, it's dwarfed by the additional chance at a million pounds.
Once I have my million pounds, I'm getting a restraining order against Omega. He's always trying to screw with me.
--JRM
The answer depends on what Omega would have done if he had predicted that you will refuse the £10 iff there is nothing in Alpha's envelope. Two possibilities:
Omega1 would have brought you the envelope anyway, but said nothing else
Omega2 wouldn't have bothered to come, since there's no paradox involved.
When dealing with Omega1, take the £10, yay, free money! (there wasn't anything in the envelope anyway, otherwise Omega wouldn't have visited you, the taker-of-free-money - see Vladimir's explanation)
The post as stated doesn't tell us which Omega we're dealing with, so I would have to guess. I'd say Omega2 (so I wouldn't take the coin), but any information about Omega may switch that the other way.
When dealing with Omega2, don't take the £10, 'cause Omega2 doesn't visit takers-of-free-money when the envelope is full, and you want Omega to be visiting you!
Omega didn't bring you the envelope. It arrived before he got there.
I like this problem because it seems to operate on the same intuitions that lead to one-boxing and two-boxing for those who don't do any actual analysis, but the one-boxing intuition leads you astray (though not by much).
Personally, I'd take the £10 on reflection but would have refused the £10 based on my intuitions. I'm pretty sure Omega wouldn't be giving me £10, since if confronted with the situation I would be forced to think, "If I say 'no' now, there's lots of money in that envelope."
Take the £10, my reasoning goes as follows: if I precommit to refuse it, either I get the £1,000,000 and refuse £10, or I get £0 and omega doesn't even show up; if I precommit to accept it, either I get the £1,000,000 and omega doesn't even show up, or I get £10 from omega showing up and me accepting (the respective expected utilities being £500,000 and £500,005). I do better by precommitting to take it, so to be reflectively consistent (and win), I must now take it.
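The two expected values quoted in this comment can be reproduced with a short calculation. This is a minimal Python sketch, assuming a fair coin and assuming that Omega shows up only when his statement would come out true; `expected_payoff` is an illustrative name, not anything from the original post.

```python
from fractions import Fraction

MILLION = 1_000_000
NOTE = 10
P_FULL = Fraction(1, 2)  # assumption: Alpha's coin is fair

def expected_payoff(accepts_note: bool) -> Fraction:
    """Expected winnings for a player precommitted to accept or refuse the note."""
    total = Fraction(0)
    for envelope_full, prob in ((True, P_FULL), (False, 1 - P_FULL)):
        payoff = MILLION if envelope_full else 0
        # Assumption: Omega appears exactly when "refuse iff full" would be true.
        omega_appears = (not accepts_note) == envelope_full
        if omega_appears and accepts_note:
            payoff += NOTE
        total += prob * payoff
    return total

print(expected_payoff(True))   # 500005
print(expected_payoff(False))  # 500000
```

The £5 gap comes entirely from the half of cases where the envelope is empty and Omega turns up with the note; the envelope term is the same for both precommitments.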