DanielLC comments on Open Thread, Apr. 20 - Apr. 26, 2015 - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (350)
I've come up with an interesting thought experiment I call oracle mugging.
An oracle comes up to you and tells you that either you will give them a thousand dollars or you will die in the next week. They refuse to tell you which. They have done this many times, and everyone has either given them money or died. The oracle isn't threatening you. They just go around and find people who will either give them money or die in the near future, and tell them that.
Should you pay the oracle? Why or why not?
I wouldn't pay. Let's convert it to a mundane psychological experiment, by replacing precognition with precommitment (which is the right approach according to UDT):
1) Ten participants sign up for the experiment.
2) One participant is randomly chosen to be the "loser". We know who the "loser" is, but don't tell the participants.
3) Also, each participant tells us in private whether they are a "payer" or "non-payer".
4) Each "payer" who is not a "loser" pays $10 (this corresponds to paying the oracle and staying alive). The "loser" pays $100 (this corresponds to dying). Everyone else pays nothing.
It seems obvious that you should choose to be a "non-payer", right?
In terms of the original problem, if you're the kind of person who would pay the oracle if you were approached, you're causing the oracle to approach you, so you're paying for nothing.
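The arithmetic behind the ten-participant experiment above can be checked with a minimal sketch. The dollar amounts and group size come from the description; the function name `expected_cost` is only illustrative, and the sketch assumes each participant is equally likely to be the "loser".

```python
LOSER_COST = 100      # paid by the one "loser" (stands in for dying)
PAYER_COST = 10       # paid by a "payer" who is not the loser
N_PARTICIPANTS = 10   # one of them is chosen at random as the loser


def expected_cost(is_payer: bool) -> float:
    """Expected payment for one participant, given their declared type."""
    p_loser = 1 / N_PARTICIPANTS
    # Being the loser costs the same regardless of the declared type.
    cost_if_not_loser = PAYER_COST if is_payer else 0
    return p_loser * LOSER_COST + (1 - p_loser) * cost_if_not_loser


payer_cost = expected_cost(True)
non_payer_cost = expected_cost(False)
print(f"payer: {payer_cost:.2f}, non-payer: {non_payer_cost:.2f}")
```

Declaring yourself a "payer" adds an expected cost of 0.9 × $10 and does nothing to the chance of being the loser, which is the point of the analogy.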
I don't think that it's specified in the OP that the oracle considers it likely that you will pay or indeed approaches people based on their likelihood to pay.
But it is! It really depends on how many levels of "If I know that the oracle knows that I know" you want to go into. If the oracle is able to factor your decision to pay or not into whether they tell you that you should pay, then that's a super-duper-oracle.
Also, paying and dying is permissible, and not great either.
So, as in most such problems, there's an important difference between the epistemological question ("should I pay, given what I know?") and the more fundamental question ("should I pay, supposing this description is accurate?"). Between expected value and actual value, in other words.
It's easy to get those confused, and my intuitions about one muddy my thinking about the other, so I like to think about them separately.
WRT the epistemological question, that's hard to answer without a lot of information about how likely I consider accurate oracular ability, how confident I am that the examples of accurate prediction I'm aware of are a representative sample, etc. etc. etc., all of which I think is both uncontroversial and uninteresting. Vaguely approximating all of that stuff I conclude that I shouldn't pay the oracle, because I'm not justified in being more confident that the situation really is as the oracle describes it, than that the oracle is misrepresenting the situation in some important way. My expected value of this deal in the real world is negative.
WRT the fundamental question... of course, you leave a lot of details unspecified, but I don't want to fight the hypothetical here, so I'm assuming that the "overall gist" of your description applies: I'm paying $1K for QALYs I would not have had access to without the oracle's offer. That's a good deal for me; I'm inclined to take it. (Though I might try to negotiate the price down.)
The knock-on effect is that I encourage the oracle to keep making this offer... but that's good too; I want the oracle to keep making the offer. QALYs for everyone!
So, yes, I should pay the oracle, though I should also implement decision procedures that will lead me to not pay the oracle.
I think a key part of the question, as I see it, is to formalize the difference between treatment effects and selection effects (in the context where your actions might reflect a selection effect, and we can't make the normally reasonable assumption that our actions result in treatment effects). An oracle could look into the future, find a list of people who will die in the next week, and a list of people who would pay them $1000 if presented with this prompt, and present the prompt to the exclusive or of those two lists. This doesn't give anyone QALYs they wouldn't have had otherwise.
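The pure selection-effect reading described above can be illustrated with a small simulation. This is a sketch under assumptions not in the original: death and the disposition to pay are drawn independently, the probabilities are made up, and the function name `simulate` is hypothetical.

```python
import random


def simulate(n_people: int, p_die: float, p_payer: float, seed: int = 0):
    """Return {would_pay: (death_rate, visit_rate)} under an XOR-selecting oracle."""
    rng = random.Random(seed)
    counts = {True: [0, 0, 0], False: [0, 0, 0]}  # [people, deaths, visits]
    for _ in range(n_people):
        will_die = rng.random() < p_die      # fixed by the environment (assumption)
        would_pay = rng.random() < p_payer   # the person's disposition
        visited = will_die != would_pay      # exclusive or of the two lists
        c = counts[would_pay]
        c[0] += 1
        c[1] += will_die
        c[2] += visited
    return {k: (c[1] / c[0], c[2] / c[0]) for k, c in counts.items()}


rates = simulate(n_people=100_000, p_die=0.01, p_payer=0.5)
for would_pay, (death_rate, visit_rate) in rates.items():
    print(f"would_pay={would_pay}: death rate {death_rate:.4f}, visit rate {visit_rate:.4f}")
```

In this model the death rate is roughly the same for both dispositions; the disposition to pay only changes how often the oracle shows up, and everyone it visits still either pays or dies.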
And so I find my intuitions are guided mostly by the identification of the prompter as an "oracle" instead of a "wizard" or "witch." Oracle implies selection effect; wizard or witch implies treatment effect.
Leaving aside lexical questions about the connotations of the word "oracle", I certainly agree that if the entity's accuracy represents a selection effect, then my reasoning doesn't hold.
Indeed, I at least intended to say as much explicitly ("I don't want to fight the hypothetical here, so I'm assuming that the "overall gist" of your description applies: I'm paying $1K for QALYs I would not have had access to without the oracle's offer.") in my comment.
That said, it's entirely possible that I misread what the point of DanielLC's hypothetical was.
DanielLC said:
I interpreted that as a selection effect, so my answer recommended not paying. Now I realize that it may not be entirely a selection effect. Maybe the oracle is also finding people whose life would be saved by making them $1000 poorer, for various exotic reasons. But if the probability of that is small enough, my answer stays the same.
Right. Your reading is entirely sensible, and more likely in "the real world" (by which I mean something not-well-thought-through about how it's easier to implement the original description as a selection effect). I merely chose to bypass that reading and go with what I suspected (perhaps incorrectly) the OP actually had in mind.
It's just a version of Newcomb's problem with negative outcomes instead of positive.
Presumably the oracle makes its offer only to people from two classes: (1) Those who will die next week AND will not pay $1000; and (2) Those who will pay $1000 AND not die next week. Since it's the oracle it can identify these people and make its offer only to them. If you got this offer, you are in one of the above classes but you "don't know" in which.
Clearly you give them money, since otherwise you are almost certain to die. It's just one-boxing in disguise.
Pay iff you would pay $1000 to avoid learning of your death in the last week of your life. If you don't pay, the oracle only shows up when you are about to die anyway.
Variation on this:
An oracle comes up to you and tells you that you will give it a thousand dollars. This oracle has done this many times, and every time it has told people this, the people have given the oracle a thousand dollars. This oracle, like the other one, isn't threatening you. It just goes around finding people who will give it money. Should you give the oracle money?
I believe in testing rules and breaking things. So no. Don't give and see what happens.
Under UDT: pay iff you need human contact so much that you'd spend $1000 to be visited by a weird oracle who goes around posing strange decision theory dilemmas.
No, but you will.
Every decision theory I throw at it says either don't pay or Error: Divide By Zero. Is this a trick question?
I don't know what "error: divide by zero" means in this context. Could you please clarify? (If you're suggesting that the problem is ill-posed under some decision theories because the question assumes that it is possible to make a choice but the oracle's ability to predict you means you cannot really choose, why doesn't that apply to the original problem?)
You want to figure out whether to do as the oracle asks or not. To do this, you would like to predict what will happen in each case. But you have no evidence concerning the case where you don't do as it asks, because so far everyone has obliged. So, e.g., Pr(something good happens | decline oracle's request) has Pr(decline oracle's request) in the denominator, and that's zero.
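A minimal sketch of the failure described above, assuming the conditional probability is estimated naively from observed counts. The variable names are hypothetical.

```python
def conditional_frequency(n_joint: int, n_condition: int) -> float:
    """Naive estimate of Pr(A | B) as count(A and B) / count(B)."""
    return n_joint / n_condition


# So far everyone has obliged, so there are no observed declines at all.
observed_declines = 0
observed_good_after_decline = 0

try:
    estimate = conditional_frequency(observed_good_after_decline, observed_declines)
except ZeroDivisionError:
    estimate = None  # the data say nothing about the "decline" branch

print(estimate)
```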
I think you can say something similar about the original problem. P(decline oracle's request) can (for the new problem) also be phrased as P(oracle is wrong). And P(oracle is wrong) is zero in both problems; there's no evidence in either the original problem or the new problem concerning the case where the oracle is wrong.
Of course, the usual Newcomb arguments apply about why you shouldn't consider the case where the oracle is wrong, but they don't distinguish the problems.
That's a forward-looking probability and is certainly not zero.
In the absence of evidence you just fall back on your prior.
In order to get Error: Divide By Zero, you have to be using a particular kind of decision theory and assume P(decline oracle's request) = 0.
Your prior for what?
For the baseline, "underlying" probability of the oracle's request being declined. Roughly speaking, if you have never seen X happen, it does not mean that X will never happen (=has a probability of zero).
This assumes you're a passive observer, by the way -- if you are actively making a decision whether to accept or decline the request you can't apply Bayesian probabilities to your own actions.
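One way to make "fall back on your prior" concrete is a Beta prior over the underlying decline rate, updated on the observed record. This is a sketch for the passive-observer case only; the uniform Beta(1, 1) prior and the figure of 100 past requests are assumptions for illustration, not something stated in the thread.

```python
def posterior_mean(events: int, trials: int,
                   prior_a: float = 1.0, prior_b: float = 1.0) -> float:
    """Posterior mean of a rate under a Beta(prior_a, prior_b) prior."""
    return (prior_a + events) / (prior_a + prior_b + trials)


# Hypothetical record: 100 requests observed, none declined.
p_decline = posterior_mean(events=0, trials=100)
print(f"estimated P(decline) = {p_decline:.4f}")
```

The estimate is small but strictly positive, so conditioning on "decline" no longer divides by zero.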
I really want to say that you should pay. Obviously you should precommit to not paying if you can, and then the oracle will never visit you to begin with unless you are about to die anyway. But if you can't do that, and the oracle shows up at your door, you have a choice to pay and live or not pay and die.
Again, obviously it's better to not pay and then you never end up in this situation in the first place. But when it actually happens and you have to sit down and choose between paying it to go away or dying, I would choose to pay it.
It's all well and good to say that some decision theory results in optimal outcomes. It's another to actually implement it in yourself, to make sure every counterfactual version of yourself makes the globally optimal choice, even if there is a huge cost to some of them.
The traditional LW solution to this is that you precommit once and for all to this: Whenever I find myself in a situation where I wish that I had committed to acting in accordance with a rule R I will act in accordance with R.
That's great to say, but much harder to actually do.
For example, suppose Omega either pays people $1,000 or asks them to commit suicide, but it only asks people it knows with 100% certainty will not do it; otherwise it gives them the money.
The best strategy is to precommit to suicide if Omega asks. But if Omega does ask, I doubt most lesswrongers would actually go through with it.
So the standard formulation of a Newcomb-like paradox continues to work if you assume that Omega has a merely 99% accuracy.
Your formulation, however, doesn't work that way. If you precommit to suicide when Omega asks, but Omega is sometimes wrong, then you commit suicide with 1% probability (in exchange for having $990 expected winnings). If you don't precommit, then with a 1% chance you might get $1000 for free. In most cases, the second option is better.
Thus, the suicide strategy requires very strong faith in Omega, which is hard to imagine in practice. Even if Omega actually is infallible, it's hard to imagine evidence extraordinary enough to convince us that Omega is sufficiently infallible.
(I think I am willing to bite the suicide bullet as long as we're clear that I would require truly extraordinary evidence.)
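The 99% case above can be worked through in a short sketch. The $1,000 prize and the accuracy figure come from the thread; putting a dollar value on one's own life in order to find a break-even point is an assumption added here purely for illustration, and both function names are hypothetical.

```python
PRIZE = 1000


def expected_outcomes(accuracy: float, precommit: bool):
    """Return (expected winnings in dollars, probability of committing suicide)."""
    if precommit:
        # Omega is right: it predicts compliance and pays.
        # Omega is wrong: it asks, and the precommitted agent complies.
        return accuracy * PRIZE, 1 - accuracy
    # Omega is right: it asks, the agent refuses, nothing happens.
    # Omega is wrong: it pays out for free.
    return (1 - accuracy) * PRIZE, 0.0


def break_even_life_value(accuracy: float) -> float:
    """Dollar value of life at which both strategies tie (illustrative assumption)."""
    if accuracy >= 1:
        return float("inf")  # an infallible Omega never asks a precommitted agent
    return (2 * accuracy - 1) * PRIZE / (1 - accuracy)


winnings, p_suicide = expected_outcomes(0.99, precommit=True)
free_money, _ = expected_outcomes(0.99, precommit=False)
threshold = break_even_life_value(0.99)
print(f"precommit: ${winnings:.0f} expected, {p_suicide:.0%} chance of suicide")
print(f"no precommit: ${free_money:.0f} expected")
print(f"precommitting only wins if life is valued below about ${threshold:,.0f}")
```

Under these assumptions the threshold grows without bound as accuracy approaches 1, which matches the claim that the suicide strategy needs very strong faith in Omega.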
Please Don't Fight the Hypothetical. I agree with you if you are only 99% sure, but the premise is that you know Omega is right with certainty. Obviously that is implausible, but so is the entire situation with an omniscient being asking people to commit suicide, or oracles that can predict if you will die.
But if you like you can have a lesser cost, like Omega asking you to pay $10,000. Or some amount of money significant enough to seriously consider just giving away.
I did say what I would do, given the premise that I know Omega is right with certainty. Perhaps I was insufficiently clear about this?
I am not trying to fight the hypothetical, I am trying to explain why one's intuition cannot resist fighting it. This makes the answer I give seem unintuitive.
This is essentially just another version of the smoking lesion problem, in that there is no connection, causal or otherwise, between the thing you care about and the action you take. Your decision theory has no specific effect on your likelihood of dying, that being determined entirely by environmental factors that do not even attempt to predict you. All you are paying for is to determine whether or not you get a visit from the oracle.
ETA: Here's a UDT game tree (see here for an explanation of the format) of this problem, under the assumption that oracle visits everyone meeting his criteria, and uses exclusive-or:
ETA2: More explanation: the colours are states of knowledge. Blue = oracle asks for money, Orange = they leave you alone. Let's say the odds of being healthy are α. If you Pay, the expected reward is α(-1000) + (1-α) DEATH; if you Don't Pay, the expected reward is α(0) + (1-α) DEATH. Clearly (under UDT) paying is worse by a term of -1000α.
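The comparison of the two policies can be checked numerically. In this sketch `death_utility` is a hypothetical stand-in for DEATH; its value is arbitrary and cancels out of the difference.

```python
def expected_reward(alpha: float, pays: bool, death_utility: float) -> float:
    """Expected reward of a policy, where alpha is the chance of being healthy."""
    payment = -1000 if pays else 0
    return alpha * payment + (1 - alpha) * death_utility


DEATH = -1_000_000  # arbitrary large negative utility (assumption)

for alpha in (0.5, 0.9, 0.99):
    gap = expected_reward(alpha, True, DEATH) - expected_reward(alpha, False, DEATH)
    print(f"alpha={alpha}: paying changes expected reward by {gap:.2f}")
```

The gap tracks -1000α whatever value is chosen for `DEATH`, since the death term is identical in both policies.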