I have sympathy for the commenters who agreed to pay outright (Nesov and ata), but viewed purely logically, this problem is underdetermined, somewhat like Transparent Newcomb's Problem (thanks, Manfred). This is a subtle point; bear with me.
Let's assume you precommit to not pay if asked. Now take an Omega that strictly follows the rules of the problem, but also has one additional axiom: I will award the player $1000 no matter what. This Omega can easily prove that the world in which it asks you to pay is logically inconsistent, and then it concludes that in that world y...
This is also isomorphic to the absent-minded driver problem with different utilities (and mixed strategies*), it seems. Specifically, if you consider the abstract idealized decision theory you implement to be "you", you make the same decision in two places: once in Omega's brain while he predicts you, and again if he asks you to pay up. Therefore the graph can be transformed from this
into this
which looks awfully like the absent-minded driver. Interesting.
Additionally, modifying the utilities involved ($1000 -> death; swap -$100 and $0) gives Parfit's Hitchhiker.
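To make the absent-minded driver comparison concrete, here's a tiny sketch of the planning-stage calculation for the standard driver problem (using the usual textbook payoffs of 0 / 4 / 1, which are not from the original post): the same continue probability p enters at both indistinguishable intersections, just as the same policy here is evaluated once inside Omega's prediction and once when you're actually asked to pay.

```python
# Planning-stage expected utility for the standard absent-minded driver:
# the driver can't tell the two intersections apart, so a single continue
# probability p is applied at both decision points.
def driver_eu(p: float) -> float:
    exit_first    = (1 - p) * 0      # exit at the first intersection: payoff 0
    exit_second   = p * (1 - p) * 4  # continue, then exit at the second: payoff 4
    continue_both = p * p * 1        # continue past both: payoff 1
    return exit_first + exit_second + continue_both

best_p = max((i / 1000 for i in range(1001)), key=driver_eu)
print(best_p, driver_eu(best_p))  # ~0.667, ~1.333
```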
I contend it's also isomorphic to the very real-world problems of hazing, abuse cycles, and akrasia.
The common dynamic across all these problems is that "You could have been in a winning or losing branch, but you've learned that you're in a losing branch, and your decision to scrape out a little more utility within that branch takes away more utility from (symmetric) versions of yourself in (potentially) winning branches."
By paying, you reduce the probability of the low-utility situation you're experiencing, and correspondingly increase the probability of the counterfactual in which Omega awards you the $1000, thus increasing overall expected utility. Reality is so much worse than its alternatives that you're willing to pay to make it less real.
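For concreteness, here's a minimal sketch of that expected-utility comparison under one reading of the problem (my assumption, since the exact setup is disputed in this thread): Omega awards $1000 if it predicts you would pay when asked, otherwise it asks you for $100, and its prediction simply matches your policy.

```python
# Expected utility of each deterministic policy, assuming (my reading):
#   - Omega awards $1000 if it predicts "pay when asked", else it asks for $100
#   - Omega's prediction of your policy is perfect
def expected_utility(policy_pays_when_asked: bool) -> int:
    if policy_pays_when_asked:
        return 1000   # predicted to pay -> awarded; the "asked" branch never happens
    else:
        return 0      # predicted to refuse -> asked, you refuse, nothing changes hands

print(expected_utility(True))   # 1000
print(expected_utility(False))  # 0
```

On this reading, the branch where you're asked and actually pay is off the equilibrium path for a deterministic policy, which is exactly the underdetermination cousin_it points at above.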
'a' should use a randomizing device so that he pays 51% of the time and refuses 49% of the time. Omega, aware of this strategy but presumably unable to hack the randomizing device, achieves its best score by predicting 'pay' 100% of the time.
I am making an assumption here about Omega's cost function - i.e. that Type 1 and Type 2 errors are equally undesirable. So, I agree with cousin_it that the problem is underspecified.
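Here's a rough sketch of that strategy under the same assumptions (Omega maximizes plain prediction accuracy, sees the player's mixing probability, but cannot see the randomizer's output, and predicting "pay" corresponds to awarding $1000 while predicting "refuse" corresponds to asking for $100):

```python
# Omega's accuracy-maximizing prediction, and the player's resulting payoff,
# assuming Type 1 and Type 2 errors are equally costly to Omega.
def omega_best_prediction(p_pay: float) -> str:
    return "PAY" if p_pay >= 0.5 else "REFUSE"

def player_expected_payoff(p_pay: float) -> float:
    if omega_best_prediction(p_pay) == "PAY":
        return 1000.0                 # awarded, never asked
    return p_pay * (-100.0)           # asked; pays $100 with probability p_pay

print(player_expected_payoff(0.51))   # 1000.0
print(player_expected_payoff(0.49))   # -49.0
```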
The constraint P(o=AWARD) = P(a=PAY) that appears in the diagram does not seem to match the problem statement. It is also ambiguous. ...
Nice diagram. By the way, the assertion "Omega asks you to pay him $100" doesn't make sense unless your decision is required to be a mixed strategy. I.e., P(a = PAY) < 1. In fact, P(a = PAY) must be significantly less than the strength of your beliefs about Omega.
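One way to see that last point (my own framing, not necessarily the commenter's): treat your credence in Omega as a prior on "perfect predictor vs. random guesser" and update on having been asked.

```python
# Posterior credence that Omega is a perfect predictor, given that it asked
# you to pay. The alternative hypothesis -- an Omega that guesses your choice
# at random -- is a simplification introduced here for illustration.
def posterior_omega_is_perfect(prior: float, p_pay: float) -> float:
    like_perfect = 1 - p_pay   # a perfect Omega only asks if you'd refuse
    like_guesser = 0.5         # a guessing Omega asks half the time
    return prior * like_perfect / (prior * like_perfect + (1 - prior) * like_guesser)

print(posterior_omega_is_perfect(0.99, 0.05))    # ~0.995: belief in Omega survives
print(posterior_omega_is_perfect(0.99, 0.999))   # ~0.165: being asked mostly refutes Omega
```

So if P(a = PAY) is large relative to your confidence in Omega, being asked should mostly convince you that Omega isn't what it's claimed to be.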
Of course I don't pay. Omega has predicted that I won't pay if he asked, and Omega's predictions are by definition correct. I don't see how this is a decision problem at all.
Would you agree that, given that Omega asks you, you are guaranteed by the rules of the problem to not pay him?
If you are inclined to take the (I would say) useless way out and claim it could be a simulation, consider the case where Omega makes sure the Omega in its simulation is also always right - creating an infinite tower of recursion in which the density of simulations where Omega is wrong is 0.
This problem is roughly isomorphic to the branch of Transparent Newcomb (version 1, version 2) where box B is empty, but it's simpler.
Here's a diagram: