## Simulating Problems

1 30 January 2013 01:14PM

Apologies for the rather mathematical nature of this post, but it seems to have some implications for topics relevant to LW. Prior to posting I looked for literature on this but was unable to find any; pointers would be appreciated.

In short, my question is: How can we prove that any simulation of a problem really simulates the problem?

I want to demonstrate that this is not as obvious as it may seem by using the example of Newcomb's Problem. The issue here is of course Omega's omniscience. If we construct a simulation with the rules (payoffs) of Newcomb, an Omega that is always right, and an interface for the agent to interact with the simulation, will that be enough?

Let's say we simulate Omega's prediction by a coin toss and repeat the simulation (without payoffs) until the coin toss matches the agent's decision. This seems to adhere to all specifications of Newcomb and, if the coin toss is hidden, is in fact indistinguishable from it from the agent's perspective. However, if the agent knows how the simulation works, a CDT agent will one-box, whereas it is assumed that the same agent would two-box in 'real' Newcomb. Keeping the agent ignorant of how the simulation works is not a solution, so this simulation appears not to actually simulate Newcomb.
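To make the construction concrete, here is a minimal sketch of that coin-toss simulation (the function and payoff layout are mine, not part of the original problem):

```python
import random

def coin_toss_newcomb(agent):
    """Rerun the game until the coin-toss 'prediction' matches the
    agent's decision; only the matching run's payoff counts."""
    while True:
        prediction = random.choice(["one-box", "two-box"])
        decision = agent()           # the agent never sees the toss
        if decision == prediction:
            break
    # Box A holds $1,000,000 exactly when one-boxing was 'predicted'.
    if decision == "one-box":
        return 1_000_000             # contents of box A
    return 1_000                     # empty box A plus box B
```

An agent that knows this mechanism can reason causally: its decision selects which branch the counting run falls into, so even a CDT agent one-boxes here, which is the discrepancy described above.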

Pointing out differences is of course far easier than proving that none exist. Suppose there is a problem for which we have no idea what decisions agents would make, and we want to build a real-world simulation to find out exactly that. How can we prove that this simulation really simulates the problem?

(Edit: Apparently it wasn't apparent that this is about problems in terms of game theory and decision theory. Newcomb, Prisoner's Dilemma, Iterated Prisoner's Dilemma, Monty Hall, Sleeping Beauty, Two Envelopes, that sort of stuff. Should be clear now.)

## A solvable Newcomb-like problem - part 3 of 3

3 06 December 2012 01:06PM

This is the third part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

Part 1 - stating the problem
Part 2 - some mathematics
Part 3 - towards a solution

In many situations we can say "For practical purposes a probability of 0.9999999999999999999 is close enough to 1 that for the sake of simplicity I shall treat it as being 1, without that simplification altering my choices."

However, there are some situations where the distinction does significantly alter the character of the situation. So, when one is studying a new situation and one is not yet sure which of those two categories the situation falls into, the cautious approach is to re-frame the probability as being (1 - δ) where δ is small (e.g. 10 to the power of -12), and then examine the characteristics of the behaviour as δ tends towards 0.

The LessWrong wiki describes Omega as a super-powerful AI analogous to Laplace's demon, who knows the precise location and momentum of every atom in the universe, limited only by the laws of physics (so, if time travel isn't possible and some of our current thoughts on Quantum Mechanics are correct, then Omega's knowledge of the future is probabilistic, being limited by uncertainty).

For the purposes of Newcomb's problem, and the rationality of Fred's decisions, it doesn't matter how close to that level of power Omega actually is.   What matters, in terms of rationality, is the evidence available to Fred about how close Omega is to having that level of power; or, more precisely, the evidence available to Fred relevant to Fred making predictions about Omega's performance in this particular game.

Since this is a key factor in Fred's decision, we ought to be cautious.  Rather than specify when setting up the problem that Fred knows with a certainty of 1 that Omega does have that power, it is better to specify a concrete level of evidence that would lead Fred to assign a probability of (1 - δ) to Omega having that power, and then examine the effect upon which option in the box problem it is rational for Fred to pick, as δ tends towards 0.

The Newcomb-like problem stated in part 1 of this sequence contains an Omega to which it is rational for Fred to assign a less-than-unity probability of being able to perfectly predict Fred's choices.  By using bets as analogies to the sort of evidence Fred might have available to him, we create an explicit variable that we can then manipulate to alter the precise probability Fred assigns to Omega's abilities.

The other nice feature of the Newcomb-like problem given in part 1, is that it is explicitly solvable using the mathematics given in part 2.  By making randomness an external feature (the device Fred brings with him) rather than purely a feature of Fred's internal mind, we can acknowledge the question of Omega being able to predict quantum events, capture it as a variable, and take it into account when setting out the payoff matrix for the problem.

This means that, instead of Fred having to think "When I walked into this room I was determined to pick one-box.  As far as anyone knew or could predict, including myself, I intended to pick one-box.  However nothing I do now can change Omega's decision - the money is already in the box.  So I've nothing to lose by changing my mind.", Fred can now allocate a specific probability to whether Omega could predict Fred's chance of changing his mind in such circumstances, and Fred can take that into account in his strategy by making his chance of changing strategy explicit and external - basing it upon a random number device.

Or, to put it another way, we are modelling a rational human who has a specific finite chance of talking himself into overriding a pre-committed strategy as being made up from two components: a component that will infallibly stick to a pre-committed strategy, plus a component with a known chance of change.  We then treat the combined rational human as being someone infallibly committed to a meta-strategy that includes a chance of change - a mixed equilibrium, from Omega's point of view.

Ok, time to look at the numbers and draw a pretty diagram...

Fred is Player A, and he has two meta options:
A1 - play it safe.  100% chance of one-box and 0% chance of two-box
A2 - take a risk.   Mentally pre-commit to using the device to produce 99% chance of one-box and 1% chance of two-box.

Omega is Player B, and he has two meta options:
B1 - reward risk.  Not attempt to distinguish between the mental state of Fred taking 1% risk and Fred playing it safe.
B2 - punish risk.  Attempt to distinguish and, if Omega guesses Fred is taking risk rather than playing safe, punish it.

We'll start out by assuming that if Omega does attempt to distinguish, then Omega will have a 1 in 10,000 false positive rate (thinking Fred is going to use the device, when actually Fred intends to play it safe) and a 1 in 10,000 false negative rate (thinking Fred is going to play it safe, when actually Fred intends to use the device).

### A1 vs B1

Fred gains \$1,000,000
Omega loses \$1,000,000 to Fred but gains \$1,000,000,000 from Alpha, for a net gain of \$999,000,000

### A2 vs B1

99% of the time, Fred gains \$1,000,000 and Omega net gains \$999,000,000
1% of the time, Fred gains \$1,001,000 and Omega net loses \$10,001,001,000

Combining those gives an average of:
Fred gains: \$1,000,010
Omega gains: \$888,999,990

### A1 vs B2

99.99% of the time, Omega correctly discerns that Fred is playing safe
Fred gains \$1,000,000
Omega gains \$999,000,000

0.01% of the time, Omega falsely believes that Fred is taking a risk, and punishes that by putting \$0 in Box A
Fred gains \$0
Omega loses \$10,000,000,000

Combining those gives an average of:
Fred gains: \$999,900
Omega gains: \$997,900,100

### A2 vs B2

In 100 trials out of 1,000,000 trials Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device.  Of these:

In 1 trial out of 1,000,000 trials: Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device and, in this case, the device picks two-box
==> Fred gains \$1,001,000
==> Omega loses \$10,001,001,000

In 99 trials out of 1,000,000 trials: Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device and, in this case, the device picks one-box
==> Fred gains \$1,000,000
==> Omega gains \$999,000,000

In 999,900 trials out of 1,000,000 trials Omega correctly thinks that Fred is going to take the risk of using the device.  Of those:

In 9,999 trials out of 1,000,000 trials: Omega correctly thinks that Fred is going to take the risk of using the device and, in this case, the device picks two-box
==> Fred gains \$1,000
==> Omega gains \$999,999,000

In 989,901 trials out of 1,000,000 trials: Omega correctly thinks that Fred is going to take the risk of using the device and, in this case, the device picks one-box
==> Fred gains \$0
==> Omega loses \$10,000,000,000

Combining those gives an average of:
Fred gains \$110
Omega loses \$9,888,922,110
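As a check, both averages follow directly from the outcome table above (a quick sketch; the counts divide 1,000,000 evenly, so integer division is exact):

```python
# The A2-vs-B2 outcome table, per 1,000,000 trials:
# (number of trials, Fred's gain, Omega's gain)
outcomes = [
    (1,       1_001_000, -10_001_001_000),  # missed risk-taker, device two-boxes
    (99,      1_000_000,      999_000_000), # missed risk-taker, device one-boxes
    (9_999,       1_000,      999_999_000), # caught risk-taker, device two-boxes
    (989_901,         0, -10_000_000_000),  # caught risk-taker, device one-boxes
]

trials = sum(count for count, _, _ in outcomes)
fred_avg = sum(count * fred for count, fred, _ in outcomes) // trials
omega_avg = sum(count * omega for count, _, omega in outcomes) // trials
print(trials, fred_avg, omega_avg)   # 1000000 110 -9888922110
```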

If Fred's best estimate, beforehand, of the chances of Omega choosing meta strategy B1 is P, then

Fred's expected gain from choosing meta strategy A1 would be: \$1,000,000 P + \$999,900 (1-P)
Fred's expected gain from choosing meta strategy A2 would be: \$1,000,010 P + \$110 (1-P)

We can ask ourselves at what value of P these two strategies would return the same expected gain:

\$1,000,000 P + \$999,900 (1-P)   =    \$1,000,010 P + \$110 (1-P)
\$999,790 (1-P)   =                \$10 P
\$999,790           =        \$999,800 P
P    ≈ 0.99999
(1-P)  ≈ 0.00001

In other words, Fred thinking that Omega has even a 1 in 100,000 chance of being vindictive should be enough to deter Fred from taking the risky strategy.
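A sketch of that break-even calculation, using the per-case expected gains worked out above (exact arithmetic via fractions; variable names are mine):

```python
from fractions import Fraction

# Fred's expected gains for each (Fred, Omega) meta-strategy pair,
# taken from the four cases worked through above.
a1_vs_b1, a1_vs_b2 = 1_000_000, 999_900   # A1: play it safe
a2_vs_b1, a2_vs_b2 = 1_000_010, 110       # A2: take a risk

# Solve  a1_vs_b1*P + a1_vs_b2*(1-P) == a2_vs_b1*P + a2_vs_b2*(1-P)  for P.
P = Fraction(a1_vs_b2 - a2_vs_b2,
             (a1_vs_b2 - a2_vs_b2) + (a2_vs_b1 - a1_vs_b1))
print(float(P))       # ≈ 0.99999
print(float(1 - P))   # ≈ 0.00001
```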

But how does that look from Omega's point of view?   If Omega thinks that Fred's chance of picking meta strategy A1 is Q, then what is the cost to Omega of picking B2 1 in 100,000 times?

Omega's expected gain from choosing meta strategy B1 would be: \$999,000,000 Q + \$888,999,990 (1-Q)
Omega's expected gain from choosing meta strategy B2 would be: \$997,900,100 Q - \$9,888,922,110 (1-Q)

0.99999 { \$999,000,000 Q + \$888,999,990 (1-Q)  } + 0.00001 { \$997,900,100 Q - \$9,888,922,110 (1-Q) }
= (1 - 0.00001) { \$888,999,990 + \$110,000,010 Q } + 0.00001 { - \$9,888,922,110  + \$10,886,822,210 Q  }
= \$888,999,990 + \$110,000,010 Q + 0.00001 { - \$9,888,922,110  + \$10,886,822,210 Q - \$888,999,990 - \$110,000,010 Q }
= \$888,999,990 + \$110,000,010 Q + 0.00001 { - \$10,777,922,100 + \$10,776,822,200 Q }
= ( \$888,999,990 - \$107,779.221) + ( \$110,000,010 + \$107,768.222 ) Q
= \$888,892,211 + \$110,107,778 Q
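Those two coefficients can be recomputed directly from the per-outcome payoffs (a sketch using exact arithmetic; the dictionary layout is mine):

```python
from fractions import Fraction as F

# Omega's expected gain for each (Omega, Fred) meta-strategy pair,
# rebuilt from the per-outcome payoffs in the four cases above.
b1 = {"A1": F(999_000_000),
      "A2": F(99, 100) * 999_000_000 - F(1, 100) * 10_001_001_000}
b2 = {"A1": F(997_900_100),
      "A2": F(-9_888_922_110)}

p_b1 = F(99_999, 100_000)    # Omega plays B1 with probability 0.99999

def omega_mix(Q):
    """Omega's expected gain when Fred plays A1 with probability Q."""
    return (p_b1 * (b1["A1"] * Q + b1["A2"] * (1 - Q)) +
            (1 - p_b1) * (b2["A1"] * Q + b2["A2"] * (1 - Q)))

constant = float(omega_mix(0))               # ≈ 888,892,211
slope = float(omega_mix(1) - omega_mix(0))   # ≈ 110,107,778
```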

Perhaps a meta strategy of 1% chance of two-boxing is not Fred's optimal meta strategy.  Perhaps, at that level compared to Omega's ability to discern, it is still worth Omega investing in being vindictive occasionally, in order to deter Fred from taking risk.

But, given sufficient data about previous games, Fred can make a guess at Omega's ability to discern.  And likewise Omega, by including in the record of past games occasions when Omega has falsely accused a human player of taking risk, can signal to future players where Omega's boundaries are.

We can plot graphs of these to find the point at which Fred's meta strategy and Omega's meta strategy are in equilibrium - the point where, if Fred took any larger chances, it would start becoming worth Omega's while to punish risk sufficiently often that it would no longer be in Fred's interests to take the risk.   Precisely where that point is will depend on the numbers we picked in Part 1 of this sequence.  By exploring the space created by using each variable number as a dimension, we can divide it into regions characterised by which strategies dominate within that region.

Extrapolating that as δ tends towards 0 should then carry us closer to a convincing solution to Newcomb's Problem.

Back to Part 1 - stating the problem
Back to Part 2 - some mathematics
This is   Part 3 - towards a solution

## A solvable Newcomb-like problem - part 2 of 3

0 03 December 2012 04:49PM

This is the second part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

Part 1 - stating the problem
Part 2 - some mathematics
Part 3 - towards a solution

In game theory, a payoff matrix is a way of presenting the results of two players simultaneously picking options.

For example, in the Prisoner's Dilemma, Player A gets to choose between option A1 (Cooperate) and option A2 (Defect) while, at the same time Player B gets to choose between option B1 (Cooperate) and option B2 (Defect).   Since years spent in prison are a negative outcome, we'll write them as negative numbers:
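The payoff matrix referred to here appears to have been an image; written out as code, a set of values consistent with the description below is (the -2/-2 mutual-cooperation entry is an assumed standard value - the text only pins down the other cells):

```python
# Prisoner's Dilemma payoffs as (years for A, years for B), written as
# negatives because years in prison are a bad outcome.
# A1/B1 = Cooperate, A2/B2 = Defect.
payoffs = {
    ("A1", "B1"): (-2, -2),   # both cooperate (assumed standard value)
    ("A1", "B2"): (-5,  0),   # A cooperates, B defects
    ("A2", "B1"): ( 0, -5),   # A defects, B cooperates
    ("A2", "B2"): (-4, -4),   # both defect
}
```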

So, if you look at the bottom right hand corner, at the intersection of Player A defecting (A2) and Player B defecting (B2) we see that both players end up spending 4 years in prison.   Whereas, looking at the bottom left we see that if A defects and B cooperates, then Player A ends up spending 0 years in prison and Player B ends up spending 5 years in prison.

Another familiar example we can present in this form is the game Rock-Paper-Scissors.

We could write it as a zero sum game, with a win being worth 1, a tie being worth 0 and a loss being worth -1:

But it doesn't change the mathematics if we give both players 2 points each round just for playing, so that a win becomes worth 3 points, a tie becomes worth 2 points and a loss becomes worth 1 point.  (Think of it as two players in a game show being rewarded by the host, rather than the players making a direct bet with each other.)

If you are Player A, and you are playing against a Player B who always chooses option B1 (Rock), then your strategy is clear.  You choose option A2 (Paper) each time.  Over 10 rounds, you'd expect to end up with \$30 compared to B's \$10.

Let's imagine a slightly more sophisticated Player B, who always picks Rock in the first round, and then for all other rounds picks whatever would beat Player A's choice the previous round.   This strategy would do well against someone who always picked the same option each round, but it is deterministic and, if we guess it correctly in advance, we can design a strategy that beats it every time.  (In this case, picking Paper-Rock-Scissors then repeating back to Paper).   In fact whatever strategy B comes up with, if that strategy is deterministic and we guess it in advance, then we end up with \$30 and B ends up with \$10.
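That duel is easy to simulate (a sketch; the names are mine):

```python
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
COUNTER = {loser: winner for winner, loser in BEATS.items()}  # what beats what

def player_b(a_history):
    """Rock first, then whatever would have beaten A's previous move."""
    return "rock" if not a_history else COUNTER[a_history[-1]]

# Player A's counter-strategy: Paper, Rock, Scissors, repeating.
cycle = ["paper", "rock", "scissors"]
a_history, a_wins = [], 0
for rnd in range(10):
    b_move = player_b(a_history)
    a_move = cycle[rnd % 3]
    if BEATS[a_move] == b_move:
        a_wins += 1
    a_history.append(a_move)
print(a_wins)   # 10 - Player A wins every round
```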

What if B has a deterministic strategy that B picked in advance and doesn't change, but we don't know at the start of the first round what it is?   In theory B might have picked any of the 3-to-the-power-of-10 deterministic strategies that are indistinguishable from each other over a 10 round duel but, in practice, humans tend to favour some strategies over others so, if you know humans and the game of Rock-Paper-Scissors better than Player B does, you have a better than even chance of guessing his pattern and coming out ahead in the later rounds of the duel.

But there's a danger to that.  What if you have overestimated your comparative knowledge level, and Player B uses your overconfidence to lure you into thinking you've cracked B's pattern, while really B is laying a trap: increasing the predictability of Player A's moves so that Player B can take advantage of that to work out which moves will trump them?  This works better in a game like poker, where the stakes are not the same each round, but it is still possible in Rock-Paper-Scissors, and you can imagine variants of the game where the host varies the payoff matrix by increasing the lose-tie-win rewards from 1,2,3 in the first round, to 2,4,6 in the second round, 3,6,9 in the third round, and so on.

This is why the safest strategy is not to have a deterministic strategy but, instead, to use a source of random bits to pick, each round, option 1 with a probability of 33%, option 2 with a probability of 33% or option 3 with a probability of 33% (modulo rounding).  You might not get to take advantage of any predictability that becomes apparent in your opponent's strategy, but neither can you be fooled into becoming predictable yourself.

On a side note, this still applies even when there is only one round, because unaided humans are not as good at coming up with random bits as they think they are.  Someone who has observed many first time players will notice that first time players more often than not choose Rock as their 'random' first move, rather than Paper or Scissors.  If such a person were confident that they were playing a first time player, they might therefore pick Paper as their first move more frequently than not.  Things soon get very Sicilian (in the sense of the duel between Westley and Vizzini in the film The Princess Bride) after that, because a yet more sophisticated player who guessed their opponent would try this could then pick Scissors.  And so ad infinitum, with ever more implausible levels of discernment being required to react on the next level up.

We can imagine a tournament set up between 100 players taken randomly from the expertise distribution of game players, each player submitting a python program that always plays the same first move, and for each of the remaining 9 rounds produces a move determined solely by the moves so far in that duel.  The tournament organiser would then run every player's program once against the programs of each of the other 99 players, so on average each player would collect 99x10x2 = \$1,980

We could make things more complex by allowing the programs to use, as an input, how much money their opponent has won so far during the tournament; or iterate over running the tournament several times, to give each player an 'expertise' rating which the program in the following tournament could then use.  We could allow the tournament host to subtract from each player a sum of money depending upon the size of program that player submitted (and how much memory or cpu it used).   We could give each player a limited ration of random bits, so when facing a player with a higher expertise rating they might splurge and make their move on all 10 rounds completely random, and when facing a player with a lower expertise they might conserve their supply by trying to 'out think' them.

There are various directions we could take this, but the one I want to look at here is what happens when you make the payoff matrix asymmetric.  What happens if you make the game unfair, so not only does one player have more at stake than the other player, but the options are not even either, for example:

You still have the circular Rock-Paper-Scissors dynamic where:
If B chose B3, then A wants most to have chosen A1
If A chose A1, then B wants most to have chosen B2
If B chose B2, then A wants most to have chosen A3
If A chose A3, then B wants most to have chosen B1
If B chose B1, then A wants most to have chosen A2
If A chose A2, then B wants most to have chosen B3

so everything wins against at least one other option, and loses against at least one other option.   However Player B is clearly now in a better position, because B wins ties, and B's wins (a 9, an 8 and a 7) tend to be larger than A's wins (a 9, a 6 and a 6).

What should Player A do?  Is the optimal safe strategy still to pick each option with an equal weighting?

Well, it turns out the answer is: no, an equal weighting isn't the optimal response.   Neither is just picking the same 'best' option each time.  Instead, what you do is pick your 'best' option a bit more frequently than an equal weighting would suggest, but not so much that the opponent can steal away that gain by reliably choosing the specific option that trumps yours.   Rather than duplicate material already well presented on the web, I will point you at two lecture courses on game theory that explain how to calculate the exact probability to assign to each option:

You do this by using the indifference theorem to arrive at a set of linear equations, which you can then solve to arrive at a mixed equilibrium where neither player increases their expected utility by altering the probability weightings they assign to their options.
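As an illustration (not taken from those lectures), here is the indifference calculation for the symmetric 3/2/1 Rock-Paper-Scissors rewards from earlier, where the answer is known to be equal weightings:

```python
import numpy as np

# Player A's payoff matrix for the host-rewarded Rock-Paper-Scissors
# game (win = 3, tie = 2, loss = 1).
# Rows: A's option (Rock, Paper, Scissors); columns: B's option.
M = np.array([[2.0, 1.0, 3.0],
              [3.0, 2.0, 1.0],
              [1.0, 3.0, 2.0]])

# Indifference: B's mixed strategy q should make A's expected payoff
# (M @ q) equal across all of A's pure options, with q summing to 1.
eqs = np.vstack([M[0] - M[1],    # payoff of A's option 0 == option 1
                 M[1] - M[2],    # payoff of A's option 1 == option 2
                 np.ones(3)])    # q1 + q2 + q3 == 1
rhs = np.array([0.0, 0.0, 1.0])
q = np.linalg.solve(eqs, rhs)
print(q)   # approximately [1/3, 1/3, 1/3], as expected for a symmetric game
```

Substituting an asymmetric payoff matrix for `M` yields the unequal weightings this post describes; the indifference equations themselves are unchanged.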

### The TL;DR points to take away

If you are competing in what is effectively a simultaneous option choice game, with a being who you suspect has expertise at the game equal to or higher than your own, you can nullify their advantage by picking a strategy that, each round, chooses randomly (using a weighting) between the available options.

Depending upon the details of the payoff matrix, there may be one option that it makes sense for you to pick most of the time but, unless that option is strictly better than all your other choices no matter what option your opponent picks, there is still utility to gain from occasionally picking the other options in order to keep your opponent on their toes.

Back to Part 1 - stating the problem
This is  Part 2 - some mathematics
Next to Part 3 - towards a solution

## A solvable Newcomb-like problem - part 1 of 3

1 03 December 2012 09:26AM

This is the first part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

Part 1 - stating the problem
Part 2 - some mathematics
Part 3 - towards a solution

Omega is an AI, living in a society of AIs, who wishes to enhance his reputation in that society for being successfully able to predict human actions.  Given some exchange rate between money and reputation, you could think of that as a bet between him and another AI, let's call it Alpha.  And since there is also a human involved, for the sake of clarity, to avoid using "you" all the time, I'm going to sometimes refer to the human using the name "Fred".

Omega tells Fred:

I'd like you to pick between two options, and I'm going to try to predict which option you're going to pick.
Option "one box" is to open only box A, and take any money inside it
Option "two box" is to open both box A and box B, and take any money inside them

but before you pick your option, declare it, and then open the box or boxes, there are three things you need to know.

Firstly, you need to know the terms of my bet with Alpha.

If Fred picks option "one box" then:
If box A contains \$1,000,000 and box B contains \$1,000 then Alpha pays Omega \$1,000,000,000
If box A contains \$0              and box B contains \$1,000 then Omega pays Alpha \$10,000,000,000
If anything else, then both Alpha and Omega pay Fred \$1,000,000,000,000

If Fred picks option "two box" then:
If box A contains \$1,000,000 and box B contains \$1,000 then Omega pays Alpha \$10,000,000,000
If box A contains \$0              and box B contains \$1,000 then Alpha pays Omega \$1,000,000,000
If anything else, then both Alpha and Omega pay Fred \$1,000,000,000,000

Secondly, you should know that I've already placed all the money in the boxes that I'm going to, and I can't change the contents of the boxes between now and when you do the opening, because Alpha is monitoring everything.  I've already made my prediction, using a model I've constructed of your likely reactions based upon your past actions.

You can use any method you like to choose between the two options, short of contacting another AI, but be warned that if my model predicted that you'll use a method which introduces too large a random element (such as tossing a coin) then, while I may lose my bet with Alpha, I'll certainly have made sure you won't win the \$1,000,000.  Similarly, if my model predicted that you'd make an outside bet with another human (let's call him George) to alter the value of winning \$1,001,000 from me, I'd have also taken that into account.  (I say "human", by the way, because my bet with Alpha is about my ability to predict humans so if you contact another AI, such as trying to lay a side bet with Alpha to skim some of his winnings, that invalidates not only my game with you, but also my bet with Alpha, and there are no winnings to skim.)

And, third and finally, you need to know my track record in previous similar situations.

I've played this game 3,924 times over the past 100 years (i.e. since the game started), with humans picked at random from the full variety of the population.   The outcomes were:
3000 times players picked option "one box" and walked away with \$1,000,000
900  times players picked option "two box" and walked away with \$1,000
24 times players flipped a coin or were otherwise too random.  Of those players:
12 players picked option "one box" and walked away with \$0
12 players picked option "two box" and walked away with \$1,000

Never has anyone ever ended up walking away with \$1,001,000 by picking option "two box".

Omega stops talking.   You are standing in a room containing two boxes, labelled "A" and "B", which are both currently closed.  Everything Omega said matches what you expected him to say, as the conditions of the game are always the same and are well known - you've talked with other human players (who confirmed it is legit) and listened to their advice.   You've not contacted any AIs, though you have read the published statement from Alpha that also confirms the terms of the bet and details of the monitoring.  You've not made any bets with other humans, even though your dad did offer to bet you a bottle of whiskey that you'd be one of them too smart alecky fools who walked away with only \$1,000.  You responded by pre-committing to keep any winnings you make between you and your banker, and to never let him know.

The only relevant physical object you've brought along is a radioactive decay based random number generator, that Omega would have been unable to predict the result of in advance, just in case you decide to use it as a factor in your choice.  It isn't a coin, giving only a 50% chance of "one box" and a 50% chance of "two box".   You can set arbitrary odds (tell it to generate a random integer between 0 and any positive integer you give it, up to 10 to the power of 100).   Omega said in his spiel the phrase "too large a random element" but didn't specify where that boundary was.
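For concreteness, a sketch of how Fred might turn such a device into a weighted choice (the function and default weights are illustrative only):

```python
import random   # stand-in for the radioactive-decay device

def choose(one_box_weight=99, total=100):
    """Ask the device for a uniform random integer in [0, total-1] and
    pick "one box" with probability one_box_weight/total."""
    n = random.randint(0, total - 1)
    return "one box" if n < one_box_weight else "two box"

choose()                       # "one box" 99% of the time
choose(10**100 - 1, 10**100)   # an almost-certain "one box"
```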

What do you do?   Or, given that such a situation doesn't exist yet, and we're talking about a Fred in a possible future, what advice would you give to Fred on how to choose, were he to ever end up in such a situation?

Pick "one box"?   Pick "two box"?   Or pick randomly between those two choices and, if so, at what odds?

And why?

This is Part 1 - stating the problem
Next to Part 2 - some mathematics
Part 3 - towards a solution

## Is Omega Impossible? Can we even ask?

-8 24 October 2012 02:47PM

EDIT: I see by the karma bombing we can't even ask.  Why even call this part of the site "discussion?"

Some of the classic questions about an omnipotent god include

1. Can god make a square circle?
2. Can god create an immovable object?  And then move it?

Saints and philosophers wrestled with these issues back before there was television.  My recollection is that people who liked the idea of an omnipotent god would answer "omnipotence does not include the power to do nonsense", where they would generally include contradictions as nonsense.  So omnipotence can't square a circle, can't make 2=3, can't make an atom which is simultaneously lead and gold.

But where do the contradictions end and the merely difficult to conceive begin?  Can omnipotence make the ratio of the circumference to the diameter of a circle = 3, or 22/7?  Can omnipotence make sqrt(2)=1.4 or 2+2=5?  While these are not directly self-contradictory statements, they can be used with a variety of simple truths to quickly derive self-contradictory statements.  Can we then conclude that "2+2=5" is essentially a contradiction because it is close to a contradiction?  Where do we draw the line?

What if we are set some problem where we are told to assume that

1. 2+2 = 5
2. 1+1 = 2
3. 1+1+1+1+1 = 5

In solving this set problem, we can quickly derive that 1=0, and use that to prove effectively anything we want to prove.  Perhaps not formally, but we have violated the law of non-contradiction - that no statement is true together with its negation.  Once you violate that, you can prove ANYTHING using simple laws of inference, because you have propositions that are both true and false.
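To spell the first derivation out:

```latex
\begin{align*}
1+1+1+1 \;=\; 2+2 \;&=\; 5 && \text{by assumptions (2) and (1)} \\
\;&=\; 1+1+1+1+1 && \text{by assumption (3)} \\
\text{so}\quad 0 \;&=\; 1 && \text{subtracting } 1+1+1+1 \text{ from both sides.}
\end{align*}
```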

What if we set a problem where we are told to assume

1. Omega is an infallible intelligence that does not lie
2. Omega tells you 2+2=5

Well, we are going to have the same problem as above: we will be able to prove anything.

### Newcomb's Problem

In Newcomb's box problem, we are told to assume that

1. Omega is an infallible intelligence
2. Omega has predicted correctly whether we will one box or two box.

From these assumptions we wind up with all sorts of problems of causality and/or free will and/or determinism.

What if these statements are not consistent?  What if these statements are tantamount to assuming 0=1, or are within a few steps of assuming 0=1?  Or something just as contradictory, but harder to identify?

Personally, I can think of LOTS of reasons to doubt that Newcomb's problem is even theoretically possible to set.  Beyond that, the empirical barrier to believing Omega exists in reality would be gigantic: millions of humans have watched magic shows performed by non-superior intelligences where cards we have signed have turned up in a previously sealed envelope or wallet or audience member's pocket.  We recognize that these are tricks, that they are not what they appear.

To question Omega is not playing by the mathematician's or philosopher's rules.  But when we play by the rules, do we blithely assume 2+2=5 and then wrap ourselves around the logical axle trying to program a friendly AI to one-box?  Why is questioning Omega's possibility of existence, or possibility of proof of existence out-of-bounds?

## Omega lies

7 24 October 2012 10:46AM

Just developing my second idea at the end of my last post. It seems to me that in the Newcomb problem and in the counterfactual mugging, the completely trustworthy Omega lies to a greater or lesser extent.

This is immediately obvious in scenarios where Omega simulates you in order to predict your reaction. In the Newcomb problem, the simulated you is told "I have already made my decision...", which is not true at that point, and in the counterfactual mugging, whenever the coin comes up heads, the simulated you is told "the coin came up tails". And the arguments only go through because these lies are accepted by the simulated you as being true.

If Omega doesn't simulate you, but uses other methods to gauge your reactions, he isn't lying to you per se. But he is estimating your reaction in the hypothetical situation where you were fed untrue information that you believed to be true. And that you believed to be true, specifically because the source is Omega, and Omega is trustworthy.

Doesn't really change much to the arguments here, but it's a thought worth bearing in mind.

## Naive TDT, Bayes nets, and counterfactual mugging

15 23 October 2012 03:58PM

I set out to understand precisely why naive TDT (possibly) fails the counterfactual mugging problem. While doing this I ended up drawing a lot of Bayes nets, and seemed to gain some insight; I'll pass these on, in the hopes that they'll be useful. All errors are, of course, my own.

## The grand old man of decision theory: the Newcomb problem

First let's look at the problem that inspired all this research: the Newcomb problem. In this problem, a supremely-insightful-and-entirely-honest superbeing called Omega presents two boxes to you, and tells you that you can either choose box A only ("1-box"), or take box A and box B ("2-box"). Box B will always contain \$1K (one thousand dollars). Omega has predicted what your decision will be, though, and if you decided to 1-box, he's put \$1M (one million dollars) in box A; otherwise he's put nothing in it. The problem can be cast as a Bayes net with the following nodes:

## Can anyone explain to me why CDT two-boxes?

-12 02 July 2012 06:06AM

I have read lots of LW posts on this topic, and everyone seems to take this for granted without giving a proper explanation. So if anyone could explain this to me, I would appreciate that.

This is a simple question that is in need of a simple answer. Please don't link to pages and pages of theorycrafting. Thank you.

Edit: Since posting this, I have come to the conclusion that CDT doesn't actually play Newcomb. Here's a disagreement with that statement:

If you write up a CDT algorithm and then put it into a Newcomb's problem simulator, it will do something. It's playing the game; maybe not well, but it's playing.

And here's my response:

The thing is, an actual Newcomb simulator can't possibly exist because Omega doesn't exist. There are tons of workarounds, like using coin tosses as a substitute for Omega and ignoring the results whenever the coin was wrong, but that is something fundamentally different from Newcomb.

You can only simulate Newcomb in theory, and it is perfectly possible to just not play a theoretical game, if you reject the theory it is based on. In theoretical Newcomb, CDT doesn't care about the rule of Omega being right, so CDT does not play Newcomb.

If you're trying to simulate Newcomb in reality by substituting someone who has merely been proven right empirically for Omega, you replace Newcomb with a problem that consists of little more than a simple calculation of priors and payoffs, and that's hardly the point here.
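The coin-toss workaround mentioned above can be sketched as a rejection-sampling simulator (a toy illustration of the idea, not anyone's actual implementation):

```python
import random

def coin_toss_newcomb(decide):
    """Simulate Newcomb with a coin standing in for Omega:
    rerun (without payoffs) until the coin matches the decision."""
    while True:
        prediction = random.choice(["1-box", "2-box"])  # the 'coin'
        action = decide()
        if prediction == action:
            break  # mismatched runs are discarded
    box_a = 1_000_000 if prediction == "1-box" else 0
    box_b = 1_000
    return box_a if action == "1-box" else box_a + box_b

print(coin_toss_newcomb(lambda: "1-box"))  # always 1000000
print(coin_toss_newcomb(lambda: "2-box"))  # always 1000
```

This makes the disanalogy concrete: because mismatched runs are thrown away, the agent's action effectively determines the "prediction", so an agent that knows the mechanism does best by one-boxing for straightforwardly causal reasons, which is exactly why this is something fundamentally different from Newcomb.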

Edit 2: Clarification regarding backwards causality, which seems to confuse people:

Newcomb assumes that Omega is omniscient, which crucially means that the decision you make right now determines whether Omega has put money in the box or not. Obviously this is backwards causality, and therefore not possible in real life, which is why Nozick doesn't spend too much ink on this.

But if you rule out the possibility of backwards causality, Omega can only base his prediction of your decision on your actions up to the point where he has to decide whether to put money in the box or not. In that case, if you take two people who have so far always acted (decided) identically, but one will one-box while the other will two-box, Omega cannot make different predictions for them. And no matter what prediction Omega makes, you don't want to be the one who one-boxes.
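The "no matter what prediction Omega makes" step is a dominance argument, and it can be tabulated (a sketch using the standard \$1M/\$1K payoffs):

```python
# Payoff table for (Omega's prediction, your action), in dollars.
PAYOFF = {
    ("1-box", "1-box"): 1_000_000,
    ("1-box", "2-box"): 1_001_000,
    ("2-box", "1-box"): 0,
    ("2-box", "2-box"): 1_000,
}

# With the prediction held fixed -- the contents of the boxes are
# already settled -- two-boxing is better in each column.
for prediction in ("1-box", "2-box"):
    gain = PAYOFF[(prediction, "2-box")] - PAYOFF[(prediction, "1-box")]
    print(prediction, gain)  # prints 1000 for both columns
```

Since your choice can't causally change a prediction that's already been made, CDT treats each column as fixed and takes the action that's better in both, which is the simple answer to the post's question.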

Edit 3: Further clarification on the possible problems that could be considered Newcomb:

There's four types of Newcomb problems:

1. Omniscient Omega (backwards causality) - CDT rejects this case, which cannot exist in reality.
2. Fallible Omega, but still backwards causality - CDT rejects this case, which cannot exist in reality.
3. Infallible Omega, no backwards causality - CDT correctly two-boxes. To improve payouts, CDT would have to have decided differently in the past, which is not decision theory anymore.
4. Fallible Omega, no backwards causality - CDT correctly two-boxes. To improve payouts, CDT would have to have decided differently in the past, which is not decision theory anymore.

That's all there is to it.

Edit 4: Excerpt from Nozick's "Newcomb's Problem and Two Principles of Choice":

Now, at last, to return to Newcomb's example of the predictor. If one believes, for this case, that there is backwards causality, that your choice causes the money to be there or not, that it causes him to have made the prediction that he made, then there is no problem. One takes only what is in the second box. Or if one believes that the way the predictor works is by looking into the future; he, in some sense, sees what you are doing, and hence is no more likely to be wrong about what you do than someone else who is standing there at the time and watching you, and would normally see you, say, open only one box, then there is no problem. You take only what is in the second box. But suppose we establish or take as given that there is no backwards causality, that what you actually decide to do does not affect what he did in the past, that what you actually decide to do is not part of the explanation of why he made the prediction he made. So let us agree that the predictor works as follows: He observes you sometime before you are faced with the choice, examines you with complicated apparatus, etc., and then uses his theory to predict on the basis of this state you were in, what choice you would make later when faced with the choice. Your deciding to do as you do is not part of the explanation of why he makes the prediction he does, though your being in a certain state earlier, is part of the explanation of why he makes the prediction he does, and why you decide as you do.

I believe that one should take what is in both boxes. I fear that the considerations I have adduced thus far will not convince those proponents of taking only what is in the second box. Furthermore I suspect that an adequate solution to this problem will go much deeper than I have yet gone or shall go in this paper. So I want to pose one question. I assume that it is clear that in the vaccine example, the person should not be convinced by the probability argument, and should choose the dominant action. I assume also that it is clear that in the case of the two brothers, the brother should not be convinced by the probability argument offered. The question I should like to put to proponents of taking only what is in the second box in Newcomb's example (and hence not performing the dominant action) is: what is the difference between Newcomb's example and the other two examples which make the difference between not following the dominance principle, and following it?

## Extremely Counterfactual Mugging or: the gist of Transparent Newcomb

4 09 February 2011 03:20PM

Omega will either award you \$1000 or ask you to pay him \$100. He will award you \$1000 if he predicts you would pay him if he asked. He will ask you to pay him \$100 if he predicts you wouldn't pay him if he asked.

Omega asks you to pay him \$100. Do you pay?

This problem is roughly isomorphic to the branch of Transparent Newcomb (version 1, version 2) where box B is empty, but it's simpler.
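The expected values of the two possible policies can be worked out directly (a toy calculation; the \$1000 and \$100 figures are from the problem statement, and the `accuracy` parameter is my own generalization of the perfect-predictor case):

```python
def expected_value(policy_pays_when_asked, accuracy=1.0):
    """Expected payoff of a policy, given Omega predicts it
    correctly with probability `accuracy`."""
    p_predicted_payer = accuracy if policy_pays_when_asked else 1 - accuracy
    ev_awarded = 1_000  # predicted "would pay": awarded, never asked
    # Predicted "wouldn't pay": asked for $100; you pay iff your policy says to.
    ev_asked = -100 if policy_pays_when_asked else 0
    return p_predicted_payer * ev_awarded + (1 - p_predicted_payer) * ev_asked

print(expected_value(True))   # 1000.0
print(expected_value(False))  # 0.0
```

With a perfect predictor, the paying policy nets \$1000 and the refusing policy nets \$0: the payment itself only ever loses money, yet the disposition to pay is what gets you the award, which is the gist the post is pointing at.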

Here's a diagram:

## Omega can be replaced by amnesia

15 26 January 2011 12:31PM

Let's play a game. Two times, I will give you an amnesia drug and let you enter a room with two boxes inside. Because of the drug, you won't know whether this is the first time you've entered the room. On the first time, both boxes will be empty. On the second time, box A contains \$1000, and box B contains \$1,000,000 iff this is the second time and you took only box B the first time. You're in the room: do you take both boxes or only box B?

This is equivalent to Newcomb's Problem in the sense that any strategy does equally well on both, where by "strategy" I mean a mapping from info to (probability distributions over) actions.
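The equivalence can be checked for the two deterministic strategies (a toy check using this post's box labels, where A holds \$1000 and B the \$1,000,000; mixed strategies work out the same way in expectation):

```python
def newcomb_payoff(choice):
    # Perfect Omega: the prediction equals the deterministic choice.
    box_a = 1_000
    box_b = 1_000_000 if choice == "B" else 0
    return box_a + box_b if choice == "both" else box_b

def amnesia_payoff(choice):
    # A deterministic strategy, unable to tell the visits apart,
    # makes the same choice on both visits.
    first, second = choice, choice
    box_a = 1_000
    box_b = 1_000_000 if first == "B" else 0  # filled based on visit 1
    return box_a + box_b if second == "both" else box_b  # paid on visit 2

for choice in ("B", "both"):
    assert newcomb_payoff(choice) == amnesia_payoff(choice)
print([amnesia_payoff(c) for c in ("B", "both")])  # [1000000, 1000]
```

The amnesia plays the role of Omega's prediction: because the strategy can't distinguish the visits, its first-visit behavior is a perfect "prediction" of its second-visit behavior.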

I suspect that any problem with Omega can be transformed into an equivalent problem with amnesia instead of Omega.

Does CDT return the winning answer in such transformed problems?

Discuss.