## Simulating Problems

1 30 January 2013 01:14PM

Apologies for the rather mathematical nature of this post, but it seems to have some implications for topics relevant to LW. Prior to posting I looked for literature on this but was unable to find any; pointers would be appreciated.

In short, my question is: How can we prove that any simulation of a problem really simulates the problem?

I want to demonstrate that this is not as obvious as it may seem by using the example of Newcomb's Problem. The issue here is of course Omega's omniscience. If we construct a simulation with the rules (payoffs) of Newcomb, an Omega that is always right, and an interface for the agent to interact with the simulation, will that be enough?

Let's say we simulate Omega's prediction by a coin toss and repeat the simulation (without payoffs) until the coin toss matches the agent's decision. This seems to adhere to all specifications of Newcomb and, if the coin toss is hidden, is in fact indistinguishable from it from the agent's perspective. However, if the agent knows how the simulation works, a CDT agent will one-box, whereas it is assumed that the same agent would two-box in 'real' Newcomb. Keeping the agent ignorant of how the simulation works is not a solution, so this simulation appears not to actually simulate Newcomb.
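To make the construction concrete, here is a minimal sketch of that coin-toss simulation (the function and payoff layout are mine, not part of the original problem):

```python
import random

def coin_toss_newcomb(agent):
    """Rerun the game until the coin-toss 'prediction' matches the
    agent's decision; only the matching run's payoff counts."""
    while True:
        prediction = random.choice(["one-box", "two-box"])
        decision = agent()           # the agent never sees the toss
        if decision == prediction:
            break
    # Box A holds $1,000,000 exactly when one-boxing was 'predicted'.
    if decision == "one-box":
        return 1_000_000             # contents of box A
    return 1_000                     # empty box A plus box B
```

An agent that knows this mechanism can reason causally: its decision selects which branch the counting run falls into, so even a CDT agent one-boxes here, which is the discrepancy described above.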

Pointing out differences is of course far easier than proving that none exist. Suppose there is a problem for which we have no idea what decisions agents would make, and we want to build a real-world simulation to find out exactly that. How can we prove that this simulation really simulates the problem?

(Edit: Apparently it wasn't apparent that this is about problems in terms of game theory and decision theory. Newcomb, Prisoner's Dilemma, Iterated Prisoner's Dilemma, Monty Hall, Sleeping Beauty, Two Envelopes, that sort of stuff. Should be clear now.)

## A solvable Newcomb-like problem - part 3 of 3

3 06 December 2012 01:06PM

This is the third part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

Part 1 - stating the problem
Part 2 - some mathematics
Part 3 - towards a solution

In many situations we can say "For practical purposes a probability of 0.9999999999999999999 is close enough to 1 that for the sake of simplicity I shall treat it as being 1, without that simplification altering my choices."

However, there are some situations where the distinction does significantly alter the character of the situation. So, when one is studying a new situation and one is not yet sure which of those two categories the situation falls into, the cautious approach is to re-frame the probability as being (1 - δ) where δ is small (e.g. 10 to the power of -12), and then examine the characteristics of the behaviour as δ tends towards 0.

The LessWrong wiki describes Omega as a super-powerful AI analogous to Laplace's demon, who knows the precise location and momentum of every atom in the universe, limited only by the laws of physics (so, if time travel isn't possible and some of our current thoughts on Quantum Mechanics are correct, then Omega's knowledge of the future is probabilistic, being limited by uncertainty).

For the purposes of Newcomb's problem, and the rationality of Fred's decisions, it doesn't matter how close to that level of power Omega actually is.   What matters, in terms of rationality, is the evidence available to Fred about how close Omega is to having that level of power; or, more precisely, the evidence available to Fred relevant to Fred making predictions about Omega's performance in this particular game.

Since this is a key factor in Fred's decision, we ought to be cautious.  Rather than specify when setting up the problem that Fred knows with a certainty of 1 that Omega does have that power, it is better to specify a concrete level of evidence that would lead Fred to assign a probability of (1 - δ) to Omega having that power, and then examine the effect upon which option in the box problem it is rational for Fred to pick, as δ tends towards 0.

The Newcomb-like problem stated in part 1 of this sequence contains an Omega to which it is rational for Fred to assign a less-than-unity probability of being able to perfectly predict Fred's choices.  By using bets as analogies to the sort of evidence Fred might have available to him, we create an explicit variable that we can then manipulate to alter the precise probability Fred assigns to Omega's abilities.

The other nice feature of the Newcomb-like problem given in part 1, is that it is explicitly solvable using the mathematics given in part 2.  By making randomness an external feature (the device Fred brings with him) rather than purely a feature of Fred's internal mind, we can acknowledge the question of Omega being able to predict quantum events, capture it as a variable, and take it into account when setting out the payoff matrix for the problem.

This means that, instead of Fred having to think "When I walked into this room I was determined to pick one-box.  As far as anyone knew or could predict, including myself, I intended to pick one-box.  However nothing I do now can change Omega's decision - the money is already in the box.  So I've nothing to lose by changing my mind.", Fred can now allocate a specific probability to whether Omega could predict Fred's chance of changing his mind in such circumstances, and Fred can take that into account in his strategy by making his chance of changing strategy explicit and external - basing it upon a random number device.

Or, to put it another way, we are modelling a rational human who has a specific finite chance of talking himself into overriding a pre-committed strategy as being made up from two components: a component that will infallibly stick to a pre-committed strategy, plus a component with a known chance of change.  We then treat the combined rational human as being someone infallibly committed to a meta-strategy that includes a chance of change - a mixed equilibrium, from Omega's point of view.

Ok, time to look at the numbers and draw a pretty diagram...

Fred is Player A, and he has two meta options:
A1 - play it safe.  100% chance of one-box and 0% chance of two-box
A2 - take a risk.   Mentally pre-commit to using the device to produce 99% chance of one-box and 1% chance of two-box.

Omega is Player B, and he has two meta options:
B1 - reward risk.  Not attempt to distinguish between the mental state of Fred taking 1% risk and Fred playing it safe.
B2 - punish risk.  Attempt to distinguish and, if Omega guesses Fred is taking risk rather than playing safe, punish it.

We'll start out by assuming that if Omega does attempt to distinguish, then Omega will have a 1 in 10,000 false positive rate (thinking Fred is going to use the device, when actually Fred intends to play it safe) and a 1 in 10,000 false negative rate (thinking Fred is going to play it safe, when actually Fred intends to use the device).

### A1 vs B1

Fred gains \$1,000,000
Omega loses \$1,000,000 to Fred but gains \$1,000,000,000 from Alpha, for a net gain of \$999,000,000

### A2 vs B1

99% of the time, Fred gains \$1,000,000 and Omega net gains \$999,000,000
1% of the time, Fred gains \$1,001,000 and Omega net loses \$10,001,001,000

Combining those gives an average of:
Fred gains: \$1,000,010
Omega gains: \$888,999,990

### A1 vs B2

99.99% of the time, Omega correctly discerns that Fred is playing safe
Fred gains \$1,000,000
Omega gains \$999,000,000

0.01% of the time, Omega falsely believes that Fred is taking a risk, and punishes that by putting \$0 in Box A
Fred gains \$0
Omega loses \$10,000,000,000

Combining those gives an average of:
Fred gains: \$999,900
Omega gains: \$997,900,100

### A2 vs B2

In 100 trials out of 1,000,000 trials Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device.  Of these:

In 1 trial out of 1,000,000 trials: Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device and, in this case, the device picks two-box
==> Fred gains \$1,001,000
==> Omega loses \$10,001,001,000

In 99 trials out of 1,000,000 trials: Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device and, in this case, the device picks one-box
==> Fred gains \$1,000,000
==> Omega gains \$999,000,000

In 999,900 trials out of 1,000,000 trials Omega correctly thinks that Fred is going to take the risk of using the device.  Of those:

In 9,999 trials out of 1,000,000 trials: Omega correctly thinks that Fred is going to take the risk of using the device and, in this case, the device picks two-box
==> Fred gains \$1,000
==> Omega gains \$999,999,000

In 989,901 trials out of 1,000,000 trials: Omega correctly thinks that Fred is going to take the risk of using the device and, in this case, the device picks one-box
==> Fred gains \$0
==> Omega loses \$10,000,000,000

Combining those gives an average of:
Fred gains \$110
Omega loses \$9,888,922,110
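As a check, both averages follow directly from the outcome table above (a quick sketch; the counts divide 1,000,000 evenly, so integer division is exact):

```python
# The A2-vs-B2 outcome table, per 1,000,000 trials:
# (number of trials, Fred's gain, Omega's gain)
outcomes = [
    (1,       1_001_000, -10_001_001_000),  # missed risk-taker, device two-boxes
    (99,      1_000_000,      999_000_000), # missed risk-taker, device one-boxes
    (9_999,       1_000,      999_999_000), # caught risk-taker, device two-boxes
    (989_901,         0, -10_000_000_000),  # caught risk-taker, device one-boxes
]

trials = sum(count for count, _, _ in outcomes)
fred_avg = sum(count * fred for count, fred, _ in outcomes) // trials
omega_avg = sum(count * omega for count, _, omega in outcomes) // trials
print(trials, fred_avg, omega_avg)   # 1000000 110 -9888922110
```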

If Fred's best estimate, beforehand, of the chances of Omega choosing meta strategy B1 is P, then

Fred's expected gain from choosing meta strategy A1 would be: \$1,000,000 P + \$999,900 (1-P)
Fred's expected gain from choosing meta strategy A2 would be: \$1,000,010 P + \$110 (1-P)

We can ask ourselves at what value of P these two strategies would return the same expected gain:

\$1,000,000 P + \$999,900 (1-P)   =    \$1,000,010 P + \$110 (1-P)
\$999,790 (1-P)   =                \$10 P
\$999,790           =        \$999,800 P
P    ≈ 0.99999
(1-P)  ≈ 0.00001

In other words, Fred thinking that Omega has even a 1 in 100,000 chance of being vindictive should be enough to deter Fred from taking the risky strategy.
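A sketch of that break-even calculation, using the per-case expected gains worked out above (exact arithmetic via fractions; variable names are mine):

```python
from fractions import Fraction

# Fred's expected gains for each (Fred, Omega) meta-strategy pair,
# taken from the four cases worked through above.
a1_vs_b1, a1_vs_b2 = 1_000_000, 999_900   # A1: play it safe
a2_vs_b1, a2_vs_b2 = 1_000_010, 110       # A2: take a risk

# Solve  a1_vs_b1*P + a1_vs_b2*(1-P) == a2_vs_b1*P + a2_vs_b2*(1-P)  for P.
P = Fraction(a1_vs_b2 - a2_vs_b2,
             (a1_vs_b2 - a2_vs_b2) + (a2_vs_b1 - a1_vs_b1))
print(float(P))       # ≈ 0.99999
print(float(1 - P))   # ≈ 0.00001
```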

But how does that look from Omega's point of view?   If Omega thinks that Fred's chance of picking meta strategy A1 is Q, then what is the cost to Omega of picking B2 1 in 100,000 times?

Omega's expected gain from choosing meta strategy B1 would be: \$999,000,000 Q + \$888,999,990 (1-Q)
Omega's expected gain from choosing meta strategy B2 would be: \$997,900,100 Q - \$9,888,922,110 (1-Q)

0.99999 { \$999,000,000 Q + \$888,999,990 (1-Q)  } + 0.00001 { \$997,900,100 Q - \$9,888,922,110 (1-Q) }
= (1 - 0.00001) { \$888,999,990 + \$110,000,010 Q } + 0.00001 { - \$9,888,922,110  + \$10,886,822,210 Q  }
= \$888,999,990 + \$110,000,010 Q + 0.00001 { - \$9,888,922,110  + \$10,886,822,210 Q - \$888,999,990 - \$110,000,010 Q }
= \$888,999,990 + \$110,000,010 Q + 0.00001 { - \$10,777,922,100 + \$10,776,822,200 Q }
= ( \$888,999,990 - \$107,779.221) + ( \$110,000,010 + \$107,768.222 ) Q
= \$888,892,211 + \$110,107,778 Q
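Those two coefficients can be recomputed directly from the per-outcome payoffs (a sketch using exact arithmetic; the dictionary layout is mine):

```python
from fractions import Fraction as F

# Omega's expected gain for each (Omega, Fred) meta-strategy pair,
# rebuilt from the per-outcome payoffs in the four cases above.
b1 = {"A1": F(999_000_000),
      "A2": F(99, 100) * 999_000_000 - F(1, 100) * 10_001_001_000}
b2 = {"A1": F(997_900_100),
      "A2": F(-9_888_922_110)}

p_b1 = F(99_999, 100_000)    # Omega plays B1 with probability 0.99999

def omega_mix(Q):
    """Omega's expected gain when Fred plays A1 with probability Q."""
    return (p_b1 * (b1["A1"] * Q + b1["A2"] * (1 - Q)) +
            (1 - p_b1) * (b2["A1"] * Q + b2["A2"] * (1 - Q)))

constant = float(omega_mix(0))               # ≈ 888,892,211
slope = float(omega_mix(1) - omega_mix(0))   # ≈ 110,107,778
```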

Perhaps a meta strategy of 1% chance of two-boxing is not Fred's optimal meta strategy.  Perhaps, at that level compared to Omega's ability to discern, it is still worth Omega investing in being vindictive occasionally, in order to deter Fred from taking risk.

But, given sufficient data about previous games, Fred can make a guess at Omega's ability to discern.  And likewise Omega, by including in the record of past games occasions when Omega has falsely accused a human player of taking risk, can signal to future players where Omega's boundaries are.

We can plot graphs of these to find the point at which Fred's meta strategy and Omega's meta strategy are in equilibrium - the point where, if Fred took any larger chances, it would start becoming worth Omega's while to punish risk sufficiently often that it would no longer be in Fred's interests to take the risk.   Precisely where that point is will depend on the numbers we picked in Part 1 of this sequence.  By exploring the space created by using each variable number as a dimension, we can divide it into regions characterised by which strategies dominate within that region.

Extrapolating that as δ tends towards 0 should then carry us closer to a convincing solution to Newcomb's Problem.

Back to Part 1 - stating the problem
Back to Part 2 - some mathematics
This is   Part 3 - towards a solution

## A solvable Newcomb-like problem - part 2 of 3

0 03 December 2012 04:49PM

This is the second part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

Part 1 - stating the problem
Part 2 - some mathematics
Part 3 - towards a solution

In game theory, a payoff matrix is a way of presenting the results of two players simultaneously picking options.

For example, in the Prisoner's Dilemma, Player A gets to choose between option A1 (Cooperate) and option A2 (Defect) while, at the same time Player B gets to choose between option B1 (Cooperate) and option B2 (Defect).   Since years spent in prison are a negative outcome, we'll write them as negative numbers:
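The payoff matrix referred to here appears to have been an image; written out as code, a set of values consistent with the description below is (the -2/-2 mutual-cooperation entry is an assumed standard value - the text only pins down the other cells):

```python
# Prisoner's Dilemma payoffs as (years for A, years for B), written as
# negatives because years in prison are a bad outcome.
# A1/B1 = Cooperate, A2/B2 = Defect.
payoffs = {
    ("A1", "B1"): (-2, -2),   # both cooperate (assumed standard value)
    ("A1", "B2"): (-5,  0),   # A cooperates, B defects
    ("A2", "B1"): ( 0, -5),   # A defects, B cooperates
    ("A2", "B2"): (-4, -4),   # both defect
}
```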

So, if you look at the bottom right hand corner, at the intersection of Player A defecting (A2) and Player B defecting (B2) we see that both players end up spending 4 years in prison.   Whereas, looking at the bottom left we see that if A defects and B cooperates, then Player A ends up spending 0 years in prison and Player B ends up spending 5 years in prison.

Another familiar example we can present in this form is the game Rock-Paper-Scissors.

We could write it as a zero sum game, with a win being worth 1, a tie being worth 0 and a loss being worth -1:

But it doesn't change the mathematics if we give both players 2 points each round just for playing, so that a win becomes worth 3 points, a tie becomes worth 2 points and a loss becomes worth 1 point.  (Think of it as two players in a game show being rewarded by the host, rather than the players making a direct bet with each other.)

If you are Player A, and you are playing against a Player B who always chooses option B1 (Rock), then your strategy is clear.  You choose option A2 (Paper) each time.  Over 10 rounds, you'd expect to end up with \$30 compared to B's \$10.

Let's imagine a slightly more sophisticated Player B, who always picks Rock in the first round, and then for all other rounds picks whatever would beat Player A's choice the previous round.   This strategy would do well against someone who always picked the same option each round, but it is deterministic and, if we guess it correctly in advance, we can design a strategy that beats it every time.  (In this case, picking Paper-Rock-Scissors then repeating back to Paper).   In fact whatever strategy B comes up with, if that strategy is deterministic and we guess it in advance, then we end up with \$30 and B ends up with \$10.
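That duel is easy to simulate (a sketch; the names are mine):

```python
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
COUNTER = {loser: winner for winner, loser in BEATS.items()}  # what beats what

def player_b(a_history):
    """Rock first, then whatever would have beaten A's previous move."""
    return "rock" if not a_history else COUNTER[a_history[-1]]

# Player A's counter-strategy: Paper, Rock, Scissors, repeating.
cycle = ["paper", "rock", "scissors"]
a_history, a_wins = [], 0
for rnd in range(10):
    b_move = player_b(a_history)
    a_move = cycle[rnd % 3]
    if BEATS[a_move] == b_move:
        a_wins += 1
    a_history.append(a_move)
print(a_wins)   # 10 - Player A wins every round
```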

What if B has a deterministic strategy that B picked in advance and doesn't change, but we don't know at the start of the first round what it is?   In theory B might have picked any of the 3-to-the-power-of-10 deterministic strategies that are indistinguishable from each other over a 10 round duel but, in practice, humans tend to favour some strategies over others so, if you know humans and the game of Rock-Paper-Scissors better than Player B does, you have a better than even chance of guessing his pattern and coming out ahead in the later rounds of the duel.

But there's a danger to that.  What if you have overestimated your comparative knowledge level, and Player B uses your overconfidence to lure you into thinking you've cracked B's pattern, while really B is laying a trap: increasing the predictability of Player A's moves so that Player B can take advantage of that to work out which moves will trump them?  This works better in a game like poker, where the stakes are not the same each round, but it is still possible in Rock-Paper-Scissors, and you can imagine variants of the game where the host varies the payoff matrix by increasing the lose-tie-win rewards from 1,2,3 in the first round, to 2,4,6 in the second round, 3,6,9 in the third round, and so on.

This is why the safest strategy is not to have a deterministic strategy but, instead, to use a source of random bits to pick, each round, option 1 with a probability of 33%, option 2 with a probability of 33% or option 3 with a probability of 33% (modulo rounding).  You might not get to take advantage of any predictability that becomes apparent in your opponent's strategy, but neither can you be fooled into becoming predictable yourself.

On a side note, this still applies even when there is only one round, because unaided humans are not as good at coming up with random bits as they think they are.  Someone who has observed many first time players will notice that first time players more often than not choose Rock as their 'random' first move, rather than Paper or Scissors.  If such a person were confident that they were playing a first time player, they might therefore pick Paper as their first move more frequently than not.  Things soon get very Sicilian (in the sense of the duel between Westley and Vizzini in the film The Princess Bride) after that, because a yet more sophisticated player who guessed their opponent would try this could then pick Scissors.  And so ad infinitum, with ever more implausible levels of discernment being required to react on the next level up.

We can imagine a tournament set up between 100 players taken randomly from the expertise distribution of game players, each player submitting a python program that always plays the same first move, and for each of the remaining 9 rounds produces a move determined solely by the moves so far in that duel.  The tournament organiser would then run every player's program once against the programs of each of the other 99 players, so on average each player would collect 99x10x2 = \$1,980

We could make things more complex by allowing the programs to use, as an input, how much money their opponent has won so far during the tournament; or iterate over running the tournament several times, to give each player an 'expertise' rating which the program in the following tournament could then use.  We could allow the tournament host to subtract from each player a sum of money depending upon the size of program that player submitted (and how much memory or cpu it used).   We could give each player a limited ration of random bits, so when facing a player with a higher expertise rating they might splurge and make their move on all 10 rounds completely random, and when facing a player with a lower expertise they might conserve their supply by trying to 'out think' them.

There are various directions we could take this, but the one I want to look at here is what happens when you make the payoff matrix asymmetric.  What happens if you make the game unfair, so not only does one player have more at stake than the other player, but the options are not even either, for example:

You still have the circular Rock-Paper-Scissors dynamic where:
If B chose B3, then A wants most to have chosen A1
If A chose A1, then B wants most to have chosen B2
If B chose B2, then A wants most to have chosen A3
If A chose A3, then B wants most to have chosen B1
If B chose B1, then A wants most to have chosen A2
If A chose A2, then B wants most to have chosen B3

so everything wins against at least one other option, and loses against at least one other option.   However Player B is clearly now in a better position, because B wins ties, and B's wins (a 9, an 8 and a 7) tend to be larger than A's wins (a 9, a 6 and a 6).

What should Player A do?  Is the optimal safe strategy still to pick each option with an equal weighting?

Well, it turns out the answer is: no, an equal weighting isn't the optimal response.   Neither is just picking the same 'best' option each time.  Instead, what you do is pick your 'best' option a bit more frequently than an equal weighting would suggest, but not so much that the opponent can steal away that gain by reliably choosing the specific option that trumps yours.   Rather than duplicate material already well presented on the web, I will point you at two lecture courses on game theory that explain how to calculate the exact probability to assign to each option:

You do this by using the indifference theorem to arrive at a set of linear equations, which you can then solve to arrive at a mixed equilibrium where neither player increases their expected utility by altering the probability weightings they assign to their options.
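As an illustration (not taken from those lectures), here is the indifference calculation for the symmetric 3/2/1 Rock-Paper-Scissors rewards from earlier, where the answer is known to be equal weightings:

```python
import numpy as np

# Player A's payoff matrix for the host-rewarded Rock-Paper-Scissors
# game (win = 3, tie = 2, loss = 1).
# Rows: A's option (Rock, Paper, Scissors); columns: B's option.
M = np.array([[2.0, 1.0, 3.0],
              [3.0, 2.0, 1.0],
              [1.0, 3.0, 2.0]])

# Indifference: B's mixed strategy q should make A's expected payoff
# (M @ q) equal across all of A's pure options, with q summing to 1.
eqs = np.vstack([M[0] - M[1],    # payoff of A's option 0 == option 1
                 M[1] - M[2],    # payoff of A's option 1 == option 2
                 np.ones(3)])    # q1 + q2 + q3 == 1
rhs = np.array([0.0, 0.0, 1.0])
q = np.linalg.solve(eqs, rhs)
print(q)   # approximately [1/3, 1/3, 1/3], as expected for a symmetric game
```

Substituting an asymmetric payoff matrix for `M` yields the unequal weightings this post describes; the indifference equations themselves are unchanged.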

### The TL;DR points to take away

If you are competing in what is effectively a simultaneous option choice game, with a being who you suspect has expertise at the game equal to or higher than your own, you can nullify their advantage by picking a strategy that, each round, chooses randomly (using a weighting) between the available options.

Depending upon the details of the payoff matrix, there may be one option that it makes sense for you to pick most of the time but, unless that option is strictly better than all your other choices no matter what option your opponent picks, there is still utility to gain from occasionally picking the other options in order to keep your opponent on their toes.

Back to Part 1 - stating the problem
This is  Part 2 - some mathematics
Next to Part 3 - towards a solution

## A solvable Newcomb-like problem - part 1 of 3

1 03 December 2012 09:26AM

This is the first part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

Part 1 - stating the problem
Part 2 - some mathematics
Part 3 - towards a solution

Omega is an AI, living in a society of AIs, who wishes to enhance his reputation in that society for being successfully able to predict human actions.  Given some exchange rate between money and reputation, you could think of that as a bet between him and another AI, let's call it Alpha.  And since there is also a human involved, for the sake of clarity, to avoid using "you" all the time, I'm going to sometimes refer to the human using the name "Fred".

Omega tells Fred:

I'd like you to pick between two options, and I'm going to try to predict which option you're going to pick.
Option "one box" is to open only box A, and take any money inside it
Option "two box" is to open both box A and box B, and take any money inside them

but before you pick your option, declare it, and then open the box or boxes, there are three things you need to know.

Firstly, you need to know the terms of my bet with Alpha.

If Fred picks option "one box" then:
If box A contains \$1,000,000 and box B contains \$1,000 then Alpha pays Omega \$1,000,000,000
If box A contains \$0              and box B contains \$1,000 then Omega pays Alpha \$10,000,000,000
If anything else, then both Alpha and Omega pay Fred \$1,000,000,000,000

If Fred picks option "two box" then:
If box A contains \$1,000,000 and box B contains \$1,000 then Omega pays Alpha \$10,000,000,000
If box A contains \$0              and box B contains \$1,000 then Alpha pays Omega \$1,000,000,000
If anything else, then both Alpha and Omega pay Fred \$1,000,000,000,000

Secondly, you should know that I've already placed all the money in the boxes that I'm going to, and I can't change the contents of the boxes between now and when you do the opening, because Alpha is monitoring everything.  I've already made my prediction, using a model I've constructed of your likely reactions based upon your past actions.

You can use any method you like to choose between the two options, short of contacting another AI, but be warned that if my model predicted that you'll use a method which introduces too large a random element (such as tossing a coin) then, while I may lose my bet with Alpha, I'll certainly have made sure you won't win the \$1,000,000.  Similarly, if my model predicted that you'd make an outside bet with another human (let's call him George) to alter the value of winning \$1,001,000 from me, I'd have also taken that into account.  (I say "human", by the way, because my bet with Alpha is about my ability to predict humans so if you contact another AI, such as trying to lay a side bet with Alpha to skim some of his winnings, that invalidates not only my game with you, but also my bet with Alpha, and there are no winnings to skim.)

And, third and finally, you need to know my track record in previous similar situations.

I've played this game 3,924 times over the past 100 years (i.e. since the game started), with humans picked at random from the full variety of the population.   The outcomes were:
3000 times players picked option "one box" and walked away with \$1,000,000
900  times players picked option "two box" and walked away with \$1,000
24 times players flipped a coin or were otherwise too random.  Of those players:
12 players picked option "one box" and walked away with \$0
12 players picked option "two box" and walked away with \$1,000

Never has anyone ever ended up walking away with \$1,001,000 by picking option "two box".

Omega stops talking.   You are standing in a room containing two boxes, labelled "A" and "B", which are both currently closed.  Everything Omega said matches what you expected him to say, as the conditions of the game are always the same and are well known - you've talked with other human players (who confirmed it is legit) and listened to their advice.   You've not contacted any AIs, though you have read the published statement from Alpha that also confirms the terms of the bet and details of the monitoring.  You've not made any bets with other humans, even though your dad did offer to bet you a bottle of whiskey that you'd be one of them too smart alecky fools who walked away with only \$1,000.  You responded by pre-committing to keep any winnings you make between you and your banker, and to never let him know.

The only relevant physical object you've brought along is a radioactive decay based random number generator, that Omega would have been unable to predict the result of in advance, just in case you decide to use it as a factor in your choice.  It isn't a coin, giving only a 50% chance of "one box" and a 50% chance of "two box".   You can set arbitrary odds (tell it to generate a random integer between 0 and any positive integer you give it, up to 10 to the power of 100).   Omega said in his spiel the phrase "too large a random element" but didn't specify where that boundary was.
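For concreteness, a sketch of how Fred might turn such a device into a weighted choice (the function and default weights are illustrative only):

```python
import random   # stand-in for the radioactive-decay device

def choose(one_box_weight=99, total=100):
    """Ask the device for a uniform random integer in [0, total-1] and
    pick "one box" with probability one_box_weight/total."""
    n = random.randint(0, total - 1)
    return "one box" if n < one_box_weight else "two box"

choose()                       # "one box" 99% of the time
choose(10**100 - 1, 10**100)   # an almost-certain "one box"
```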

What do you do?   Or, given that such a situation doesn't exist yet, and we're talking about a Fred in a possible future, what advice would you give to Fred on how to choose, were he to ever end up in such a situation?

Pick "one box"?   Pick "two box"?   Or pick randomly between those two choices and, if so, at what odds?

And why?

This is Part 1 - stating the problem
Next to Part 2 - some mathematics
Part 3 - towards a solution

## Is Omega Impossible? Can we even ask?

-8 24 October 2012 02:47PM

EDIT: I see by the karma bombing we can't even ask.  Why even call this part of the site "discussion?"

Some of the classic questions about an omnipotent god include

1. Can god make a square circle?
2. Can god create an immovable object?  And then move it?

Saints and philosophers wrestled with these issues back before there was television.  My recollection is that people who liked the idea of an omnipotent god would answer "omnipotence does not include the power to do nonsense", where they would generally include contradictions as nonsense.  So omnipotence can't square a circle, can't make 2=3, can't make an atom which is simultaneously lead and gold.

But where do the contradictions end and the merely difficult to conceive begin?  Can omnipotence make the ratio of the circumference to the diameter of a circle = 3, or 22/7?  Can omnipotence make sqrt(2)=1.4 or 2+2=5?  While these are not directly self-contradictory statements, they can be used with a variety of simple truths to quickly derive self-contradictory statements.  Can we then conclude that "2+2=5" is essentially a contradiction because it is close to a contradiction?  Where do we draw the line?

What if we are set some problem where we are told to assume that

1. 2+2 = 5
2. 1+1 = 2
3. 1+1+1+1+1 = 5

In solving this set problem, we can quickly derive that 1=0, and use that to prove effectively anything we want to prove.  Perhaps not formally, but we have violated the law of non-contradiction - that no statement is true together with its negation.  Once you violate that, you can prove ANYTHING using simple laws of inference, because you have propositions that are both true and false.
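To spell the first derivation out:

```latex
\begin{align*}
1+1+1+1 \;=\; 2+2 \;&=\; 5 && \text{by assumptions (2) and (1)} \\
\;&=\; 1+1+1+1+1 && \text{by assumption (3)} \\
\text{so}\quad 0 \;&=\; 1 && \text{subtracting } 1+1+1+1 \text{ from both sides.}
\end{align*}
```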

What if we set a problem where we are told to assume

1. Omega is an infallible intelligence that does not lie
2. Omega tells you 2+2=5

Well, we are going to have the same problem as above: we will be able to prove anything.

### Newcomb's Problem

In Newcomb's box problem, we are told to assume that

1. Omega is an infallible intelligence
2. Omega has predicted correctly whether we will one box or two box.

From these assumptions we wind up with all sorts of problems of causality and/or free will and/or determinism.

What if these statements are not consistent?  What if these statements are tantamount to assuming 0=1, or are within a few steps of assuming 0=1?  Or something just as contradictory, but harder to identify?

Personally, I can think of LOTS of reasons to doubt that Newcomb's problem is even theoretically possible to set.  Beyond that, the empirical barrier to believing Omega exists in reality would be gigantic: millions of humans have watched magic shows performed by non-superior intelligences where cards we have signed have turned up in a previously sealed envelope or wallet or audience member's pocket.  We recognize that these are tricks, that they are not what they appear.

To question Omega is not playing by the mathematician's or philosopher's rules.  But when we play by the rules, do we blithely assume 2+2=5 and then wrap ourselves around the logical axle trying to program a friendly AI to one-box?  Why is questioning Omega's possibility of existence, or possibility of proof of existence out-of-bounds?

## Omega lies

7 24 October 2012 10:46AM

Just developing my second idea at the end of my last post. It seems to me that in the Newcomb problem and in the counterfactual mugging, the completely trustworthy Omega lies to a greater or lesser extent.

This is immediately obvious in scenarios where Omega simulates you in order to predict your reaction. In the Newcomb problem, the simulated you is told "I have already made my decision...", which is not true at that point, and in the counterfactual mugging, whenever the coin comes up heads, the simulated you is told "the coin came up tails". And the arguments only go through because these lies are accepted by the simulated you as being true.

If Omega doesn't simulate you, but uses other methods to gauge your reactions, he isn't lying to you per se. But he is estimating your reaction in the hypothetical situation where you were fed untrue information that you believed to be true. And that you believed to be true, specifically because the source is Omega, and Omega is trustworthy.

Doesn't really change much to the arguments here, but it's a thought worth bearing in mind.

## Naive TDT, Bayes nets, and counterfactual mugging

15 23 October 2012 03:58PM

I set out to understand precisely why naive TDT (possibly) fails the counterfactual mugging problem. While doing this I ended up drawing a lot of Bayes nets, and seemed to gain some insight; I'll pass these on, in the hopes that they'll be useful. All errors are, of course, my own.

## The grand old man of decision theory: the Newcomb problem

First let's look at the problem that inspired all this research: the Newcomb problem. In this problem, a supremely-insightful-and-entirely-honest superbeing called Omega presents two boxes to you, and tells you that you can either choose box A only ("1-box"), or take box A and box B ("2-box"). Box B will always contain \$1K (one thousand dollars). Omega has predicted what your decision will be, though, and if you decided to 1-box, he's put \$1M (one million dollars) in box A; otherwise he's put nothing in it. The problem can be cast as a Bayes net with the following nodes:

## Can anyone explain to me why CDT two-boxes?

-12 02 July 2012 06:06AM

I have read lots of LW posts on this topic, and everyone seems to take this for granted without giving a proper explanation. So if anyone could explain this to me, I would appreciate that.

This is a simple question that is in need of a simple answer. Please don't link to pages and pages of theorycrafting. Thank you.

Edit: Since posting this, I have come to the conclusion that CDT doesn't actually play Newcomb. Here's a disagreement with that statement:

If you write up a CDT algorithm and then put it into a Newcomb's problem simulator, it will do something. It's playing the game; maybe not well, but it's playing.

And here's my response:

The thing is, an actual Newcomb simulator can't possibly exist because Omega doesn't exist. There are tons of workarounds, like using coin tosses as a substitute for Omega and ignoring the results whenever the coin was wrong, but that is something fundamentally different from Newcomb.

You can only simulate Newcomb in theory, and it is perfectly possible to just not play a theoretical game, if you reject the theory it is based on. In theoretical Newcomb, CDT doesn't care about the rule of Omega being right, so CDT does not play Newcomb.

If you're trying to simulate Newcomb in reality by substituting someone who has merely been proven right empirically for Omega, you replace Newcomb with a problem that consists of little more than a simple calculation of priors and payoffs, and that's hardly the point here.
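The coin-toss workaround mentioned above can be sketched as a rejection-sampling simulator (a toy illustration of the idea, not anyone's actual implementation):

```python
import random

def coin_toss_newcomb(decide):
    """Simulate Newcomb with a coin standing in for Omega:
    rerun (without payoffs) until the coin matches the decision."""
    while True:
        prediction = random.choice(["1-box", "2-box"])  # the 'coin'
        action = decide()
        if prediction == action:
            break  # mismatched runs are discarded
    box_a = 1_000_000 if prediction == "1-box" else 0
    box_b = 1_000
    return box_a if action == "1-box" else box_a + box_b

print(coin_toss_newcomb(lambda: "1-box"))  # always 1000000
print(coin_toss_newcomb(lambda: "2-box"))  # always 1000
```

This makes the disanalogy concrete: because mismatched runs are thrown away, the agent's action effectively determines the "prediction", so an agent that knows the mechanism does best by one-boxing for straightforwardly causal reasons, which is exactly why this is something fundamentally different from Newcomb.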

Edit 2: Clarification regarding backwards causality, which seems to confuse people:

Newcomb assumes that Omega is omniscient, which crucially means that the decision you make right now determines whether Omega has put money in the box or not. Obviously this is backwards causality, and therefore not possible in real life, which is why Nozick doesn't spend too much ink on this.

But if you rule out the possibility of backwards causality, Omega can only base his prediction of your decision on your actions up to the point where he has to decide whether to put money in the box or not. In that case, if you take two people who have so far always acted (decided) identically, but one will one-box while the other will two-box, Omega cannot make different predictions for them. And no matter what prediction Omega makes, you don't want to be the one who one-boxes.
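The "no matter what prediction Omega makes" step is a dominance argument, and it can be tabulated (a sketch using the standard \$1M/\$1K payoffs):

```python
# Payoff table for (Omega's prediction, your action), in dollars.
PAYOFF = {
    ("1-box", "1-box"): 1_000_000,
    ("1-box", "2-box"): 1_001_000,
    ("2-box", "1-box"): 0,
    ("2-box", "2-box"): 1_000,
}

# With the prediction held fixed -- the contents of the boxes are
# already settled -- two-boxing is better in each column.
for prediction in ("1-box", "2-box"):
    gain = PAYOFF[(prediction, "2-box")] - PAYOFF[(prediction, "1-box")]
    print(prediction, gain)  # prints 1000 for both columns
```

Since your choice can't causally change a prediction that's already been made, CDT treats each column as fixed and takes the action that's better in both, which is the simple answer to the post's question.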

Edit 3: Further clarification on the possible problems that could be considered Newcomb:

There's four types of Newcomb problems:

1. Omniscient Omega (backwards causality) - CDT rejects this case, which cannot exist in reality.
2. Fallible Omega, but still backwards causality - CDT rejects this case, which cannot exist in reality.
3. Infallible Omega, no backwards causality - CDT correctly two-boxes. To improve payouts, CDT would have to have decided differently in the past, which is not decision theory anymore.
4. Fallible Omega, no backwards causality - CDT correctly two-boxes. To improve payouts, CDT would have to have decided differently in the past, which is not decision theory anymore.

That's all there is to it.

Edit 4: Excerpt from Nozick's "Newcomb's Problem and Two Principles of Choice":

Now, at last, to return to Newcomb's example of the predictor. If one believes, for this case, that there is backwards causality, that your choice causes the money to be there or not, that it causes him to have made the prediction that he made, then there is no problem. One takes only what is in the second box. Or if one believes that the way the predictor works is by looking into the future; he, in some sense, sees what you are doing, and hence is no more likely to be wrong about what you do than someone else who is standing there at the time and watching you, and would normally see you, say, open only one box, then there is no problem. You take only what is in the second box. But suppose we establish or take as given that there is no backwards causality, that what you actually decide to do does not affect what he did in the past, that what you actually decide to do is not part of the explanation of why he made the prediction he made. So let us agree that the predictor works as follows: He observes you sometime before you are faced with the choice, examines you with complicated apparatus, etc., and then uses his theory to predict on the basis of this state you were in, what choice you would make later when faced with the choice. Your deciding to do as you do is not part of the explanation of why he makes the prediction he does, though your being in a certain state earlier, is part of the explanation of why he makes the prediction he does, and why you decide as you do.

I believe that one should take what is in both boxes. I fear that the considerations I have adduced thus far will not convince those proponents of taking only what is in the second box. Furthermore I suspect that an adequate solution to this problem will go much deeper than I have yet gone or shall go in this paper. So I want to pose one question. I assume that it is clear that in the vaccine example, the person should not be convinced by the probability argument, and should choose the dominant action. I assume also that it is clear that in the case of the two brothers, the brother should not be convinced by the probability argument offered. The question I should like to put to proponents of taking only what is in the second box in Newcomb's example (and hence not performing the dominant action) is: what is the difference between Newcomb's example and the other two examples which make the difference between not following the dominance principle, and following it?

## Extremely Counterfactual Mugging or: the gist of Transparent Newcomb

4 09 February 2011 03:20PM

Omega will either award you \$1000 or ask you to pay him \$100. He will award you \$1000 if he predicts you would pay him if he asked. He will ask you to pay him \$100 if he predicts you wouldn't pay him if he asked.

Omega asks you to pay him \$100. Do you pay?

This problem is roughly isomorphic to the branch of Transparent Newcomb (version 1, version 2) where box B is empty, but it's simpler.
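The expected values of the two possible policies can be worked out directly (a toy calculation; the \$1000 and \$100 figures are from the problem statement, and the `accuracy` parameter is my own generalization of the perfect-predictor case):

```python
def expected_value(policy_pays_when_asked, accuracy=1.0):
    """Expected payoff of a policy, given Omega predicts it
    correctly with probability `accuracy`."""
    p_predicted_payer = accuracy if policy_pays_when_asked else 1 - accuracy
    ev_awarded = 1_000  # predicted "would pay": awarded, never asked
    # Predicted "wouldn't pay": asked for $100; you pay iff your policy says to.
    ev_asked = -100 if policy_pays_when_asked else 0
    return p_predicted_payer * ev_awarded + (1 - p_predicted_payer) * ev_asked

print(expected_value(True))   # 1000.0
print(expected_value(False))  # 0.0
```

With a perfect predictor, the paying policy nets \$1000 and the refusing policy nets \$0: the payment itself only ever loses money, yet the disposition to pay is what gets you the award, which is the gist the post is pointing at.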

Here's a diagram:

## Omega can be replaced by amnesia

15 26 January 2011 12:31PM

Let's play a game. Two times, I will give you an amnesia drug and let you enter a room with two boxes inside. Because of the drug, you won't know whether this is the first time you've entered the room. On the first time, both boxes will be empty. On the second time, box A contains \$1000, and box B contains \$1,000,000 iff this is the second time and you took only box B the first time. You're in the room: do you take both boxes or only box B?

This is equivalent to Newcomb's Problem in the sense that any strategy does equally well on both, where by "strategy" I mean a mapping from info to (probability distributions over) actions.
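The equivalence can be checked for the two deterministic strategies (a toy check using this post's box labels, where A holds \$1000 and B the \$1,000,000; mixed strategies work out the same way in expectation):

```python
def newcomb_payoff(choice):
    # Perfect Omega: the prediction equals the deterministic choice.
    box_a = 1_000
    box_b = 1_000_000 if choice == "B" else 0
    return box_a + box_b if choice == "both" else box_b

def amnesia_payoff(choice):
    # A deterministic strategy, unable to tell the visits apart,
    # makes the same choice on both visits.
    first, second = choice, choice
    box_a = 1_000
    box_b = 1_000_000 if first == "B" else 0  # filled based on visit 1
    return box_a + box_b if second == "both" else box_b  # paid on visit 2

for choice in ("B", "both"):
    assert newcomb_payoff(choice) == amnesia_payoff(choice)
print([amnesia_payoff(c) for c in ("B", "both")])  # [1000000, 1000]
```

The amnesia plays the role of Omega's prediction: because the strategy can't distinguish the visits, its first-visit behavior is a perfect "prediction" of its second-visit behavior.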

I suspect that any problem with Omega can be transformed into an equivalent problem with amnesia instead of Omega.

Does CDT return the winning answer in such transformed problems?

Discuss.