A solvable Newcomb-like problem - part 1 of 3

Douglas_Reay

This is the first part of a three-post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

   Part 1 - stating the problem
   Part 2 - some mathematics
   Part 3 - towards a solution

Omega is an AI, living in a society of AIs, who wishes to enhance his reputation in that society for successfully predicting human actions. Given some exchange rate between money and reputation, you could think of that as a bet between him and another AI, let's call it Alpha. And since there is also a human involved, for the sake of clarity, to avoid using "you" all the time, I'm going to sometimes refer to the human using the name "Fred".

Omega tells Fred:

I'd like you to pick between two options, and I'm going to try to predict which option you're going to pick.
Option "one box" is to open only box A, and take any money inside it
Option "two box" is to open both box A and box B, and take any money inside them

but, before you pick your option, declare it, then open the box or boxes, there are three things you need to know.

Firstly, you need to know the terms of my bet with Alpha.

If Fred picks option "one box" then:
   If box A contains $1,000,000 and box B contains $1,000 then Alpha pays Omega $1,000,000,000
   If box A contains $0              and box B contains $1,000 then Omega pays Alpha $10,000,000,000
   If anything else, then both Alpha and Omega pay Fred $1,000,000,000,000

If Fred picks option "two box" then:
   If box A contains $1,000,000 and box B contains $1,000 then Omega pays Alpha $10,000,000,000
   If box A contains $0              and box B contains $1,000 then Alpha pays Omega $1,000,000,000
   If anything else, then both Alpha and Omega pay Fred $1,000,000,000,000

Secondly, you should know that I've already placed all the money in the boxes that I'm going to, and I can't change the contents of the boxes between now and when you do the opening, because Alpha is monitoring everything. I've already made my prediction, using a model I've constructed of your likely reactions based upon your past actions.

You can use any method you like to choose between the two options, short of contacting another AI, but be warned that if my model predicted that you'll use a method which introduces too large a random element (such as tossing a coin) then, while I may lose my bet with Alpha, I'll certainly have made sure you won't win the $1,000,000. Similarly, if my model predicted that you'd make an outside bet with another human (let's call him George) to alter the value of winning $1,001,000 from me, I'd have also taken that into account. (I say "human", by the way, because my bet with Alpha is about my ability to predict humans, so if you contact another AI, such as by trying to lay a side bet with Alpha to skim some of his winnings, that invalidates not only my game with you but also my bet with Alpha, and there are no winnings to skim.)

And, third and finally, you need to know my track record in previous similar situations.

I've played this game 3,924 times over the past 100 years (i.e. since the game started), with humans picked at random from the full variety of the population. The outcomes were:
   3000 times players picked option "one box" and walked away with $1,000,000
   900 times players picked option "two box" and walked away with $1,000
   24 times players flipped a coin or were otherwise too random. Of those players:
        12 players picked option "one box" and walked away with $0
        12 players picked option "two box" and walked away with $1,000

Never has anyone ever ended up walking away with $1,001,000 by picking option "two box".

Omega stops talking. You are standing in a room containing two boxes, labelled "A" and "B", which are both currently closed. Everything Omega said matches what you expected him to say, as the conditions of the game are always the same and are well known - you've talked with other human players (who confirmed it is legit) and listened to their advice. You've not contacted any AIs, though you have read the published statement from Alpha that also confirms the terms of the bet and details of the monitoring. You've not made any bets with other humans, even though your dad did offer to bet you a bottle of whiskey that you'd be one of them too smart alecky fools who walked away with only $1,000. You responded by pre-committing to keep any winnings you make between you and your banker, and to never let him know.

The only relevant physical object you've brought along is a radioactive decay based random number generator, that Omega would have been unable to predict the result of in advance, just in case you decide to use it as a factor in your choice. It isn't a coin, giving only a 50% chance of "one box" and a 50% chance of "two box". You can set arbitrary odds (tell it to generate a random integer between 0 and any positive integer you give it, up to 10 to the power of 100). Omega said in his spiel the phrase "too large a random element" but didn't specify where that boundary was.

What do you do? Or, given that such a situation doesn't exist yet, and we're talking about a Fred in a possible future, what advice would you give to Fred on how to choose, were he to ever end up in such a situation?

Pick "one box"? Pick "two box"? Or pick randomly between those two choices and, if so, at what odds?

And why?

One box, of course. Trying to outsmart an AI for a piddly little 0.1% increase in payoff is stupid.

Now if the payoff were reversed a player with high risk tolerance might reasonably go for some clever two-box solution... but the odds of success would be quite low, so one-boxing would still be the conservative strategy.

Omega's best strategy is to put money in both boxes if you one-box with probability >0.5. Omega's expected winnings are positive if you one-box with probability >10/11 (I think), but I don't presume to know enough about superintelligent AI psychology to know whether that's Omega's cutoff for changing strategy from "maximise money" to "screw you", so I'd just one-box.
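
The 10/11 figure can be checked with a short sketch. It assumes Omega has predicted "one box" and filled both boxes, so that under the stated bet Omega wins $1,000,000,000 when Fred one-boxes and pays $10,000,000,000 when Fred two-boxes. The function names are illustrative and not part of the original problem.

```python
from fractions import Fraction

OMEGA_WINS = 1_000_000_000      # Alpha pays Omega on a correct prediction
OMEGA_LOSES = 10_000_000_000    # Omega pays Alpha on a wrong prediction

def omega_expected_winnings(p_one_box):
    """Omega's expected winnings when both boxes are filled and Fred
    one-boxes with probability p_one_box (an assumed model of Fred)."""
    p = Fraction(p_one_box)
    return p * OMEGA_WINS - (1 - p) * OMEGA_LOSES

def break_even_probability():
    """Solve p * OMEGA_WINS - (1 - p) * OMEGA_LOSES = 0 for p."""
    return Fraction(OMEGA_LOSES, OMEGA_WINS + OMEGA_LOSES)

print(break_even_probability())                    # 10/11
print(omega_expected_winnings(Fraction(10, 11)))   # 0
print(omega_expected_winnings(Fraction(99, 100)))  # 890000000
```

So under these assumptions the expected winnings cross zero at exactly 10/11, which agrees with the estimate above.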

Alternative solution: Offer Omega $1,100,000,000,000 to put nothing in the boxes.

Alternative solution: Offer Omega $1,100,000,000,000 to put nothing in the boxes.

Love it. But, no, that would be caught by Alpha's monitoring before things came to that point.

I was looking for an elegant way to say "These are the only possible options; ignore any chance that there will be any other result (such as $1,000,000 in box B)." I was trying to give a context where the rules made sense. In terms of reputation being exchangeable for money, that represents both Alpha and Omega being immensely embarrassed among the AI society at screwing up the basic physical setup and invalidating their own contest. It is sort of a side issue. :-)

24 times players flipped a coin and or were otherwise too random. Of those players: 12 players picked option "one box" and walked away with $1,000

There are ways this is possible under the stated problem, but I'm not sure it is what you intend, so let me try to lay it out explicitly.

Here are what I see as the straightforward possibilities for a one-boxer:

0 dollars (If Omega put no Money into Box A) (Omega may need to Pay Alpha in this case, but Fred doesn't care about this.)

1,000,000 dollars (If Omega put Money into Box A)

At least 2,000,000,000,000 (If any other item occurred in box A)

So Fred getting 1,000 dollars on a one-box is possible, but I don't see it as straightforward. Here's one way I think it might work:

Since another amount of money occurred, and Omega did say the following:

but be warned that if my model predicted that you'll use a method which introduces too large a random element (such as tossing a coin) then, while I may lose my bet with Alpha, I'll certainly have made sure you won't win the $1,000,000.

Omega may have put a "Fred, as a random chooser, you owe me 999,999,999,000 and Alpha 1,000,000,000,000" UOME into Box A and left Box B empty if Omega predicted Fred would use a random element. (I'm assuming Alpha will not agree to a scenario where Omega can money pump Alpha without Omega even making a correct prediction.)

That would leave Fred with 1,000 dollars, since Omega would have violated the terms, and triggered the two 1,000,000,000,000 payments, which Omega can immediately counter with the above mentioned UOME.

So my piece of advice would be "Fred, ask one of the other humans who randomed and one-boxed because of the random result how they only got 1,000 dollars. That may or may not be exploitable, since it may have been the above, or it may have been something else. But assuming it isn't exploitable (and it probably isn't; Omega probably knows you've talked to me and might be able to put even more negative money or even death in box A), pick one box, and take box A, because people who do that get more money than people who two-box or people who random, and Omega has a vested interest in winning the bets with Alpha."
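
The comparison behind that advice can be read off the track record as corrected in the post (random one-boxers walking away with $0). A minimal sketch, assuming the listed outcomes are complete; the dictionary layout and function name are only for illustration.

```python
# (players, dollars each walked away with), from Omega's stated record
outcomes = {
    "one box": [(3000, 1_000_000)],
    "two box": [(900, 1_000)],
    "random": [(12, 0), (12, 1_000)],
}

def average_winnings(groups):
    """Mean dollars per player over a list of (players, dollars) groups."""
    players = sum(n for n, _ in groups)
    return sum(n * dollars for n, dollars in groups) / players

for strategy, groups in outcomes.items():
    print(strategy, average_winnings(groups))
```

On these figures the one-boxers average $1,000,000, the two-boxers $1,000 and the random choosers $500.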

Thank you. Corrected.

Would the problem be different if, instead of AIs, Alpha and Omega were human magicians? Or human professional skeptics?

If Omega gets it right more than 99% of the time, then why would Alpha take 10-to-1 odds against Omega messing up?

Would the problem be different if we changed the Alpha-Omega betting odds so that if Alpha wins 1 in 10000 times (which is still plausible given the data) then Alpha would make a profit?

Alpha would be running away with it based on the flippers who one-boxed.
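
For reference, here is a rough tally of how the bet would have gone over Omega's reported track record at the stakes stated in the post. It is a sketch that assumes every game was settled at those stakes and that what a player walked away with reveals what was in box A; the data layout is an assumption for illustration.

```python
WIN = 1_000_000_000      # Alpha pays Omega when the prediction is right
LOSS = 10_000_000_000    # Omega pays Alpha when the prediction is wrong

# (games, did Omega win the bet?) taken from the stated track record
track_record = [
    (3000, True),   # one-boxed, box A held $1,000,000
    (900, True),    # two-boxed, box A held $0
    (12, False),    # random, one-boxed, box A held $0
    (12, True),     # random, two-boxed, box A held $0
]

def omega_net(record, win=WIN, loss=LOSS):
    """Omega's total winnings minus losses over the whole record."""
    return sum(n * (win if correct else -loss) for n, correct in record)

games = sum(n for n, _ in track_record)
misses = sum(n for n, correct in track_record if not correct)
print(games, misses)              # 3924 12
print(omega_net(track_record))    # 3792000000000
```

At the stated 10-to-1 stakes Omega comes out well ahead. The 12 misses in 3,924 games are about 1 in 327, far more frequent than 1 in 10,000, which is why odds set to break even at 1 in 10,000 would leave Alpha in profit.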

It is a convenient way of representing reputation gain.

Think of Alpha as being the AI society at large, or a reporter detailed to watch and report on Omega, on behalf of the AI society.

So they're not actually doing an explicit bet. Rather it is Omega wanting to improve Omega's reputation at human-prediction within the AI society by doing the experiment. The more Omega does the experiment and makes correct predictions, the higher the others in AI society will rate Omega's expertise. Presumably it is valuable to Omega in some way to have a high reputation for doing this; maybe he wants to sell an algorithm or database, maybe he wants his advice about a human-related problem taken seriously without disclosing his actual source code and evidence - the reason isn't relevant.

From Fred's perspective, when it comes to understanding and planning for Omega's actions, the dynamics of the situation are close enough that, to a first approximation, it is as though Omega had made a bet with some Alpha.

The odds of that bet are a variable input to the problem, which I shall talk about in part 3, but note for now that this isn't an arbitrage situation where someone else can step in and improve Omega's reputation for him in order to chisel a corner off the odds. The odds for Omega, from the perspective of Omega's knowledge of the situation, will be in favour of improving Omega's net reputation, else he wouldn't be running the experiment.

Interesting variation! What is your exigence for writing these posts?