Timeless Decision Theory: Problems I Can't Solve

27 Eliezer_Yudkowsky 20 July 2009 12:02AM

Suppose you're out in the desert, running out of water, and soon to die - when someone in a motor vehicle drives up next to you.  Furthermore, the driver of the motor vehicle is a perfectly selfish ideal game-theoretic agent, and even further, so are you; and what's more, the driver is Paul Ekman, who's really, really good at reading facial microexpressions.  The driver says, "Well, I'll convey you to town if it's in my interest to do so - so will you give me $100 from an ATM when we reach town?"

Now of course you wish you could answer "Yes", but as an ideal game theorist yourself, you realize that, once you actually reach town, you'll have no further motive to pay off the driver.  "Yes," you say.  "You're lying," says the driver, and drives off leaving you to die.

If only you weren't so rational!

This is the dilemma of Parfit's Hitchhiker, and the above is the standard resolution according to mainstream philosophy's causal decision theory, which also two-boxes on Newcomb's Problem and defects in the Prisoner's Dilemma.  Of course, any self-modifying agent who expects to face such problems - in general, or in particular - will soon self-modify into an agent that doesn't regret its "rationality" so much.  So from the perspective of a self-modifying-AI-theorist, classical causal decision theory is a wash.  And indeed I've worked out what seems like an elegant theory, tentatively labeled "timeless decision theory", which covers these three Newcomblike problems and delivers a first-order answer that is already reflectively consistent, without need to explicitly consider such notions as "precommitment".  Unfortunately this "timeless decision theory" would require a long sequence to write up, and it's not my current highest writing priority unless someone offers to let me do a PhD thesis on it.
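The payoff structure of the hitchhiker dilemma can be sketched in a few lines.  This is only an illustration under stated assumptions: the dollar-equivalent value placed on survival (VALUE_OF_LIFE) is a made-up number, the driver is treated as a perfect predictor of what you will do in town, and the function name is hypothetical.  It is not a formalization of timeless decision theory.

```python
# Illustrative sketch of Parfit's Hitchhiker payoffs.
# VALUE_OF_LIFE is an assumed, made-up figure; only FARE comes from the story.
VALUE_OF_LIFE = 1_000_000  # assumed dollar-equivalent value of surviving
FARE = 100                 # the $100 the driver asks for


def hitchhiker_payoff(pays_in_town: bool) -> int:
    """Payoff to the hitchhiker, assuming the driver predicts perfectly.

    The driver gives the ride only if he reads that you will in fact pay
    once you reach town.
    """
    driver_gives_ride = pays_in_town  # assumption: perfect face-reading
    if not driver_gives_ride:
        return 0  # left in the desert
    return VALUE_OF_LIFE - FARE


# An agent that will refuse to pay once in town never gets there.
refuser = hitchhiker_payoff(pays_in_town=False)
# An agent that will actually pay gets the ride, minus the fare.
payer = hitchhiker_payoff(pays_in_town=True)
print(refuser, payer)
```

Under these assumptions, the agent whose in-town behavior is "pay" does strictly better than the agent whose in-town behavior is "refuse", even though refusing looks better once the ride is already over.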

However, there are some other timeless decision problems for which I do not possess a general theory.

For example, there's a problem introduced to me by Gary Drescher's marvelous Good and Real (OOPS: The below formulation was independently invented by Vladimir Nesov; Drescher's book actually contains a related dilemma in which box B is transparent, and only contains $1M if Omega predicts you will one-box whether B appears full or empty, and Omega has a 1% error rate) which runs as follows:

Suppose Omega (the same superagent from Newcomb's Problem, who is known to be honest about how it poses these sorts of dilemmas) comes to you and says:

"I just flipped a fair coin.  I decided, before I flipped the coin, that if it came up heads, I would ask you for $1000.  And if it came up tails, I would give you $1,000,000 if and only if I predicted that you would give me $1000 if the coin had come up heads.  The coin came up heads - can I have $1000?"
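The tension in Omega's offer shows up when you compare the two possible policies before the coin is flipped.  The sketch below is a minimal calculation, assuming Omega's prediction is always correct and that you value dollars linearly; the function name is hypothetical.  It does not settle what to do after you have already seen heads.

```python
from fractions import Fraction


def policy_expected_value(pays_on_heads: bool,
                          p_heads: Fraction = Fraction(1, 2)) -> Fraction:
    """Expected dollars, evaluated before the flip, for a fixed policy.

    Assumes Omega predicts the policy perfectly (an idealization).
    """
    # Heads: Omega asks for $1000; you lose it only if your policy is to pay.
    heads_payoff = -1000 if pays_on_heads else 0
    # Tails: Omega pays $1,000,000 only if it predicts you would have paid.
    tails_payoff = 1_000_000 if pays_on_heads else 0
    return p_heads * heads_payoff + (1 - p_heads) * tails_payoff


print(policy_expected_value(True))   # policy: pay when asked
print(policy_expected_value(False))  # policy: refuse when asked
```

Evaluated before the flip, the paying policy is worth $499,500 in expectation and the refusing policy is worth $0.  Evaluated after seeing heads, paying simply costs $1000, which is what makes the problem hard.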


The Anthropic Trilemma

14 Eliezer_Yudkowsky 27 September 2009 01:47AM

Speaking of problems I don't know how to solve, here's one that's been gnawing at me for years.

The operation of splitting a subjective worldline seems obvious enough - the skeptical initiate can consider the Ebborians, creatures whose brains come in flat sheets and who can symmetrically divide down their thickness.  The more sophisticated need merely consider a sentient computer program: stop, copy, paste, start, and what was one person has now continued on in two places.  If one of your future selves will see red, and one of your future selves will see green, then (it seems) you should anticipate, with 50% probability each, seeing red or seeing green when you wake up.  That is, it's a known fact that different versions of you will see red, or alternatively green, and you should weight the two anticipated possibilities equally.  (Consider what happens when you're flipping a quantum coin: half your measure will continue into each branch, and subjective probability will follow quantum measure for unknown reasons.)

But if I make two copies of the same computer program, is there twice as much experience, or only the same experience?  Does someone who runs redundantly on three processors get three times as much weight as someone who runs on one processor?

Let's suppose that three copies get three times as much experience.  (If not, then, in a Big universe, large enough that at least one copy of anything exists somewhere, you run into the Boltzmann Brain problem.)

Just as computer programs or brains can split, they ought to be able to merge.  If we imagine a version of the Ebborian species that computes digitally, so that the brains remain synchronized so long as they go on getting the same sensory inputs, then we ought to be able to put two brains back together along the thickness, after dividing them.  In the case of computer programs, we should be able to perform an operation where we compare each pair of corresponding bits in the two programs, and if they are the same, copy them, and if they are different, delete the whole program.  (This seems to establish an equal causal dependency of the final program on the two original programs that went into it.  E.g., if you test the causal dependency via counterfactuals, then disturbing any bit of either original results in the final program being completely different (namely, deleted).)
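The merge operation on programs described above can be sketched as follows.  This is a minimal illustration, assuming each program state is available as a byte string of known length; the function name merge_programs is hypothetical, and comparing whole bytes with XOR stands in for comparing the individual bits (two bytes are equal exactly when all eight of their bits are equal).

```python
from typing import Optional


def merge_programs(a: bytes, b: bytes) -> Optional[bytes]:
    """Merge two program states into one, or delete everything on mismatch.

    Returns a copy of the common state if every corresponding bit agrees,
    and None (standing in for "the whole program is deleted") otherwise.
    """
    if len(a) != len(b):
        return None  # states of different size cannot agree bit for bit
    merged = bytearray()
    for byte_a, byte_b in zip(a, b):
        if byte_a ^ byte_b:  # nonzero XOR means at least one bit differs
            return None      # any difference deletes the whole program
        merged.append(byte_a)
    return bytes(merged)


original = b"\x10\x20\x30"
print(merge_programs(original, original))         # identical copies merge
print(merge_programs(original, b"\x10\x20\x31"))  # one flipped bit: deleted
```

Flipping a single bit in either input changes the result from the merged state to None, which matches the counterfactual test of causal dependency described in the parenthetical.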

So here's a simple algorithm for winning the lottery:
