Timeless Decision Theory and Meta-Circular Decision Theory

24 Eliezer_Yudkowsky 20 August 2009 10:07PM

(This started as a reply to Gary Drescher's comment here in which he proposes a Metacircular Decision Theory (MCDT); but it got way too long so I turned it into an article, which also contains some amplifications on TDT which may be of general interest.)


Ingredients of Timeless Decision Theory

43 Eliezer_Yudkowsky 19 August 2009 01:10AM

Followup to: Newcomb's Problem and Regret of Rationality, Towards a New Decision Theory

Wei Dai asked:

"Why didn't you mention earlier that your timeless decision theory mainly had to do with logical uncertainty? It would have saved people a lot of time trying to guess what you were talking about."

...

All right, fine, here's a fast summary of the most important ingredients that go into my "timeless decision theory".  This isn't so much an explanation of TDT, as a list of starting ideas that you could use to recreate TDT given sufficient background knowledge.  It seems to me that this sort of thing really takes a mini-book, but perhaps I shall be proven wrong.

The one-sentence version is:  Choose as though controlling the logical output of the abstract computation you implement, including the output of all other instantiations and simulations of that computation.
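As a toy illustration of the one-sentence version (not a statement of TDT itself), the sketch below treats the agent's decision as a single function that is run twice: once inside the predictor's simulation and once by the agent. The setting is Newcomb's Problem with its usual figures of $1,000 and $1,000,000, which are not stated in this passage, and the predictor is assumed to simulate the agent perfectly. The names `newcomb_payoff`, `one_box`, and `two_box` are hypothetical.

```python
# Toy model: the same computation is instantiated twice, once in the
# predictor's simulation and once in the agent. Choosing the output of
# the computation therefore fixes both instantiations at once.
# Assumptions (not from the text): perfect simulation, standard Newcomb amounts.

def newcomb_payoff(policy):
    predicted = policy()                      # instantiation 1: predictor's simulation
    box_b = 1_000_000 if predicted == "one-box" else 0
    box_a = 1_000
    actual = policy()                         # instantiation 2: the agent itself
    return box_b if actual == "one-box" else box_a + box_b

def one_box():
    return "one-box"

def two_box():
    return "two-box"

payoffs = {p.__name__: newcomb_payoff(p) for p in (one_box, two_box)}
best = max(payoffs, key=payoffs.get)
print(payoffs, best)
```

Because both calls go through the same function, there is no way in this model to "one-box in the simulation and two-box in reality"; the comparison is between whole computations rather than isolated acts.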

The three-sentence version is:  Factor your uncertainty over (impossible) possible worlds into a causal graph that includes nodes corresponding to the unknown outputs of known computations; condition on the known initial conditions of your decision computation to screen off factors influencing the decision-setup; compute the counterfactuals in your expected utility formula by surgery on the node representing the logical output of that computation.


Timeless Decision Theory: Problems I Can't Solve

39 Eliezer_Yudkowsky 20 July 2009 12:02AM

Suppose you're out in the desert, running out of water, and soon to die - when someone in a motor vehicle drives up next to you.  Furthermore, the driver of the motor vehicle is a perfectly selfish ideal game-theoretic agent, and even further, so are you; and what's more, the driver is Paul Ekman, who's really, really good at reading facial microexpressions.  The driver says, "Well, I'll convey you to town if it's in my interest to do so - so will you give me $100 from an ATM when we reach town?"

Now of course you wish you could answer "Yes", but as an ideal game theorist yourself, you realize that, once you actually reach town, you'll have no further motive to pay off the driver.  "Yes," you say.  "You're lying," says the driver, and drives off leaving you to die.

If only you weren't so rational!
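The exchange above can be put as a small payoff comparison. This is only a sketch under stated assumptions: the driver reads your disposition perfectly, and surviving is worth a hypothetical 1,000,000 (a made-up figure, chosen only to be much larger than the $100 fare). The names `VALUE_OF_SURVIVAL`, `FARE`, and `outcome` are illustrative.

```python
# Toy payoff model of the desert exchange.
# Assumptions (not from the text): the driver is a perfect reader of your
# disposition, and survival is worth a hypothetical 1,000,000.

VALUE_OF_SURVIVAL = 1_000_000  # hypothetical figure
FARE = 100                     # the $100 the driver asks for

def outcome(pays_once_in_town: bool) -> int:
    # The driver gives the ride only if he reads you as someone who will pay.
    driver_believes_you = pays_once_in_town
    if not driver_believes_you:
        return 0  # left in the desert
    return VALUE_OF_SURVIVAL - (FARE if pays_once_in_town else 0)

print(outcome(True), outcome(False))
```

In this model the agent who would actually pay ends up far better off than the agent who would not, even though refusing to pay looks free once town has been reached.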

This is the dilemma of Parfit's Hitchhiker, and the above is the standard resolution according to mainstream philosophy's causal decision theory, which also two-boxes on Newcomb's Problem and defects in the Prisoner's Dilemma.  Of course, any self-modifying agent who expects to face such problems - in general, or in particular - will soon self-modify into an agent that doesn't regret its "rationality" so much.  So from the perspective of a self-modifying-AI-theorist, classical causal decision theory is a wash.  And indeed I've worked out a theory, tentatively labeled "timeless decision theory", which covers these three Newcomblike problems and delivers a first-order answer that is already reflectively consistent, without need to explicitly consider such notions as "precommitment".  Unfortunately this "timeless decision theory" would require a long sequence to write up, and it's not my current highest writing priority unless someone offers to let me do a PhD thesis on it.

However, there are some other timeless decision problems for which I do not possess a general theory.

For example, there's a problem introduced to me by Gary Drescher's marvelous Good and Real (OOPS: The below formulation was independently invented by Vladimir Nesov; Drescher's book actually contains a related dilemma in which box B is transparent, and only contains $1M if Omega predicts you will one-box whether B appears full or empty, and Omega has a 1% error rate) which runs as follows:

Suppose Omega (the same superagent from Newcomb's Problem, who is known to be honest about how it poses these sorts of dilemmas) comes to you and says:

"I just flipped a fair coin.  I decided, before I flipped the coin, that if it came up heads, I would ask you for $1000.  And if it came up tails, I would give you $1,000,000 if and only if I predicted that you would give me $1000 if the coin had come up heads.  The coin came up heads - can I have $1000?"

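The arithmetic behind Omega's offer can be sketched as follows. This is a toy calculation, not a resolution of the dilemma: it assumes Omega's prediction is perfect and that dollars can be compared directly, and the name `expected_value` is hypothetical.

```python
from fractions import Fraction

# Expected dollars for each policy, evaluated before the coin is flipped.
# Assumptions (not from the text): Omega predicts perfectly; utility is linear in dollars.

P_HEADS = Fraction(1, 2)  # fair coin

def expected_value(pays_on_heads: bool) -> Fraction:
    heads = -1_000 if pays_on_heads else 0      # heads: Omega asks for $1000
    tails = 1_000_000 if pays_on_heads else 0   # tails: reward only for predicted payers
    return P_HEADS * heads + (1 - P_HEADS) * tails

ev_pay = expected_value(True)
ev_refuse = expected_value(False)
print(ev_pay, ev_refuse)
```

Evaluated before the flip, the paying policy comes out ahead under these assumptions. Evaluated after seeing heads, paying simply costs $1000, and that gap between the two viewpoints is what makes the problem hard.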