Outlawing Anthropics: An Updateless Dilemma
Let us start with a (non-quantum) logical coinflip - say, look at the heretofore-unknown-to-us-personally 256th binary digit of pi, where the choice of binary digit is itself intended not to be random.
If the result of this logical coinflip is 1 (aka "heads"), we'll create 18 of you in green rooms and 2 of you in red rooms, and if the result is "tails" (0), we'll create 2 of you in green rooms and 18 of you in red rooms.
After going to sleep at the start of the experiment, you wake up in a green room.
With what degree of credence do you believe - what is your posterior probability - that the logical coin came up "heads"?
There are exactly two tenable answers that I can see, "50%" and "90%".
Suppose you reply 90%.
And suppose you also happen to be "altruistic" enough to care about what happens to all the copies of yourself. (If your current system cares about yourself and your future, but doesn't care about very similar xerox-siblings, then you will tend to self-modify to have future copies of yourself care about each other, as this maximizes your expectation of pleasant experience over future selves.)
Then I attempt to force a reflective inconsistency in your decision system, as follows:
I inform you that, after I look at the unknown binary digit of pi, I will ask all the copies of you in green rooms whether to pay $1 to every version of you in a green room and steal $3 from every version of you in a red room. If they all reply "Yes", I will do so.
Timeless Decision Theory and Meta-Circular Decision Theory
(This started as a reply to Gary Drescher's comment here in which he proposes a Metacircular Decision Theory (MCDT); but it got way too long so I turned it into an article, which also contains some amplifications on TDT which may be of general interest.)
Ingredients of Timeless Decision Theory
Followup to: Newcomb's Problem and Regret of Rationality, Towards a New Decision Theory
Wei Dai asked:
"Why didn't you mention earlier that your timeless decision theory mainly had to do with logical uncertainty? It would have saved people a lot of time trying to guess what you were talking about."
...
All right, fine, here's a fast summary of the most important ingredients that go into my "timeless decision theory". This isn't so much an explanation of TDT, as a list of starting ideas that you could use to recreate TDT given sufficient background knowledge. It seems to me that this sort of thing really takes a mini-book, but perhaps I shall be proven wrong.
The one-sentence version is: Choose as though controlling the logical output of the abstract computation you implement, including the output of all other instantiations and simulations of that computation.
The three-sentence version is: Factor your uncertainty over (impossible) possible worlds into a causal graph that includes nodes corresponding to the unknown outputs of known computations; condition on the known initial conditions of your decision computation to screen off factors influencing the decision-setup; compute the counterfactuals in your expected utility formula by surgery on the node representing the logical output of that computation.
Counterfactual Mugging v. Subjective Probability
This has been in my drafts folder for ages, but in light of Eliezer's post yesterday, I thought I'd see if I could get some comment on it:
A couple weeks ago, Vladimir Nesov stirred up the biggest hornet's nest I've ever seen on LW by introducing us to the Counterfactual Mugging scenario.
If you didn't read it the first time, please do -- I don't plan to attempt to summarize. Further, if you don't think you would give Omega the $100 in that situation, I'm afraid this article will mean next to nothing to you.
So, those still reading, you would give Omega the $100. You would do so because if someone told you about the problem now, you could do the expected utility calculation 0.5*U(-$100)+0.5*U(+$10000)>0. Ah, but where did the 0.5s come from in your calculation? Well, Omega told you he flipped a fair coin. Until he did, there existed a 0.5 probability of either outcome. Thus, for you, hearing about the problem, there is a 0.5 probability of your encountering the problem as stated, and a 0.5 probability of your encountering the corresponding situation, in which Omega either hands you $10000 or doesn't, based on his prediction. This is all very fine and rational.
So, new problem. Let's leave money out of it, and assume Omega hands you 1000 utilons in one case, and asks for them in the other -- exactly equal utility. What if there is an urn, and it contains either a red or a blue marble, and Omega looks, maybe gives you the utility if the marble is red, and asks for it if the marble is blue? What if you have devoted considerable time to determining whether the marble is red or blue, and your subjective probability has fluctuated over the course of you life? What if, unbeknownst to you, a rationalist community has been tracking evidence of the marble's color (including your own probability estimates), and running a prediction market, and Omega now shows you a plot of the prices over the past few years?
In short, what information do you use to calculate the probability you plug into the EU calculation?
Timeless Decision Theory: Problems I Can't Solve
Suppose you're out in the desert, running out of water, and soon to die - when someone in a motor vehicle drives up next to you. Furthermore, the driver of the motor vehicle is a perfectly selfish ideal game-theoretic agent, and even further, so are you; and what's more, the driver is Paul Ekman, who's really, really good at reading facial microexpressions. The driver says, "Well, I'll convey you to town if it's in my interest to do so - so will you give me $100 from an ATM when we reach town?"
Now of course you wish you could answer "Yes", but as an ideal game theorist yourself, you realize that, once you actually reach town, you'll have no further motive to pay off the driver. "Yes," you say. "You're lying," says the driver, and drives off leaving you to die.
If only you weren't so rational!
This is the dilemma of Parfit's Hitchhiker, and the above is the standard resolution according to mainstream philosophy's causal decision theory, which also two-boxes on Newcomb's Problem and defects in the Prisoner's Dilemma. Of course, any self-modifying agent who expects to face such problems - in general, or in particular - will soon self-modify into an agent that doesn't regret its "rationality" so much. So from the perspective of a self-modifying-AI-theorist, classical causal decision theory is a wash. And indeed I've worked out a theory, tentatively labeled "timeless decision theory", which covers these three Newcomblike problems and delivers a first-order answer that is already reflectively consistent, without need to explicitly consider such notions as "precommitment". Unfortunately this "timeless decision theory" would require a long sequence to write up, and it's not my current highest writing priority unless someone offers to let me do a PhD thesis on it.
However, there are some other timeless decision problems for which I do not possess a general theory.
For example, there's a problem introduced to me by Gary Drescher's marvelous Good and Real (OOPS: The below formulation was independently invented by Vladimir Nesov; Drescher's book actually contains a related dilemma in which box B is transparent, and only contains $1M if Omega predicts you will one-box whether B appears full or empty, and Omega has a 1% error rate) which runs as follows:
Suppose Omega (the same superagent from Newcomb's Problem, who is known to be honest about how it poses these sorts of dilemmas) comes to you and says:
"I just flipped a fair coin. I decided, before I flipped the coin, that if it came up heads, I would ask you for $1000. And if it came up tails, I would give you $1,000,000 if and only if I predicted that you would give me $1000 if the coin had come up heads. The coin came up heads - can I have $1000?"
View more: Prev
= 783df68a0f980790206b9ea87794c5b6)
Subscribe to RSS Feed
= f037147d6e6c911a85753b9abdedda8d)