Steve_Rayhawk comments on Towards a New Decision Theory - Less Wrong

50 Post author: Wei_Dai 13 August 2009 05:31AM




Comment author: Steve_Rayhawk 19 August 2009 09:05:42AM * 4 points

. . . do what my creator would want me to do. In other words, upon receiving input X, S computes the following: suppose S's creator had enough time and computing power to create a giant lookup table that contains an optimal output for every input S might encounter, what would the entry for X be? Return that as the output.

I don't know how to define what R "would want" or would think was "optimal".

What lookup table would R create? If R is a causal decision theorist, R might think: "If I were being counterfactually mugged and Omega's coin had come up heads, Omega would have already made its prediction about whether S would output 'give $100' on the input 'tails'. So, if I program S with the rule 'give $100 if tails', that won't cause Omega to give me $10000. And if the coin came up tails, that rule would lose me $100. So I will program S with the rule 'give $0 if tails'."

R's expected utility at the time of coding may be maximized by the rule "give $100 if tails", but R does not decide by that quantity. R decides by the conditional expected utilities given each of Omega's possible past predictions, weighted by R's prior beliefs about those predictions. Both of R's conditional expected utilities are maximized by the decision to program S to output "give $0".
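The arithmetic behind this can be checked with a small sketch. It assumes a fair coin and the payoffs named above ($10000 on heads if Omega predicted that S gives, $100 paid on tails if S gives). The function names and the use of exact fractions are my own illustrative choices, not part of the original setup.

```python
from fractions import Fraction

P_HEADS = Fraction(1, 2)  # assumption: Omega's coin is fair
PRIZE = 10000             # paid on heads if Omega predicted "give $100 if tails"
COST = 100                # paid on tails if S's rule is "give $100 if tails"


def payoff(coin, rule_gives, predicted_gives):
    """R's payoff in one world, with Omega's prediction treated as a separate fact."""
    if coin == "heads":
        return PRIZE if predicted_gives else 0
    return -COST if rule_gives else 0


def expected_utility(rule_gives, predicted_gives):
    """Expected utility over the coin flip for a fixed rule and a fixed prediction."""
    return (P_HEADS * payoff("heads", rule_gives, predicted_gives)
            + (1 - P_HEADS) * payoff("tails", rule_gives, predicted_gives))


def ex_ante_eu(rule_gives):
    """Expected utility at coding time if Omega's prediction tracks the rule."""
    return expected_utility(rule_gives, predicted_gives=rule_gives)


def conditional_eus(predicted_gives):
    """Expected utility of each rule (True = give) with the prediction held fixed."""
    return {rule: expected_utility(rule, predicted_gives) for rule in (True, False)}


def causal_eu(rule_gives, p_predicted_gives):
    """Conditional expected utilities weighted by R's prior over the prediction."""
    return (p_predicted_gives * expected_utility(rule_gives, True)
            + (1 - p_predicted_gives) * expected_utility(rule_gives, False))
```

Under these assumptions, `ex_ante_eu(True)` is 4950 and `ex_ante_eu(False)` is 0, so the rule "give $100 if tails" wins at coding time. With the prediction held fixed, "give $0" comes out ahead by 50 whichever prediction R conditions on (5000 against 4950, or 0 against -50), and therefore also under any prior weighting of the two predictions.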

Comment author: Wei_Dai 19 August 2009 10:19:14AM 1 point

[I deleted my earlier reply, because I was still confused about your questions.]

If, according to R's decision theory, the most preferred choice involves programming S to output "give $0", then that is what S would do.

It might be easier to think of the ideal S as consisting of a giant lookup table created by R itself given infinite time and computing power. An actual S would try to approximate this ideal to the best of its abilities.

How should S decide, from its inputs, which R is the creator with the expected utility S's outputs should be optimal for? Is it the R in the world where Omega's coin came up heads, or the R in the world where Omega's coin came up tails?

R would encode its own decision theory, prior, utility function, and memory at the time of coding into S, and have S optimize for that R.
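As a rough illustration of the lookup-table picture, here is a toy sketch in which R's decision theory, prior, and utility function are bundled into a single scoring function over candidate tables, and the ideal S just reads off the table that R ranks highest. The names `build_table`, `ex_ante_score`, `causal_score`, and `S`, the 0.5 prior, and the restriction to the single input "tails" are all assumptions made for the example, not anything specified in the thread.

```python
from itertools import product

INPUTS = ["tails"]                    # the only input on which S must act here
OUTPUTS = ["give $100", "give $0"]


def build_table(inputs, outputs, score_policy):
    """Enumerate every input->output table and return the one R scores highest."""
    policies = (dict(zip(inputs, choice))
                for choice in product(outputs, repeat=len(inputs)))
    return max(policies, key=score_policy)


def ex_ante_score(policy):
    """An R that evaluates tables as if Omega's prediction tracks the table."""
    gives = policy["tails"] == "give $100"
    return 0.5 * (10000 if gives else 0) + 0.5 * (-100 if gives else 0)


def causal_score(policy, p_predicted_give=0.5):
    """An R that holds Omega's past prediction fixed (hypothetical prior of 0.5)."""
    gives = policy["tails"] == "give $100"
    return 0.5 * 10000 * p_predicted_give + 0.5 * (-100 if gives else 0)


def S(table, x):
    """The ideal S: look up the entry R would have written for input x."""
    return table[x]
```

With these toy scoring functions, `S(build_table(INPUTS, OUTPUTS, ex_ante_score), "tails")` returns "give $100", while the same call with `causal_score` returns "give $0". S itself is the same procedure in both cases; only the R encoded into it differs.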

Comment author: Steve_Rayhawk 19 August 2009 11:27:11AM * 5 points

Sorry. I wasn't trying to ask my questions as questions about how R would make decisions. I was asking questions to try to answer your question about the relationship between exceptionless and timeless decision-making, by pointing out dimensions of a map of ways for R to make decisions. For some of those ways, S would be "timeful" around R's beliefs or time of coding, and for some of those ways S would be less timeful.

I have an intuition that there is a version of reflective consistency which requires R to code S so that, if R were created by another agent Q, S would make decisions using Q's beliefs, even if Q's beliefs were different from R's beliefs (or at least from the beliefs that a Bayesian updater would have had in R's position), and even when S or R had uncertainty about which agent Q was. But I don't know how to formulate that intuition into something that could be proven true or false. (But ultimately, S has to be a creator of its own successor states, and S should use the same theory to describe its relation to its past selves as it uses to describe its relation to R or Q. S's decisions should be invariant to the labeling or unlabeling of its past selves as "creators". These sequential creations are all part of the same computational process.)