A problem with Timeless Decision Theory (TDT)

Gary_Drescher

4th Feb 2010

According to Ingredients of Timeless Decision Theory, when you set up a factored causal graph for TDT, "You treat your choice as determining the result of the logical computation, and hence all instantiations of that computation, and all instantiations of other computations dependent on that logical computation", where "the logical computation" refers to the TDT-prescribed argmax computation (call it C) that takes all your observations of the world (from which you can construct the factored causal graph) as input, and outputs an action in the present situation.

I asked Eliezer to clarify what it means for another logical computation D to be either the same as C, or "dependent on" C, for purposes of the TDT algorithm. Eliezer answered:

For D to depend on C means that if C has various logical outputs, we can infer new logical facts about D's logical output in at least some cases, relative to our current state of non-omniscient logical knowledge. A nice form of this is when supposing that C has a given exact logical output (not yet known to be impossible) enables us to infer D's exact logical output, and this is true for every possible logical output of C. Non-nice forms would be harder to handle in the decision theory but we might perhaps fall back on probability distributions over D.

I replied as follows (which Eliezer suggested I post here).

If that's what TDT means by the logical dependency between Platonic computations, then TDT may have a serious flaw.

Consider the following version of the transparent-boxes scenario. The predictor has an infallible simulator D that predicts whether I one-box here [EDIT: if I see $1M]. The predictor also has a module E that computes whether the ith digit of pi is zero, for some ridiculously large value of i that the predictor randomly selects. I'll be told the value of i, but the best I can do is assign an a priori probability of 0.1 that the specified digit is zero.

The predictor puts $1M in the large box iff (D xor E) is true. (And that's explained to me, of course.)

So let's say I'm confronted with this scenario, and I see $1M in the large box.

The flaw then is that E (as well as D) meets your criterion for "depending on" my decision computation C. I'm initially unsure what C and E output. But if C in fact one-boxes here, then I can infer that E outputs False (or else the large box has to be empty, which it isn't). Similarly, if C in fact two-boxes here, then I can infer that E outputs True. (Or equivalently, a third-party observer could soundly draw either of those inferences.)

So E does indeed "depend on" C, in the particular sense you've specified. Thus, if I happen to have a strong enough preference that E output True, then TDT (as currently formulated) will tell me to two-box for the sake of that goal. But that's the wrong decision, of course. In reality, I have no choice about the specified digit of pi.
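
The inference described above can be checked by brute force. The following is only an illustrative sketch, not part of TDT itself: the names `box_is_full` and `infer_e` are made up for this example, and it assumes (as the scenario stipulates) that D is an infallible simulation of C, so D's output equals C's.

```python
def box_is_full(d_one_boxes: bool, e_digit_is_zero: bool) -> bool:
    """Predictor's rule from the scenario: $1M in the large box iff (D xor E)."""
    return d_one_boxes != e_digit_is_zero


def infer_e(c_one_boxes: bool, observed_full: bool = True) -> set:
    """Return every value of E consistent with C's output and the observed box.

    Assumption: D is infallible, so D's output is the same as C's.
    """
    return {
        e
        for e in (True, False)
        if box_is_full(c_one_boxes, e) == observed_full
    }


# Having seen $1M in the large box:
print(infer_e(c_one_boxes=True))   # {False}: one-boxing lets us infer E is False
print(infer_e(c_one_boxes=False))  # {True}: two-boxing lets us infer E is True
```

In both cases exactly one value of E survives, which is the sense in which E is "inferable from" C's output, even though the digit of pi is fixed regardless of what C does.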

What's going on, it seems to me, is that the kind of logical/Platonic "dependency" that TDT would need to invoke here is this: that E's output be counterfactually entailed by C's output (which it isn't, in this case [see footnote]), rather than (as you've specified) merely inferable from C's output (which indeed it is, in this case). That's bad news, because distinguishing what my action does or does not counterfactually entail (as opposed to what it implies, causes, gives evidence for, etc.) is the original full-blown problem that TDT's prescribed decision-computation is meant to solve. So it may turn out that in order to proceed with that very computation (specifically, in order to ascertain which other Platonic computations "depend on" the decision computation C), you already need to (somehow) know the answer that the computation is trying to provide.

--Gary

[footnote] Because if-counterfactually C were to two-box, then (contrary to fact) the large box would (probably) be empty, circumventing the inference about E.
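
The footnote's counterfactual reading can be sketched in the same style. This is a hedged illustration under stated assumptions: E keeps its a priori probability of 0.1 no matter what C outputs, D tracks C, and the name `p_box_full_if` is hypothetical.

```python
from fractions import Fraction

# A priori probability that the chosen digit of pi is zero (E is True).
P_E_TRUE = Fraction(1, 10)


def p_box_full_if(c_one_boxes: bool) -> Fraction:
    """Probability that the box is full if C (and hence D) had this output.

    Assumption: E is a fixed mathematical fact, so its prior is left
    untouched by the counterfactual; only the box contents vary.
    """
    prior = {True: P_E_TRUE, False: 1 - P_E_TRUE}
    # The box is full iff (D xor E); sum the prior over the values of E
    # for which that holds.
    return sum((p for e, p in prior.items() if c_one_boxes != e), Fraction(0))


print(p_box_full_if(True))   # 9/10
print(p_box_full_if(False))  # 1/10
```

On this reading, two-boxing leaves the box empty with probability 9/10 while the probability that E is True stays at 1/10, so nothing about E is settled by the choice.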

[appendix] In this post, you write:

...reasoning under logical uncertainty using limited computing power... is another huge unsolved open problem of AI. Human mathematicians had this whole elaborate way of believing that the Taniyama Conjecture implied Fermat's Last Theorem at a time when they didn't know whether the Taniyama Conjecture was true or false; and we seem to treat this sort of implication in a rather different way than '2=1 implies FLT', even though the material implication is equally valid.

I don't follow that. The sense of implication in which mathematicians established that TC implies FLT (before knowing if TC was true) is precisely material/logical implication: they showed ~(TC & ~FLT). And similarly, we can prove ~(3SAT-in-P & ~(P=NP)), etc. There's no need here to construct (or magically conjure) a whole alternative inference system for reasoning under logical uncertainty.
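
As a small check of the equivalence used here, the following sketch enumerates the truth table for material implication. The helper name `implies` is introduced only for this illustration.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: p => q."""
    return (not p) or q


# (p => q) agrees with ~(p & ~q) on every row of the truth table.
for p, q in product((True, False), repeat=2):
    assert implies(p, q) == (not (p and not q))

# A false antecedent materially implies anything, as in "2=1 implies FLT".
assert implies(False, True) and implies(False, False)
```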

So if the inference you speak of (when specifying what it means for D to "depend on" C) is the same kind as was used in establishing TC=>FLT, then it's just material implication, which (as argued above) leads TDT to give wrong answers. Or if we substitute counterfactual entailment for material implication, then TDT becomes circular (question-begging). Or if you have in mind some third alternative, I'm afraid I don't understand what it might be.

EDIT: The rules of the original transparent-boxes problem (as specified in Good and Real) are: the predictor conducts a simulation that tentatively presumes there will be $1M in the large box, and then puts $1M in the box (for real) iff that simulation showed one-boxing. Thus, if the large box turns out to be empty, there is no requirement for that to be predictive of the agent's choice under those circumstances. The present variant is the same, except that (D xor E) determines the $1M, instead of just D. (Sorry, I should have said this to begin with, instead of assuming it as background knowledge.)
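
The rules in this EDIT can be written out as a short sketch. The representation is an assumption made for illustration: an agent is modeled as a function from "is the large box full?" to "do I one-box?", and the names `original_fill`, `variant_fill`, and `one_box_only_if_full` are hypothetical.

```python
from typing import Callable

# An agent maps what it sees (True if the large box is full) to its
# choice (True if it one-boxes).
Agent = Callable[[bool], bool]


def original_fill(agent: Agent) -> bool:
    """Original rule: simulate the agent tentatively presuming $1M is there,
    then fill the box iff that simulation one-boxes."""
    d = agent(True)
    return d


def variant_fill(agent: Agent, e: bool) -> bool:
    """Present variant: the same simulation D, but the box is filled iff (D xor E)."""
    d = agent(True)
    return d != e


def one_box_only_if_full(sees_full: bool) -> bool:
    """Example agent: one-boxes when it sees $1M, two-boxes otherwise."""
    return sees_full


print(original_fill(one_box_only_if_full))          # True
print(variant_fill(one_box_only_if_full, e=False))  # True
print(variant_fill(one_box_only_if_full, e=True))   # False
```

Neither rule ever calls `agent(False)`, which reflects the point that an empty box is not required to be predictive of the agent's choice under those circumstances.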

140 Comments

Wei Dai (16y)

First, to clear up a possible confusion, the S in my P is not supposed to be a variable. It's a constant, more specifically a piece of code that implements UDT1 itself. (If I sometimes talk about it as if it's a variable, that's because I'm trying to informally describe what is going on inside the computation that UDT1 does.)

For the more general question of how we know the structure of the world program, the idea is that for an actual AI, we would program it to care about all possible world programs (or more generally, mathematical structures, see example 3 in my UDT1 post, but also Nesov's recent post for a critique). The implementation of UDT1 in the AI would then figure out which world programs it's in by looking at its inputs (which would contain all of the AI's memories and sensory data) and checking which world programs call it with those inputs.

For these sample problems, the assumption is that somehow Omega has previously provided us with enough evidence for us to trust its word on what the structure of the current problem is. So in the actual P, 'S(i, "box contains $1M")' is really something like 'S(memories, omegas_explanations_about_this_problem, i, "box contains $1M")' and these additional inputs allow S to conclude that it's being invoked inside this P, and not some other world program.

(An additional subtlety here is that if we consider all possible world programs, there are bound to be some other world programs where S is being called with these exact same inputs, for example ones where S is being instantiated inside a Boltzmann brain, but presumably those worlds/regions have very low weights, meaning that the AI doesn't care much about them.)

Let me know if that answers your questions/concerns. I didn't answer you point by point because I'm not sure which questions/concerns remain after you see my general answers. Feel free to repeat anything you still want me to answer.

jimrandomh (16y)

"First, to clear up a possible confusion, the S in my P is not supposed to be a variable. It's a constant, more specifically a piece of code that implements UDT1 itself. (If I sometimes talk about it as if it's a variable, that's because I'm trying to informally describe what is going on inside the computation that UDT1 does.)"

Then it should be S(P), because S can't make any decisions without getting to read the problem description.

Vladimir_Nesov (16y)

Note that since our agent is considering possible world-programs, these world-programs are in some sense already part of the agent's program (and the agent is in turn part of some of these world-programs-inside-the-agent, which reflects the recursive character of the definition of the agent-program). The agent is a much better top-level program to consider than all-possible-world-programs, which is even more of a simplification if these world-programs somehow "exist at the same time". When the (prior) definition of the world is seen as already part of the agent, a lot of the ontological confusion goes away.
