1) Construct a full-blown DAG of math and Platonic facts, an account of which mathematical facts make other mathematical facts true, so that we can compute mathematical counterfactuals.

Although I know of no worked-out theory that I find convincing, I believe that counterfactual inference (of the sort that's appropriate to use in the decision computation) makes sense with regard to events in universes characterized by certain kinds of physical laws. But when you speak of mathematical counterfactuals more generally, it's not clear to me that that's even coherent.

Plus, if you did have a general math-counterfactual-solving module, why would you relegate it to the logical-dependency-finding subproblem in TDT, and then return to the original factored causal graph? Instead, why not cast the whole problem as a mathematical abstraction, and then directly ask your math-counterfactual-solving module whether, say, (Platonic) C's one-boxing counterfactually entails (Platonic) $1M? (Then do the argmax over the respective math-counterfactual consequences of C's candidate outputs.)

Wei Dai15y20

I've been reviewing some of this discussion, and noticed that Eliezer hasn't answered the question in your last paragraph. Here is his answer to one of my questions, which is similar to yours. But I'm afraid I still don't have a really good understanding of the answer. In other words, I'm still not really sure why we need all the extra machinery in TDT, when having a general math-counterfactual-solving module (what I called "mathematical intuition module") seems both necessary and sufficient.

I wonder if you, or anyone else, understands this well ... (read more)

1Wei Dai15y

This is basically the approach I took in (what I now call) UDT1.


A problem with Timeless Decision Theory (TDT)

by Gary_Drescher

4th Feb 2010


According to Ingredients of Timeless Decision Theory, when you set up a factored causal graph for TDT, "You treat your choice as determining the result of the logical computation, and hence all instantiations of that computation, and all instantiations of other computations dependent on that logical computation", where "the logical computation" refers to the TDT-prescribed argmax computation (call it C) that takes all your observations of the world (from which you can construct the factored causal graph) as input, and outputs an action in the present situation.

I asked Eliezer to clarify what it means for another logical computation D to be either the same as C, or "dependent on" C, for purposes of the TDT algorithm. Eliezer answered:

For D to depend on C means that if C has various logical outputs, we can infer new logical facts about D's logical output in at least some cases, relative to our current state of non-omniscient logical knowledge. A nice form of this is when supposing that C has a given exact logical output (not yet known to be impossible) enables us to infer D's exact logical output, and this is true for every possible logical output of C. Non-nice forms would be harder to handle in the decision theory but we might perhaps fall back on probability distributions over D.

I replied as follows (which Eliezer suggested I post here).

If that's what TDT means by the logical dependency between Platonic computations, then TDT may have a serious flaw.

Consider the following version of the transparent-boxes scenario. The predictor has an infallible simulator D that predicts whether I one-box here [EDIT: if I see $1M]. The predictor also has a module E that computes whether the ith digit of pi is zero, for some ridiculously large value of i that the predictor randomly selects. I'll be told the value of i, but the best I can do is assign an a priori probability of .1 that the specified digit is zero.

The predictor puts $1M in the large box iff (D xor E) is true. (And that's explained to me, of course.)

So let's say I'm confronted with this scenario, and I see $1M in the large box.

The flaw then is that E (as well as D) meets your criterion for "depending on" my decision computation C. I'm initially unsure what C and E output. But if C in fact one-boxes here, then I can infer that E outputs False (or else the large box has to be empty, which it isn't). Similarly, if C in fact two-boxes here, then I can infer that E outputs True. (Or equivalently, a third-party observer could soundly draw either of those inferences.)

So E does indeed "depend on" C, in the particular sense you've specified. Thus, if I happen to have a strong enough preference that E output True, then TDT (as currently formulated) will tell me to two-box for the sake of that goal. But that's the wrong decision, of course. In reality, I have no choice about the specified digit of pi.

What's going on, it seems to me, is that the kind of logical/Platonic "dependency" that TDT would need to invoke here is this: that E's output be counterfactually entailed by C's output (which it isn't, in this case [see footnote]), rather than (as you've specified) merely inferable from C's output (which indeed it is, in this case). That's bad news, because distinguishing what my action does or does not counterfactually entail (as opposed to what it implies, causes, gives evidence for, etc.) is the original full-blown problem that TDT's prescribed decision-computation is meant to solve. So it may turn out that in order to proceed with that very computation (specifically, in order to ascertain which other Platonic computations "depend on" the decision computation C), you already need to (somehow) know the answer that the computation is trying to provide.

--Gary

[footnote] Because if-counterfactually C were to two-box, then (contrary to fact) the large box would (probably) be empty, circumventing the inference about E.
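The inference that makes E "depend on" C in the quoted sense can be checked by brute force. What follows is only a minimal sketch, assuming D simply mirrors C's output (True meaning one-boxing) and the box is filled iff (D xor E); the function names are illustrative and not part of TDT.

```python
def box_full(c_one_boxes: bool, e_digit_is_zero: bool) -> bool:
    # Predictor's rule from the post: $1M iff (D xor E).
    # Assumption: the infallible simulator D returns exactly what C outputs.
    d = c_one_boxes
    return d != e_digit_is_zero  # "!=" on booleans is xor


def infer_e(c_one_boxes: bool, observed_full: bool) -> set:
    # Material inference: keep every value of E that is consistent with
    # C's supposed output and with what was actually observed in the box.
    return {e for e in (True, False)
            if box_full(c_one_boxes, e) == observed_full}


# I see $1M in the large box.
print(infer_e(True, True))   # supposing C one-boxes
print(infer_e(False, True))  # supposing C two-boxes
```

Under these assumptions each supposition about C leaves exactly one value of E standing, which is all the quoted criterion asks for, even though nothing here lets C change the digit of pi.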

[appendix] In this post, you write:

...reasoning under logical uncertainty using limited computing power... is another huge unsolved open problem of AI. Human mathematicians had this whole elaborate way of believing that the Taniyama Conjecture implied Fermat's Last Theorem at a time when they didn't know whether the Taniyama Conjecture was true or false; and we seem to treat this sort of implication in a rather different way than '2=1 implies FLT', even though the material implication is equally valid.

I don't follow that. The sense of implication in which mathematicians established that TC implies FLT (before knowing if TC was true) is precisely material/logical implication: they showed ~(TC & ~FLT). And similarly, we can prove ~(3SAT-in-P & ~(P=NP)), etc. There's no need here to construct (or magically conjure) a whole alternative inference system for reasoning under logical uncertainty.

So if the inference you speak of (when specifying what it means for D to "depend on" C) is the same kind as was used in establishing TC=>FLT, then it's just material implication, which (as argued above) leads TDT to give wrong answers. Or if we substitute counterfactual entailment for material implication, then TDT becomes circular (question-begging). Or if you have in mind some third alternative, I'm afraid I don't understand what it might be.

EDIT: The rules of the original transparent-boxes problem (as specified in Good and Real) are: the predictor conducts a simulation that tentatively presumes there will be $1M in the large box, and then puts $1M in the box (for real) iff that simulation showed one-boxing. Thus, if the large box turns out to be empty, there is no requirement for that to be predictive of the agent's choice under those circumstances. The present variant is the same, except that (D xor E) determines the $1M, instead of just D. (Sorry, I should have said this to begin with, instead of assuming it as background knowledge.)

Timeless Decision Theory, Decision theory


Mentioned in

15Omega's subcontracting to Alpha


[-]Eliezer Yudkowsky15y90

And this was my reply:

This is an unfinished part of the theory that I've also thought about, though your example puts it very crisply (you might consider posting it to LW?)

My current thoughts on resolution tend to see two main avenues:

1) Construct a full-blown DAG of math and Platonic facts, an account of which mathematical facts make other mathematical facts true, so that we can compute mathematical counterfactuals.

2) Treat differently mathematical knowledge that we learn by genuinely mathematical reasoning and by physical observation. In this case we... (read more)

5Gary_Drescher15y

Perhaps I'm misunderstanding you here, but D and E are Platonic computations. What does it mean to construct a causal DAG among Platonic computations? [EDIT: Ok, I may understand that a little better now; see my edit to my reply to (1).] Such a graph links together general mathematical facts, so the same issues arise as in (1), it seems to me: Do the links correspond to logical inference, or something else? What makes the graph acyclic? Is mathematical causality even coherent? And if you did have a module that can detect (presumably timeless) causal links among Platonic computations, then why not use that module directly to solve your decision problems? Plus I'm not convinced that there's a meaningful distinction between math knowledge that you gain by genuine math reasoning, and math knowledge that you gain by physical observation. Let's say, for instance, that I feed a particular conjecture to an automatic theorem prover, which tells me it's true. Have I then learned that math fact by genuine mathematical reasoning (performed by the physical computer's Platonic abstraction)? Or have I learned it by physical observation (of the physical computer's output), and hence be barred from using that math fact for purposes of TDT's logical-dependency-detection? Presumably the former, right? (Or else TDT will make even worse errors.) But then suppose the predictor has simulated the universe sufficiently to establish that U (the universe's algorithm, including physics and initial conditions) leads to there being $1M in the box in this situation. That's a mathematical fact about U, obtained by (the simulator's) mathematical reasoning. Let's suppose that when the predictor briefs me, the briefing includes mention of this mathematical fact. So even if I keep my eyes closed and never physically see the $1M, I can rely instead on the corresponding mathematically derived fact. (Or more straightforwardly, we can view the universe itself as a computer that's performing mathematic

[-]Eliezer Yudkowsky15y130

Logical uncertainty has always been more difficult to deal with than physical uncertainty; the problem with logical uncertainty is that if you analyze it enough, it goes away. I've never seen any really good treatment of logical uncertainty.

But if we depart from TDT for a moment, then it does seem clear that we need to have causelike nodes corresponding to logical uncertainty in a DAG which describes our probability distribution. There is no other way you can completely observe the state of a calculator sent to Mars and a calculator sent to Venus, and yet remain uncertain of their outcomes yet believe the outcomes are correlated. And if you talk about error-prone calculators, two of which say 17 and one of which says 18, and you deduce that the "Platonic answer" was probably in fact 17, you can see that logical uncertainty behaves in an even more causelike way than this.
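The error-prone calculator example can be worked through with Bayes' rule. This is a sketch under assumptions the comment does not state: only two candidate answers (17 and 18), an equal prior over them, and calculators that independently report the true answer with probability 0.9 and the other candidate otherwise.

```python
from fractions import Fraction


def posterior(readings, candidates, prior, accuracy):
    # Bayes' rule over candidate "Platonic answers".
    # Assumption: each calculator errs independently, and with only two
    # candidates an error always reports the other candidate.
    weights = {}
    for answer in candidates:
        likelihood = Fraction(1)
        for reading in readings:
            likelihood *= accuracy if reading == answer else (1 - accuracy)
        weights[answer] = prior[answer] * likelihood
    total = sum(weights.values())
    return {answer: weight / total for answer, weight in weights.items()}


# Hypothetical numbers: equal prior, 90% per-calculator accuracy.
result = posterior(
    readings=[17, 17, 18],
    candidates=[17, 18],
    prior={17: Fraction(1, 2), 18: Fraction(1, 2)},
    accuracy=Fraction(9, 10),
)
print(result)  # {17: Fraction(9, 10), 18: Fraction(1, 10)}
```

The single latent answer acts like a common cause of the three readings: conditioning on the readings shifts belief about the answer, which is the causelike behavior the comment describes.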

So, going back to TDT, my hope is that there's a neat set of rules for factoring our logical uncertainty in our causal beliefs, and that these same rules also resolve the sort of situation that you describe.

If you consider the notion of the correlated error-prone calculators, two returning 17 and one re... (read more)

4thomblake15y

When you use terms like "draw a hard causal boundary" I'm forced to imagine you're actually drawing these things on the back of a cocktail napkin somewhere using some sorts of standard symbols. Are there such standards, and do you have such diagrams scanned in online somewhere? ETA: A note for future readers: Eliezer below is referring to Judea Pearl (simply "Pearl" doesn't convey much via google-searching, though I suppose "pearl causality" does at the moment)

2Eliezer Yudkowsky15y

Read Pearl. I think his online intros should give you a good idea of what the cocktail napkin looks like.

4thomblake15y

Hmm... Pearl uses a lot of diagrams but they all seem pretty ad-hoc. Just the sorts of arrows and dots and things that you'd use to represent any graph (in the mathematics sense). Should I infer from this description that the answer is, "No, there isn't a standard"? I was picturing something like a legend that would tell someone, "Use a dashed line for a causal boundary, and a red dotted line to represent a logical inference, and a pink squirrel to represent postmodernism"

6Eliezer Yudkowsky15y

Um... I'm not sure there's much I can say to that beyond "Read Probabilistic Reasoning in Intelligent Systems, or Causality". Pearl's system is not ad-hoc. It is very not ad-hoc. It has a metric fuckload of math backing up the simple rules. But Pearl's system does not include logical uncertainty. I'm trying to put logical uncertainty into it, while obeying the rules. This is a work in progress.

8Alicorn15y

I'd just like to register a general approval of specifying that one's imaginary units are metric.

3bgrah44915y

FWIW

7wedrifid15y

Thomblake's observation may be that while Pearl's system is extremely rigorous the diagrams used do not give an authoritative standard style for diagram drawing.

4thomblake15y

That's correct - I was looking for a standard style for diagram drawing.

3cousin_it13y

I'm rereading past discussions to find insights. This jumped out at me: Do you still believe this?

6Vladimir_Nesov13y

Playing chicken with Omega may result in you becoming counterfactual.

2cousin_it13y

Why is cooperation more likely to qualify as "playing chicken" than defection here?

9Vladimir_Nesov13y

I was referring to the example Eliezer gives with your opponent being a DefectBot, in which case cooperating makes Omega's claim false, which may just mean that you'd make your branch of the thought experiment counterfactual, instead of convincing DefectBot to cooperate:

4cousin_it13y

So? That doesn't hurt my utility in reality. I would cooperate because that wins if agent X is correlated with me, and doesn't lose otherwise.

3Vladimir_Nesov13y

Winning is about how alternatives you choose between compare. By cooperating against a same-action DefectBot, you are choosing nonexistence over a (D,D), which is not obviously a neutral choice.

3FAWS13y

I don't think this is how it works. Particular counterfactual instances of you can't influence whether they are counterfactual or exist in some stronger sense. They can only choose whether there are more real instances with identical experiences (and their choices can sometimes acausally influence what happens with real instances, which doesn't seem to be the case here since the real you will choose defect either way as predicted by Omega). Hypothetical instances don't lose anything by being in the branch that chooses the opposite of what the real you chooses unless they value being identical to the real you, which IMO would be silly.

3Vladimir_Nesov13y

What can influence things like that? Whatever property of a situation can mark it as counterfactual (more precisely, given by a contradictory specification, or not following from a preceding construction, assumed-real past state for example), that property could as well be a decision made by an agent present in that situation. There is nothing special about agents or their decisions.

3FAWS13y

Why do you think something can influence it? Whether you choose to cooperate or defect, you can always ask both "what would happen if I cooperated?" and "what would happen if I defected?". In as far as being counterfactual makes sense the alternative to being the answer to "what would happen if I cooperated?" is being the answer to "what would happen if I defected?", even if you know that the real you defects. Compare Omega telling you that your answer will be the same as the Nth digit of Pi. That doesn't allow you to choose the Nth digit of Pi.

0[anonymous]13y

This becomes a (relatively) straightforward matter of working out where the (potentially counterfactual - depending what you choose) calculation is being performed to determine exactly what this 'nonexistence' means. Since this particular thought experiment doesn't seem to specify any other broader context I assert that cooperate is clearly the correct option. Any agent which doesn't cooperate is broken. Basically, if you ever find yourself in this situation then you don't matter. It's your job to play chicken with the universe and not exist so the actual you can win.

4Eliezer Yudkowsky13y

Agent X is a piece of paper with "Defect" written on it. I defect against it. Omega's claim is true and does not imply that I should cooperate.

3ArisKatsaris13y

I don't see this argument making sense. Omega's claim reduces to negligible chances that a choice of Defection will be advantageous for me, because Omega's claim makes it of negligible probability that either (D,C) or (C, D) will be realized. So I can only choose between the worlds of (C, C) and (D, D). Which means that the Cooperation world is advantageous, and that I should Cooperate. In contrast, if Omega had claimed that we'd make the opposite decisions, then I'd only have to choose between the worlds of (D, C) or (C, D) -- with the worlds of (C, C) and (D, D) now having negligible probability. In which case, I should, of course, Defect. The reasons for the correlation between me and Agent X are irrelevant when the fact of their correlation is known.
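The arithmetic behind this argument can be sketched as an expected-payoff comparison. The payoff numbers below are hypothetical (a conventional Prisoner's Dilemma ordering), and `p_same` is an assumed probability that Agent X's move matches mine, standing in for how much weight Omega's claim is given. The sketch encodes only the reasoning of this comment, which other replies in the thread dispute.

```python
from fractions import Fraction

# Hypothetical payoffs to me, keyed by (my move, X's move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def expected_payoff(my_move, p_same):
    # With probability p_same X plays what I play; otherwise the opposite.
    opposite = "D" if my_move == "C" else "C"
    return (p_same * PAYOFF[(my_move, my_move)]
            + (1 - p_same) * PAYOFF[(my_move, opposite)])


def best_move(p_same):
    return max(("C", "D"), key=lambda move: expected_payoff(move, p_same))


print(best_move(Fraction(99, 100)))  # Omega says "same decision"     -> C
print(best_move(Fraction(1, 100)))   # Omega says "opposite decision" -> D
```

With these numbers cooperating yields 2.97 against 1.04 when the moves almost surely match, and defecting yields 4.96 against 0.03 when they almost surely differ.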

2cousin_it13y

Sorry, was this intended as part of the problem statement, like "Omega tells you that agent X is a DefectBot that will play the same as you"? If yes, then ok. But if we don't know what agent X is, then I don't understand why a DefectBot is a priori more probable than a CooperateBot. If they are equally probable, then it cancels out (edit: no it doesn't, it actually makes cooperating a better choice, thx ArisKatsaris). And there's also the case where X is a copy of you, where cooperating does help. So it seems to be a better choice overall.

1Vladimir_Nesov13y

There is also a case where X is an anticopy (performs opposite action), which argues for defecting in the same manner. Edit: This reply is wrong.

3cousin_it13y

No it doesn't. If X is an anticopy, the situation can't be real and your action doesn't matter.

1Vladimir_Nesov13y

Why can't it be real?

3cousin_it13y

Because Omega has told you that X's action is the same as yours.

2Vladimir_Nesov13y

OK.

3Gary_Drescher15y

I agree this sounds intuitive. As I mentioned earlier, though, nailing this down is tantamount to circling back and solving the full-blown problem of (decision-supporting) counterfactual reasoning: the problem of how to distinguish which facts to “hold fixed”, and which to “let vary” for consistency with a counterfactual antecedent. In any event, is the idea to try to build a separate graph for math facts, and use that to analyze “logical dependency” among the Platonic nodes in the original graph, in order to carry out TDT's modified “surgical alteration” of the original graph? Or would you try to build one big graph that encompasses physical and logical facts alike, and then use Pearl's decision procedure without further modification? Wait, isn't it decision-computation C—rather than simulation D—whose “effect” (in the sense of logical consequence) on E we're concerned about here? It's the logical dependents of C that get surgically altered in the graph when C gets surgically altered, right? (I know C and D are logically equivalent, but you're talking about inserting a physical node after D, not C, so I'm a bit confused.) I'm having trouble following the gist of avenue (2) at the moment. Even with the node structure you suggest, we can still infer E from C and from the physical node that matches (D xor E)—unless the new rule prohibits relying on that physical node, which I guess is the idea. But what exactly is the prohibition? Are we forbidden to infer any mathematical fact from any physical indicator of that fact? Or is there something in particular about node (D xor E) that makes it forbidden? (It would be circular to cite the node's dependence on C in the very sense of "dependence" that the new rule is helping us to compute.)

1Eliezer Yudkowsky15y

I definitely want one big graph if I can get it. Sorry, yes, C. No, but whenever we see a physical fact F that depends on a decision C/D we're still in the process of making plus Something Else (E), then we express our uncertainty in the form of a causal graph with directed arrows from C to D, D to F, and E to F. Thus when we compute a counterfactual on C, we find that F changes, but E does not.

2Gary_Drescher15y

Wait, F depends on decision computation C in what sense of “depends on”? It can't quite be the originally defined sense (quoted from your email near the top of the OP), since that defines dependency between Platonic computations, not between a Platonic computation and a physical fact. Do you mean that D depends on C in the original sense, and F in turn depends on D (and on E) in a different sense? Ok, but these arrows can't be used to define the relevant sense of dependency above, since the relevant sense of dependency is what tells us we need to draw the arrows that way, if I understand correctly. Sorry to keep being pedantic about the meaning of “depends”; I know you're in thinking-out-loud mode here. But the theory gives wildly different answers depending (heh) on how that gets pinned down.

1Eliezer Yudkowsky15y

In my view, the chief forms of "dependence" that need to be discriminated are inferential dependence and causal dependence. If earthquakes cause burglar alarms to go off, then we can infer an earthquake from a burglar alarm or infer a burglar alarm from an earthquake. Logical reasoning doesn't have the kind of directionality that causation does - or at least, classical logical reasoning does not - there's no preferred form between ~A->B, ~B->A, and A \/ B. The link between the Platonic decision C and the physical decision D might be different from the link between the physical decision D and the physical observation F, but I don't know of anything in the current theory that calls for treating them differently. They're just directional causal links. On the other hand, if C mathematically implies a decision C-2 somewhere else, that's a logical implication that ought to symmetrically run backward to ~C-2 -> ~C, except of course that we're presumably controlling/evaluating C rather than C-2. Thinking out loud here, the view is that your mathematical uncertainty ought to be in one place, and your physical uncertainty should be built on top of your mathematical uncertainty. The mathematical uncertainty is a logical graph with symmetric inferences, the physical uncertainty is a directed acyclic graph. To form controlling counterfactuals, you update the mathematical uncertainty, including any logical inferences that take place in mathland, and watch it propagate downward into the physical uncertainty. When you've already observed facts that physically depend on mathematical decisions you control but you haven't yet made and hence whose values you don't know, then those observations stay in the causal, directed, acyclic world; when the counterfactual gets evaluated, they get updated in the Pearl, directional way, not the logical, symmetrical inferential way.

2Gary_Drescher15y

No, D was the Platonic simulator. That's why the nature of the C->D dependency is crucial here.

4Eliezer Yudkowsky15y

Okay, then we have a logical link from C-platonic to D-platonic, and causal links descending from C-platonic to C-physical, E-platonic to E-physical, and D-platonic to D-physical to F-physical = D-physical xor E-physical. The idea being that when we counterfactualize on C-platonic, we update D-platonic and its descendents, but not E-platonic or its descendents. I suppose that as written, this requires a rule, "for purposes of computing counterfactuals, keep in the causal graph rather than the logical knowledge base, any mathematical knowledge gained by observing a fact descended from your decision-output or any logical implications of your decision-output". I could hope that this is a special case of something more elegant, but it would only be hope.
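The node structure just described can be sketched as a tiny propagation function. This is an illustration only, assuming boolean-valued nodes and deterministic links; the node names follow the comment, while the function names are made up for the example.

```python
def propagate(c_platonic: bool, e_platonic: bool) -> dict:
    # Logical link: D-platonic is a simulation of C-platonic.
    d_platonic = c_platonic
    # Causal links from each Platonic node down to its physical instance.
    c_physical = c_platonic
    d_physical = d_platonic
    e_physical = e_platonic
    # F-physical = D-physical xor E-physical (is the $1M in the box?).
    f_physical = d_physical != e_physical
    return {
        "C-platonic": c_platonic, "D-platonic": d_platonic,
        "E-platonic": e_platonic, "C-physical": c_physical,
        "D-physical": d_physical, "E-physical": e_physical,
        "F-physical": f_physical,
    }


def counterfactual_on_c(actual: dict, new_c: bool) -> dict:
    # Counterfactualize on C-platonic: recompute D and its descendants,
    # but hold E-platonic (and hence E-physical) at its actual value.
    return propagate(new_c, actual["E-platonic"])


actual = propagate(c_platonic=True, e_platonic=False)  # one-box, digit nonzero
altered = counterfactual_on_c(actual, new_c=False)     # suppose two-boxing
print(actual["F-physical"], altered["F-physical"], altered["E-platonic"])
```

In this sketch the observed F changes under the counterfactual while E stays put, which is the behavior the proposed rule is meant to secure.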

2Gary_Drescher15y

Ok. I think it would be very helpful to sketch, all in one place, what TDT2 (i.e., the envisioned avenue-2 version of TDT) looks like, taking care to pin down any needed sense of "dependency". And similarly for TDT1, the avenue-1 version. (These suggestions may be premature, I realize.)

0lessdazed14y

If X isn't like us, we can't "control" X by making a decision similar to what we would want X to output*. We shouldn't go from being an agent that defects in the prisoner's dilemma with Agent X when told we "make the same decision in the Prisoner's Dilemma as Agent X" to being one that does not defect, just as we do not unilaterally switch from natural to precision bidding when in contract bridge a partner opens with two clubs (which signals a good hand under precision bidding, and not under natural bidding). However, there do exist agents who should cooperate every time they hear they "make the same decision in the Prisoner's Dilemma as Agent X", those who have committed to cooperate in such cases. In some such cases, they are up against pieces of paper on which "cooperate" is written (too bad they didn't have a more discriminating algorithm/clear Omega), in others, they are up against copies of themselves or other agents whose output depends on what Omega tells them. In any case, many agents should cooperate when they hear that. Yes? No? Why shouldn't one be such an agent? Do we know ahead of time that we are likely to be up against pieces of paper with "cooperate" on them, and Omega would unhelpfully tell us we "make the same decision in the Prisoner's Dilemma as Agent X" in all such cases, though if we had a different strategy we could have gotten useful information and defected in that case? *Other cases include us defecting to get X to cooperate, and others where X's play depends on ours, but this is the natural case to use when considering if the Agent X's action depends on ours, a not strategically incompetent Agent X that has a strategy at least as good as always defecting or cooperating and does not try to condition his cooperating on our defecting or the like.

5Gary_Drescher15y

“Makes true” means logically implies? Why would that graph be acyclic? [EDIT: Wait, maybe I see what you mean. If you take a pdf of your beliefs about various mathematical facts, and run Pearl's algorithm, you should be able to construct an acyclic graph.] Although I know of no worked-out theory that I find convincing, I believe that counterfactual inference (of the sort that's appropriate to use in the decision computation) makes sense with regard to events in universes characterized by certain kinds of physical laws. But when you speak of mathematical counterfactuals more generally, it's not clear to me that that's even coherent. Plus, if you did have a general math-counterfactual-solving module, why would you relegate it to the logical-dependency-finding subproblem in TDT, and then return to the original factored causal graph? Instead, why not cast the whole problem as a mathematical abstraction, and then directly ask your math-counterfactual-solving module whether, say, (Platonic) C's one-boxing counterfactually entails (Platonic) $1M? (Then do the argmax over the respective math-counterfactual consequences of C's candidate outputs.)

2Wei Dai15y

1Wei Dai15y

This is basically the approach I took in (what I now call) UDT1.

4Gary_Drescher15y

For now, let me just reply to your incidental concluding point, because that's brief. I disagree that the red/green problem is unsolvable. I'd say the solution is that, with respect to the available information, both choices have equal (low) utility, so it's simply a toss-up. A correct decision algorithm will just flip a coin or whatever. Having done so, will a correct decision algorithm try to revise its choice in light of its (tentative) new knowledge of what its choice is? Only if it has nothing more productive to do with its remaining time.

3Psy-Kosh15y

Actually, one can do even better than that. As (I think), Eliezer implied, the key is Omega saying those words. (about the simulated you getting it wrong) Did the simulated version receive that message too? (if yes, and if we assume Omega is always truthful, this implies an infinite recursion of simulations... let us not go invoking infinite nested computations willy-nilly.) If there was only a single layer of simulation, then Omega either gave that statement as input to it or did not. If yes, Omega is untruthful, which throws pretty much all of the standard reasoning about Omega out the window and we can simply take into account the possibility that Omega is blatantly lying. If Omega is truthful, even to the simulations, then the simulation would not have received that prefix message. In which case you are in a different state than simulated you was. So all you have to do is make the decision opposite to what you would have done if you hadn't heard that particular extra message. This may be guessed by simply one iteration of "I automatically want to guess color1... but wait, simulated me got it wrong, so I'll guess color2 instead" since "actual" you has the knowledge that the previous version of you got it wrong. If Omega lies to simulations and tells truth to "actuals" (and can somehow simulate without the simulation being conscious, so there's no ambiguity about which you are, yet still be accurate... (am skeptical but confused on that point)), then we have an issue. But then it would require Omega to take a risk: if when telling the lie to the simulation, the simulation then gets it right, then what does Omega tell "actual" you? ("actual" in quotes because I honestly don't know whether or not one could be modeled with sufficient accuracy, however indirectly, without the model being conscious. I'm actually kind of skeptical of the prospect of a perfectly accurate model not being conscious, although a model that can determine some properties/approximations of

4Eliezer Yudkowsky15y

Omega can use the following algorithm: "Simulate telling the human that they got the answer wrong. If in this case they get the answer wrong, actually tell them that they get the answer wrong. Otherwise say nothing." This ought to make it relatively easy for Omega to truthfully put you in a "you're screwed" situation a fair amount of the time. Albeit, if you know that this is Omega's procedure, the rest of the time you should figure out what you would have done if Omega said "you're wrong" and then do that. This kind of thinking is, I think, outside the domain of current TDT, because it involves strategies that depend on actions you would have taken in counterfactual branches. I think it may even be outside the domain of current UDT for the same reason.

2Wei Dai15y

I don't see why this is outside of UDT's domain. It seems straightforward to model and solve the decision problem in UDT1. Here's the world program:

    def P(color):
        outcome = "die"
        if Omega_Predict(S, "you're wrong") == color:
            if S("") == color:
                outcome = "live"
        else:
            if S("you're wrong") == color:
                outcome = "live"

Assuming a preference to maximize the occurrence of outcome="live" averaged over P("green") and P("red"), UDT1 would conclude that the optimal S returns a constant, either "green" or "red", and do that.

BTW, do you find this "world program" style analysis useful? I don't want to over-do them and get people annoyed. (I refrained from doing this for the problem described in Gary's post, since it doesn't mention UDT at all, and therefore I'm assuming you want to find a TDT-only solution.)
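The claim that the optimal S is constant can be checked by enumerating every strategy. The sketch below makes two assumptions not stated in the comment: a strategy is represented as a lookup table from Omega's message to a color, and Omega's prediction is simply a perfect evaluation of that table. The names `omega_predict`, `world`, and `survival_count` are illustrative.

```python
from itertools import product

COLORS = ("green", "red")
MESSAGES = ("", "you're wrong")


def omega_predict(strategy, message):
    # Assumption: Omega predicts perfectly, i.e. it just evaluates S.
    return strategy[message]


def world(strategy, color):
    # Runnable transcription of the world program P(color) above.
    outcome = "die"
    if omega_predict(strategy, "you're wrong") == color:
        # Simulated answer was right, so Omega says nothing.
        if strategy[""] == color:
            outcome = "live"
    else:
        # Simulated answer was wrong, so Omega says "you're wrong".
        if strategy["you're wrong"] == color:
            outcome = "live"
    return outcome


def survival_count(strategy):
    # Number of worlds (one per correct color) in which the agent lives.
    return sum(world(strategy, color) == "live" for color in COLORS)


strategies = [dict(zip(MESSAGES, choice))
              for choice in product(COLORS, repeat=len(MESSAGES))]
best = max(survival_count(s) for s in strategies)
optimal = [s for s in strategies if survival_count(s) == best]
print(best, optimal)
```

Under these assumptions the two constant strategies each survive in one of the two worlds, and the two message-dependent strategies survive in neither.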

2Gary_Drescher15y

Yes, I was focusing on a specific difficulty in TDT, but I certainly have no objection to bringing UDT into the thread too. (I myself haven't yet gotten around to giving UDT the attention I think it deserves.)

0JGWeissman15y

The world program I would use to model this scenario is:

    def P(color):
        if Omega_Predict(S, "you're wrong") == color:
            outcome = "die"
        else:
            outcome = "live"

The else branch seems unreachable, given color = S("you're wrong") and the usual assumptions about Omega.

I don't understand what your nested if statements are modeling.

1Wei Dai15y

I was modeling what Eliezer wrote in the comment that I was responding to: BTW, if you add a tab in front of each line of your program listing, it will get formatted correctly.

1JGWeissman15y

Ah, I see. Then it seems that you are really solving the problem of minimizing the probability that Omega presents this problem in the first place. What about the scenario, where Omega uses the strategy: Simulate telling the human that they got the answer wrong. Define the resulting answer as wrong, and the other as right. This is what I modeled. Thanks. Is there an easier way to get a tab into the comment input box than copy paste from an outside editor?

1Wei Dai15y

In that case it should be modeled like this:

    def P(color):
        wrong_color = Omega_Predict(S, "you're wrong")
        if S("you're wrong") == wrong_color:
            outcome = "die"
        else:
            outcome = "live"

Not that I'm aware of.

3Tyrrell_McAllister15y

Are you guys talking about getting code to indent properly? You can do that by typing four spaces in front of each line. Each quadruple of spaces produces a further indentation. http://daringfireball.net/projects/markdown/syntax#precode

2Wei Dai15y

Spaces? Think of the wasted negentropy! I say we make tab the official Less Wrong indention symbol, and kick out anyone who disagrees. Who's with me? :-)

0JGWeissman15y

Hm, I think the difference in our model programs indicates something that I don't understand about UDT, like a wrong assumption that justified an optimization. But it seems they both produce the same result for P(S("you're wrong")), which is outcome="die" for all S. Do you agree that this problem is, and should remain, unsolvable? (I understand "should remain unsolvable" to mean that any supposed solution must represent some sort of confusion about the problem.)

0Wei Dai15y

The input to P is supposed to contain the physical randomness in the problem, so P(S("you're wrong")) doesn't make sense to me. The idea is that both P("green") and P("red") get run, and we can think of them as different universes in a multiverse. Actually in this case I should have written "def P():" since there is no random correct color. I'm not quite sure what you mean here, but in general I suggest just translating the decision problem directly into a world program without trying to optimize it. No, like I said, it seems pretty straightforward to solve in UDT. It's just that even in the optimal solution you still die.

0JGWeissman15y

Ok, now I understood why you wrote your program the way you did. By solve, I meant find a way to win. I think that after getting past different word use, we agree on the nature of the problem.

0Psy-Kosh15y

Fair enough. I'm not sure the algorithm you describe here is necessarily outside current TDT though. The counterfactual still corresponds to an actual thing Omega simulated. It'd be more like this: Omega did not add the "you are wrong" prefix. Therefore, conditioning on the idea that Omega always tries simulating with that prefix and only states the prefix if I (or whoever Omega is offering the challenge to) was wrong in that simulation, the simulation in question then did not produce the wrong answer. Therefore a sufficient property for a good answer (one with higher expected utility) is that it should have the same output as that simulation. Therefore determine what that output was... i.e., TDT shouldn't have much more problem (in principle) with that than with being told that it needs to guess the Nth digit of Pi. If possible, it would simply compute the Nth digit of Pi. In this case, it has to simply compute the outcome of a certain different algorithm which happens to be equivalent to its own decision algorithm when faced with a certain situation. I don't THINK this would be inherently outside of current TDT as I understand it. I may be completely wrong on this, but that's the way it seems to me. As far as stuff like the problem in the OP, I suspect though that the Right Way for dealing with things analogous to counterfactual mugging (and extended to the problem in the OP) and such amounts to a very general precommitment... Or a retroactive precommitment. My thinking here is rather fuzzy. I do suspect though that the Right Way probably looks something like the TDT, in advance, doing a very general precommitment to be the sort of being that tends to have high expected utility when faced with counterfactual muggers and whatnot... (Or retroactively deciding to be the sort of being that effectively has the logical implication of being mathematically "precommitted" to be such.)

2Eliezer Yudkowsky15y

By "unsolvable" I mean that you're screwed over in final outcomes, not that TDT fails to have an output. The interesting part of the problem is that, whatever you decide, you deduce facts about the background such that you know that what you are doing is the wrong thing. However, if you do anything differently, you would have to make a different deduction about the background facts, and again know that what you were doing was the wrong thing. Since we don't believe that our decision is capable of affecting the background facts, the background facts ought to be a fixed constant, and we should be able to alter our decision without affecting the background facts... however, as soon as we do so, our inference about the unalterable background facts changes. It's not 100% clear how to square this with TDT.

4Unknowns15y

This is like trying to decide whether this statement is true: "You will decide that this statement is false." There is nothing paradoxical about this statement. It is either true or false. The only problem is that you can't get it right.

1gregconen15y

Actually, there is an optimal solution to this dilemma. Rather than use any internal process to decide, using a truly random process gives a 50% chance of survival. If you base your decision on a quantum randomness source, in principle no simulation can predict your choice (or rather, a complete simulation would correctly predict you fail in 50% of possible worlds). Knowing how to use randomness against an intelligent adversary is important.
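One way to make the 50% figure concrete is the following rough sketch. It assumes a simplified adversarial model that is not spelled out in the thread: Omega knows the probability with which you pick each color, cannot foresee the outcome of a truly random draw, and fixes the "right" color in advance to be the one you are least likely to name. The function names are hypothetical.

```python
from fractions import Fraction

COLORS = ("green", "red")


def adversarial_omega(choice_dist):
    # Assumption: Omega declares the least likely color to be the right one.
    return min(COLORS, key=lambda c: choice_dist.get(c, Fraction(0)))


def worst_case_survival(choice_dist):
    # Probability of naming the color Omega fixed as right.
    right_color = adversarial_omega(choice_dist)
    return choice_dist.get(right_color, Fraction(0))


deterministic = {"green": Fraction(1)}
biased = {"green": Fraction(3, 4), "red": Fraction(1, 4)}
fair_coin = {"green": Fraction(1, 2), "red": Fraction(1, 2)}

for name, dist in [("deterministic", deterministic),
                   ("biased", biased),
                   ("fair coin", fair_coin)]:
    print(name, worst_case_survival(dist))
```

In this toy model the fair coin is the maximin choice: any bias hands Omega a color you name less than half the time.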

0loqi15y

Gary postulated an infallible simulator, which presumably includes your entire initial state and all pseudorandom algorithms you might run. Known quantum randomness methods can only amplify existing entropy, not manufacture it ab initio. So you have no recourse to coinflips. EDIT: Oops! pengvado is right. I was thinking of the case discussed here, where the random bits are provided by some quantum black box.

4pengvado15y

Quantum coinflips work even if Omega can predict them. It's like a branch-both-ways instruction. Just measure some quantum variable, then measure a noncommuting variable, and voila, you've been split into two or more branches that observe different results and thus can perform different strategies. Omega's perfect predictor tells it that you will do both strategies, each with half of your original measure. There is no arrangement of atoms (encoding the right answer) that Omega can choose in advance that would make both of you wrong.

3wedrifid15y

I agree, and for this reason whenever I make descriptions I make Omega's response to quantum smart-asses and other randomisers explicit and negative.

5gregconen15y

If Omega wants to smack down the use of randomness, I can't stop it. But there are a number of game theoretic situations where the optimal response is random play, and any decision theory that can't respond correctly is broken.

0wedrifid15y

Does putting the 'quantum' in a black box change anything?

0loqi15y

Not sure I know which question you're asking:

1. A black box RNG is still useless despite being based on a quantum mechanism, or
2. That a quantum device will necessarily manufacture random bits.

Counterexamples to 2 are pretty straightforward (quantum computers), so I'm assuming you mean 1. I'm operating at the edge of my knowledge here (as my original mistake shows), but I think the entire point of Pironio et al's paper was that you can verify random bits obtained from an adversary, subject to the conditions:

* Bell inequality violations are observable (i.e., it's a quantum generator).
* The adversary can't predict your measurement strategy.

Am I misunderstanding something?

0Gary_Drescher15y

Oh ok. So it's unsolvable in the same sense that "Choose red or green. Then I'll shoot you." is unsolvable. Sometimes choice really is futile. :) [EDIT: Oops, I probably misunderstood what you're referring to by "screwed over".] Yes, assuming that you're the sort of algorithm that can (without inconsistency) know its own choice here before the choice is executed. If you're the sort of algorithm that may revise its intended action in response to the updated deduction, and if you have enough time left to perform the updated deduction, then the (previously) intended action may not be reliable evidence of what you will actually do, so it fails to provide sound reason for the update in the first place.

3SilasBarta15y

If mathematical truths were drawn in a DAG, it's unclear how counterfactuals would work. Since math is consistent, then, by the principle of explosion, the inversion of any statement makes all statements true. The counterfactual graph would therefore be completely uninformative. Or, perhaps, it would just generate another system of math. But then you have to know the inferential relationship between that new math and the rest of the world.

2IlyaShpitser15y

I don't see how logical entailment acts as functional causal dependence in Pearl's account of causation. Can you explain?

2Eliezer Yudkowsky15y

Pearl's account doesn't include logical uncertainty at all so far as I know, but I made my case here http://lesswrong.com/lw/15z/ingredients_of_timeless_decision_theory/ that Pearl's account has to be modified to include logical uncertainty on purely epistemic grounds, never mind decision theory. If this isn't what you're asking about then please further clarify the question?

7IlyaShpitser15y

Treating same inputs on duplicate functions also arises in the treatment of counterfactuals (since one duplicates the causal graph across worlds of interest). The treatment I am familiar with is systematic merges of portions of the counterfactual graph which can be proved to be the same. I don't really understand why this issue is about logic (rather than about duplication). What was confusing me, however, was the remark that it is possible to create causal graphs of mathematical facts (presumably with entailment functioning as a causal relationship between facts). I really don't see how this can be done. In particular the result is highly cyclic, infinite for most interesting theories, and it is not clear how to define interventions on such graphs in a satisfactory way.

0[anonymous]15y

I was going to suggest (2) myself, but then I realized that it seems to follow directly from your definition of "dependent on", so you must have thought of it yourself:

[-]JGWeissman15y40

I think this problem is based (at least in part) on an incoherence in the basic transparent box variant of Newcomb's problem.

Suppose the subject of the problem will two-box if he sees the big box has the million dollars, but will one-box if he sees the big box is empty. Then there is no action Omega could take to satisfy the conditions of the problem.

In this variant that introduces the digit of pi, there is an unknown bit such that whatever strategy the subject takes, there is a value of that bit that allows Omega an action consistent with the conditions. Howev... (read more)

5Gary_Drescher15y

The rules of the transparent-boxes problem (as specified in Good and Real) are: the predictor conducts a simulation that tentatively presumes there will be $1M in the large box, and then puts $1M in the box (for real) iff the simulation showed one-boxing. So the subject you describe gets an empty box and one-boxes, but that doesn't violate the conditions of the problem, which do not require the empty box to be predictive of the subject's choice.

0JGWeissman15y

I drew a causal graph of this scenario (with the clarification you just provided), and in order to see the problem with TDT you describe, I would have to follow a causation arrow backwards, like in Evidential Decision Theory, which I don't think is how TDT handles counterfactuals.

1Gary_Drescher15y

The backward link isn't causal. It's a logical/Platonic-dependency link, which is indeed how TDT handles counterfactuals (i.e., how it handles the propagation of "surgical alterations" to the decision node C).

0JGWeissman15y

My understanding of the link in question, is that the logical value of the digit of pi causes Omega to take the physical action of putting the money in the box. See Eliezer's second approach:

0[anonymous]15y

My original post addressed Eliezer's original specification of TDT's sense of "logical dependency", as quoted in the post. I don't think his two proposals for revising TDT are pinned down enough yet to be able to tell what the revised TDTs would decide in any particular scenario. Or at least, my own understanding of the proposals isn't pinned down enough yet. :)

0JGWeissman15y

Ah, I was working from different assumptions. That at least takes care of the basic clear box variant. I will have to think about the digit of pi variation again with this specification.

2Eliezer Yudkowsky15y

In this case the paradox lies within having made a false statement about Omega, not about TDT. In other words, it's not a problem with the decision theory, but a problem with what we supposedly believe about Omega. But yes, whenever you suppose that the agent can observe an effect of its decision before making that decision, there must be given a consistent account of how Omega simulates possible versions of you that see different versions of your own decision, and on that basis selects at least one consistent version to show you. In general, I think, maximizing may require choosing among possible strategies for sets of conditional responses. And this indeed intersects with some of the open issues in TDT and UDT. This is what I was alluding to by saying, "The exact details here will depend on how I believe the simulator chose to tell me this".

0JGWeissman15y

Yes, that is what I meant. In considering this problem, I was wondering if it had to do with the directions of arrows on the causal graph, or a distinction between the relationships directly represented in the graph and those that can be derived by reasoning about the graph, but this false statement about Omega is getting in my way of investigating this.

[-]LauraABJ15y40

I'm not clear at all what the problem is, but it seems to be semantic. It's disturbing that this post can get 17 upvotes with almost no (2?) comments actually referring to what you're saying, indicating that no one else here really gets the point either.

It seems you have an issue with the word 'dependent' and the definition that Eliezer provided. Under that definition, E (the ith digit of pi) would be dependent on C (our decision to one or two box) if we two-boxed and got a million dollars, because then we would know that E = 0, and we would not have kno... (read more)

5Gary_Drescher15y

Sorry, the above post omits some background information. If E "depends on" C in the particular sense defined, then the TDT algorithm mandates that when you "surgically alter" the output of C in the factored causal graph, you must then correspondingly surgically alter the output of E in the graph. So it's not at all a matter of any intuitive connotation of "depends on". Rather, "depends on", in this context, is purely a technical term that designates a particular test that the TDT algorithm performs. And the algorithm's prescribed use of that test culminates in the algorithm making the wrong decision in the case described above (namely, it tells me to two-box when I should one-box).

2LauraABJ15y

No, I still don't get why adding in the ith digit of pi clause changes Newcomb's problem at all. If omega says you'll one-box and you two-box then omega was wrong, plain and simple. The ith digit of pi is an independent clause. I don't see how one's desire to make i=0 by two-boxing after already getting the million is any different than one wanting to make omega wrong by two-boxing after getting the million. If you are the type of person who, after getting the million, thinks, "Gee, I want i=0! I'll two-box!" then omega wouldn't have given you the million to begin with. After determining that he would not give you the million, he'd look at the ith digit of pi and either put the million in or not. You two-boxing has nothing to do with i.

1Gary_Drescher15y

If D=false and E=true and there's $1M in the box and I two-box, then (in the particular Newcomb's variant described above) the predictor is not wrong. The predictor correctly computed that (D xor E) is true, and set up the box accordingly, as the rules of this particular variant prescribe.

0LauraABJ15y

Yes- but your two-boxing didn't cause i=0, rather the million was there because i=0. I'm saying that if (D or E) = true and you get a million dollars, and you two-box, then you haven't caused E=0. E=0 before you two boxed, or if it did not, then omega was wrong and thought D = onebox, when in fact you are a two-boxer.

2Gary_Drescher15y

Everything you just said is true.* Everything you just said is also consistent with everything I said in my original post. *Except for one typo: you wrote (D or E) instead of (D xor E).

2whpearson15y

I'm in the same confused camp as Laura. This paragraph confuses me. Why is it the wrong decision? If Omega can perfectly predict the TDT and TDT sees 1 million dollars, then the TDT must be in a world in which the ith digit of pi is 0. It is an unlikely world, to be sure.

2Gary_Drescher15y

Actually, you're in a different camp than Laura: she agrees that it's incorrect to two-box regardless of any preference you have about the specified digit of pi. :) The easiest way to see why two-boxing is wrong is to imagine a large number of trials, with a different chooser, and a different value of i, for each trial. Suppose each chooser strongly prefers that their trial's particular digit of pi be zero. The proportion of two-boxer simulations that end up with the digit equal to zero is no different than the proportion of one-boxer simulations that end up with the digit equal to zero (both are approximately .1). But the proportion of the one-boxer simulations that end up with an actual $1M is much higher (.9) than the proportion of two-boxer simulations that end up with an actual $1M (.1).
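The proportions quoted above can be reproduced with a small sketch. It assumes that, across trials, the relevant digit of pi is uniformly distributed over 0 through 9, and that the predictor fills the box iff (D xor E), where D is "the chooser one-boxes when shown $1M" and E is "the digit is zero". The function names are made up for illustration.

```python
from fractions import Fraction

DIGITS = range(10)  # assumption: each digit value is equally likely across trials


def box_is_full(one_boxes_on_full, digit):
    D = one_boxes_on_full      # result of the simulation that presumes $1M in the box
    E = (digit == 0)           # the chooser's preferred outcome for the digit
    return D != E              # boolean xor


def proportions(one_boxes_on_full):
    n = len(DIGITS)
    digit_is_zero = Fraction(sum(d == 0 for d in DIGITS), n)
    gets_million = Fraction(
        sum(box_is_full(one_boxes_on_full, d) for d in DIGITS), n
    )
    return digit_is_zero, gets_million


print("one-boxers:", proportions(True))
print("two-boxers:", proportions(False))
```

Both kinds of chooser see a zero digit in 1/10 of trials, but the one-boxers find an actual $1M in 9/10 of trials against 1/10 for the two-boxers.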

0Tyrrell_McAllister15y

But the proportion of two-boxers that saw $1M in the box that end up with their digit being 0 and with the $1M is even higher (1). I already saw the $1M, so, by two-boxing, aren't I just choosing to be one of those who see their E module output True?

2Gary_Drescher15y

Not if a counterfactual consequence of two-boxing is that the large box (probably) would be empty (even though in fact it is not empty, as you can already see). That's the same question that comes up in the original transparent-boxes problem, of course. We probably shouldn't try to recap that whole debate in the middle of this thread. :)

0Tyrrell_McAllister15y

Don't worry; I don't want to do that :). If I recall the original transparent-boxes problem correctly, I agree with you on what to do in that case. Just to check my memory, in the original problem, there are two transparent boxes, A and B. You see that A contains $1M and B contains $1000. You know that B necessarily contains $1000, but A would have contained $1M iff it were the case that you will decide to take only A. Otherwise, A would have been empty. The conclusion (with which I agree) is that you should take only A. Is that right? (If I'm misremembering something crucial, is there a link to the full description online?) [ETA: I see that you added a description to your post. My recollection above seems to be consistent with your description.] In the original problem, if we use the "many choosers" heuristic, there are no choosers who two-box and yet who get the $1M. Therefore, you cannot "choose to be" one of them. This is why two-boxing should have no appeal to you. In contrast, in your new problem, there are two-boxers who get the $1M and who get their E module to output True. So you can "choose to be" one of them, no? And since they're the biggest winners, that's what you should do, isn't it?

0whpearson15y

Have some Omega thought experiments been one shot, never to be repeated type deals or is my memory incorrect? Yes I wasn't thinking through what would happen when the ith digit wasn't 0. You can't switch to one boxing in that case because you don't know when that would be, or rather when you see an empty box you are forced to do the same as when you see a full box due to the way the game is set up.

1Gary_Drescher15y

Yes, and that's the intent in this example as well. Still, it can be useful to look at the expected distribution of outcomes over a large enough number of trials that have the same structure, in order to infer the (counterfactual) probabilities that apply to a single trial.

0wedrifid15y

Yes, they have. And most can be formulated as such as long as p(Omega is honest) is given as 'high' somewhere.

[-]Wei Dai15y30

In UDT1, I would model this problem using the following world program. (For those not familiar with programming convention, 0=False, and 1=True.)

def P(i):
    E = (Pi(i) == 0)
    D = Omega_Predict(S, i, "box contains $1M")
    if D ^ E:
        C = S(i, "box contains $1M")
        payout = 1001000 - C * 1000 + E * 1e9
    else:
        C = S(i, "box is empty")
        payout = 1000 - C * 1000 + E * 1e9

We then ask, what function S maximizes the expected payout at the end of P? When S sees "box is empty" clearly it ... (read more)
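As a rough check of the world program above, here is a runnable sketch that scores the four strategies that ignore i. Several things in it are assumptions rather than part of the original program: Omega_Predict is modeled as simply running S, Pi(i) is replaced by a uniform average over the ten possible digit values, S is passed as an explicit argument, and the integer 10**9 stands in for 1e9 so that exact fractions can be used. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

FULL = "box contains $1M"
EMPTY = "box is empty"


def make_S(on_full, on_empty):
    # Returns 1 for one-boxing and 0 for two-boxing, as in the payout formula.
    def S(i, observation):
        return on_full if observation == FULL else on_empty
    return S


def omega_predict(S, i, observation):
    # Assumption: a perfect predictor just runs S.
    return S(i, observation)


def P(S, digit, i=0):
    E = int(digit == 0)  # stands in for Pi(i) == 0
    D = omega_predict(S, i, FULL)
    if D ^ E:
        C = S(i, FULL)
        payout = 1001000 - C * 1000 + E * 10**9
    else:
        C = S(i, EMPTY)
        payout = 1000 - C * 1000 + E * 10**9
    return payout


def expected_payout(S):
    # Uniform average over the ten possible digit values.
    return sum(Fraction(P(S, d), 10) for d in range(10))


def best_strategy():
    candidates = list(product((0, 1), repeat=2))  # (on_full, on_empty)
    return max(candidates, key=lambda c: expected_payout(make_S(*c)))


for on_full, on_empty in product((0, 1), repeat=2):
    print((on_full, on_empty), expected_payout(make_S(on_full, on_empty)))
print("best:", best_strategy())
```

Under these assumptions the best of the four strategies is to one-box on seeing $1M and two-box on seeing an empty box. The E * 10**9 term adds the same amount to every strategy, since E does not depend on S.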

3Chris_Leong7y

I can't follow the payouts here. For example: 1001000 - C * 1000 + E * 1e9, seems to indicate that the payout could be over $2 million. How is that possible?

3Wei Dai7y

The "E * 1e9" (note that 1e9 is a billion) part is supposed to model "Thus, if I happen to have a strong enough preference that E output True". Does that help?

1Chris_Leong7y

Ah, thanks, that makes sense now!

3Gary_Drescher15y

That's very elegant! But the trick here, it seems to me, lies in the rules for setting up the world program in the first place. First, the world-program's calling tree should match the structure of TDT's graph, or at least match the graph's (physically-)causal links. The physically-causal part of the structure tends to be uncontroversial, so (for present purposes) I'm ok with just stipulating the physical structure for a given problem. But then there's the choice to use the same variable S in multiple places in the code. That corresponds to a choice (in TDT) to splice in a logical-dependency link from the Platonic decision-computation node to other Platonic nodes. In both theories, we need to be precise about the criteria for this dependency. Otherwise, the sense of dependency you're invoking might turn out to be wrong (it makes the theory prescribe incorrect decisions) or question-begging (it implicitly presupposes an answer to the key question that the theory itself is supposed to figure out for us, namely what things are or are not counterfactual consequences of the decision-computation). So the question, in UDT1, is: under what circumstances do you represent two real-world computations as being tied together via the same variable in a world-program? That's perhaps straightforward if S is implemented by literally the same physical state in multiple places. But as you acknowledge, you might instead have distinct Si's that diverge from one another for some inputs (though not for the actual input in this case). And the different instances need not have the same physical substrate, or even use the same algorithm, as long as they give the same answers when the relevant inputs are the same, for some mapping between the inputs and between the outputs of the two Si's. So there's quite a bit of latitude as to whether to construe two computations as "logically equivalent". So, for example, for the conventional transparent-boxes problem, what principle tells us to form

0Wei Dai15y

First, to clear up a possible confusion, the S in my P is not supposed to be a variable. It's a constant, more specifically a piece of code that implements UDT1 itself. (If I sometimes talk about it as if it's a variable, that's because I'm trying to informally describe what is going on inside the computation that UDT1 does.) For the more general question of how do we know the structure of the world program, the idea is that for an actual AI, we would program it to care about all possible world programs (or more generally, mathematical structures, see example 3 in my UDT1 post, but also Nesov's recent post for a critique). The implementation of UDT1 in the AI would then figure out which world programs it's in by looking at its inputs (which would contain all of the AI's memories and sensory data) and checking which world programs call it with those inputs. For these sample problems, the assumption is that somehow Omega has previously provided us with enough evidence for us to trust its word on what the structure of the current problem is. So in the actual P, 'S(i, "box contains $1M")' is really something like 'S(memories, omegas_explanations_about_this_problem, i, "box contains $1M")' and these additional inputs allow S to conclude that it's being invoked inside this P, and not some other world program. (An additional subtlety here is that if we consider all possible world programs, there are bound to be some other world programs where S is being called with these exact same inputs, for example ones where S is being instantiated inside a Boltzmann brain, but presumably those worlds/regions have very low weights, meaning that the AI doesn't care much about them.) Let me know if that answers your questions/concerns. I didn't answer you point by point because I'm not sure which questions/concerns remain after you see my general answers. Feel free to repeat anything you still want me to answer.

0jimrandomh15y

Then it should be S(P), because S can't make any decisions without getting to read the problem description.

0Vladimir_Nesov15y

Note that since our agent is considering possible world-programs, these world-programs are in some sense already part of the agent's program (and the agent is in turn part of some of these world-programs-inside-the-agent, which reflects recursive character of the definition of the agent-program). The agent is a much better top-level program to consider than all-possible-world-programs, which is even more of a simplification if these world-programs somehow "exist at the same time". When the (prior) definition of the world is seen as already part of the agent, a lot of the ontological confusion goes away.

2[anonymous]15y

    def P1(i):
        const S1;
        E = (Pi(i) == 0)
        D = Omega_Predict(S1, i, "box contains $1M")
        if D ^ E:
            C = S(i, "box contains $1M")
            payout = 1001000 - C * 1000
        else:
            C = S(i, "box is empty")
            payout = 1000 - C * 1000

(along with a similar program P2 that uses constant S2, yielding a different output from Omega_Predict)?

This alternative formulation ends up telling us to two-box. In this formulation, if S and S1 (or S and S2) are in fact the same, they would (counterfactually) differ if a different answer (than the actual one) were output from S, which is precisely what a causalist asserts. (A similar issue arises when deciding what facts to model as "inputs" to S, thus forbidding S to "know" those facts for purposes of figuring out the counterfactual dependencies, and what facts to build instead into the structure of the world-program, or to just leave as implicit background knowledge.)

So my concern is that UDT1 may covertly beg the question by selecting, among the possible formulations of the world-program, a version that turns out to presuppose an answer to the very question that UDT1 is intended to figure out for us (namely, what counterfactually depends on the decision-computation). And although I agree that the formulation you've selected in this example is correct and the above alternative formulation isn't, I think it remains to explain why.

(As with my comments about TDT, my remarks about UDT1 are under the blanket caveat that my grasp of the intended content of the theories is still tentative, so my criticisms may just reflect a misunderstanding on my part.)

0jimrandomh15y

It seems to me that the world-program is part of the problem description, not the analysis. It's equally tricky whether it's given in English or in a computer program; Wei Dai just translated it faithfully, preserving the strange properties it had to begin with.

1Gary_Drescher15y

My concern is that there may be several world-programs that correspond faithfully to a given problem description, but that correspond to different analyses, yielding different decision prescriptions, as illustrated by the P1 example above. (Upon further consideration, I should probably modify P1 to include "S()=S1()" as an additional input to S and to Omega_Predict, duly reflecting that aspect of the problem description.)

4jimrandomh15y

If there are multiple translations, then either the translations are all mathematically equivalent, in the sense that they agree on the output for every combination of inputs, or the problem is underspecified. (This seems like it ought to be the definition for the word underspecified. It's also worth noting that all game-theory problems are underspecified in this sense, since they contain an opponent you know little about.) Now, if two world programs were mathematically equivalent but a decision theory gave them different answers, then that would be a serious problem with the decision theory. And this does, in fact, happen with some decision theories; in particular, it happens to theories that work by trying to decompose the world program into parts, when those parts are related in a way that the decision theory doesn't know how to handle. If you treat the world-program as an opaque object, though, then all mathematically equivalent formulations of it should give the same answer.

1Gary_Drescher15y

I assume (please correct me if I'm mistaken) that you're referring to the payout-value as the output of the world program. In that case, a P-style program and a P1-style program can certainly give different outputs for some hypothetical outputs of S (for the given inputs). However, both programs' payout-outputs will be the same for whatever turns out to be the actual output of S (for the given inputs). P and P1 have the same causal structure. And they have the same output with regard to (whatever is) the actual output of S (for the given inputs). But P and P1 differ counterfactually as to what the payout-output would be if the output of S (for the given inputs) were different than whatever it actually is. So I guess you could say that what's unspecified are the counterfactual consequences of a hypothetical decision, given the (fully specified) physical structure of the scenario. But figuring out the counterfactual consequences of a decision is the main thing that the decision theory itself is supposed to do for us; that's what the whole Newcomb/Prisoner controversy boils down to. So I think it's the solution that's underspecified here, not the problem itself. We need a theory that takes the physical structure of the scenario as input, and generates counterfactual consequences (of hypothetical decisions) as outputs. PS: To make P and P1 fully comparable, drop the "E*1e9" terms in P, so that both programs model the conventional transparent-boxes problem without an extraneous pi-preference payout.
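To illustrate the difference, here is a minimal sketch of the two formulations with the pi-preference term dropped. It assumes the predictor simply runs whatever function it is handed, and it fixes the digit by hand instead of computing Pi(i); the `payout` helper and the strategy names are hypothetical.

```python
FULL = "box contains $1M"
EMPTY = "box is empty"


def make_S(on_full, on_empty):
    # Returns 1 for one-boxing and 0 for two-boxing.
    def S(i, observation):
        return on_full if observation == FULL else on_empty
    return S


def payout(chooser, predicted, digit, i=0):
    # Omega simulates `predicted`; `chooser` makes the real choice.
    E = int(digit == 0)
    D = predicted(i, FULL)  # assumption: the predictor just runs the function
    if D ^ E:
        C = chooser(i, FULL)
        return 1001000 - C * 1000
    C = chooser(i, EMPTY)
    return 1000 - C * 1000


def P(S, digit):
    # P-style: the predictor's input varies together with S.
    return payout(S, S, digit)


def P1(S, S1, digit):
    # P1-style: the predictor's input is the fixed constant S1.
    return payout(S, S1, digit)


actual = make_S(1, 0)          # one-box on full, two-box on empty
hypothetical = make_S(0, 0)    # always two-box
digit = 3                      # any nonzero digit, so E is False

print("actual:        ", P(actual, digit), P1(actual, actual, digit))
print("counterfactual:", P(hypothetical, digit), P1(hypothetical, actual, digit))
```

With S1 equal to the actual S, the two programs agree on the actual payout ($1,000,000). They disagree on the hypothetical two-boxing output: the P-style program yields $1,000, while the P1-style program yields $1,001,000, which is why the latter recommends two-boxing.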

0jimrandomh15y

This conversation is a bit confused. Looking back, P and P1 aren't the same at all; P1 corresponds to the case where Omega never asks you for any decision at all! If S must be equal to S1 and S1 is part of the world program, then S must be part of the world program, too, not chosen by the player. If choosing an S such that S!=S1 is allowed, then it corresponds to the case where Omega simulates someone else (not specified). The root of the confusion seems to be that Wei Dai wrote "def P(i): ...", when he should have written "def P(S): ...", since S is what the player gets to control. I'm not sure where making i a parameter to P came from, since the English description of the problem had i as part of the world-program, not a parameter to it.
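To make the P versus P1 distinction concrete, here is a minimal Python sketch. It is a hypothetical reconstruction, not Wei Dai's original programs: the names `world_p`, `world_p1`, `one_boxer`, `two_boxer` and the payout amounts are assumptions, and the pi-dependent term is dropped so that both worlds model the conventional transparent-boxes problem. In the P-style world the predictor simulates whatever S the player supplies; in the P1-style world it simulates a fixed S1.

```python
# Hypothetical sketch; all names and amounts here are illustrative assumptions.
ONE_BOX, TWO_BOX = "one-box", "two-box"
FULL, EMPTY = "box contains $1M", "box is empty"


def one_boxer(observation):
    return ONE_BOX


def two_boxer(observation):
    return TWO_BOX


S1 = one_boxer  # the strategy hard-coded into the P1-style world


def payout(box_full, choice):
    big = 1_000_000 if box_full else 0
    # Two-boxing adds the $1,000 from the small box.
    return big if choice == ONE_BOX else big + 1_000


def world_p(S):
    # P-style: the predictor simulates the same S that the player controls.
    box_full = S(FULL) == ONE_BOX
    return payout(box_full, S(FULL if box_full else EMPTY))


def world_p1(S):
    # P1-style: the predictor simulates the fixed S1, whatever S is.
    box_full = S1(FULL) == ONE_BOX
    return payout(box_full, S(FULL if box_full else EMPTY))
```

With S equal to S1 the two worlds agree; with a different S they come apart, which is the counterfactual difference being argued about: `world_p(two_boxer)` pays out the small box only, while `world_p1(two_boxer)` pays out both boxes because the predictor was simulating someone else.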

[-]MrHen15y20

TDT is Timeless Decision Theory. It wouldn't be bad to say that in the first paragraph somewhere.

EDIT: Excellent. Thanks.

2Gary_Drescher15y

Done.

3Cyan15y

Can you fix the font size issue too?

0Gary_Drescher15y

Hm, sorry, it's displaying for me in the same size as the rest of the site, so I'm not sure what you're seeing. I'll strip the formatting and see if that helps.

0Cyan15y

For me, the text within "You treat your choice... probability distributions over D" and "If that's what TDT... the specified digit is zero" show up in 7.5 point font.

1Gary_Drescher15y

Better now?

0thomblake15y

That fixed it

0[anonymous]15y

Ugh. I removed the formatting, and now it displays for me with large vertical gaps between the paragraphs.

[-]lukstafi14y10

I suggest adding a link to this discussion to the TDT wiki entry.

[-]Violet15y10

So let's say I'm confronted with this scenario, and I see $1M in the large box.

So let's get the facts:
1) There is $1M in the large box and thus (D xor E)=true
2) I know that I am a one-boxing agent
3) Thus D="one boxing"
4) Thus I know D/=E since the xor is true
5) I one-box and live happily with $1,000,000

When Omega simulates me with the same scenario and without lying there is no problem.

Seems like much of the mindgames are hindered by simply precommitting to choices.

For the red-and-green just toss a coin (or whatever choice of randomness you have).

[-]Roko15y10

We could make an ad-hoc repair to TDT by saying that you're not allowed to infer from a logical fact to another logical fact going via a physical (empirical) fact.

In this case, the mistake happened because we went from "My decision algorithm's output" (Logical) to "Money in box" (Physical) to "Digits of Pi" (Logical), where the last step involved following an arrow on a causal graph backwards: the "Digits of Pi" node has a causal arrow going into the "Money in box" node.

The TDT dependency inference could be implemented by... (read more)

3Joanna Morningstar15y

This ad-hoc fix breaks as soon as Omega makes a slightly messier game, wherein you receive a physical clue as to a computation output, and this computation and your decision determine your reward. Suppose that for any output of the computation there is a unique best decision, and that furthermore the (computation output, predicted decision) pairs are mapped to distinct physical clues. Then given the clue you can infer what decision to make and the logical computation, but this requires that you infer from a logical fact (the predictor of you) to the physical state to the clue to the logical fact of the computation.

0Roko15y

Can you provide a concrete example? (because I think that a series of fix-example-fix ... cases might get us to the right answer)

1Joanna Morningstar15y

The game is to pick a box numbered from 0 to 2; there is a hidden logical computation E yielding another value 0 to 2. Omega has a perfect predictor D of you. You choose C. The payout is 10^((E+C) mod 3), and there is a display showing the value of F = (E-D) mod 3.

If F = 0, then:

* D = 0 implies E = 0 implies optimal play is C = 2; contradiction
* D = 1 implies E = 1 implies optimal play is C = 1; no contradiction
* D = 2 implies E = 2 implies optimal play is C = 0; contradiction

And similarly for F = 1 and F = 2: play C = (F+1) mod 3 as the only stable solution (which nets you 100 per play).

If you're not allowed to infer anything about E from F, then you're faced with a random pick from winning 1, 10 or 100, and can't do any better...
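The case analysis for this game can be checked by brute force. The following is a minimal Python sketch; the function names `payout`, `best_choice` and `stable_solutions` are illustrative assumptions, while the payout rule and the definition of F are taken from the description of the game. For each displayed F it tries every prediction D, derives E from F = (E-D) mod 3, and keeps only the cases where the payout-maximizing choice C equals the prediction.

```python
def payout(E, C):
    # Payout rule from the game description: 10^((E + C) mod 3).
    return 10 ** ((E + C) % 3)


def best_choice(E):
    # The unique box that maximizes the payout for a given E.
    return max(range(3), key=lambda C: payout(E, C))


def stable_solutions(F):
    """Return all (D, E, C) consistent with display F in which D predicts C."""
    solutions = []
    for D in range(3):
        E = (F + D) % 3          # rearranged from F = (E - D) mod 3
        C = best_choice(E)
        if C == D:               # the predictor must turn out to be correct
            solutions.append((D, E, C))
    return solutions


for F in range(3):
    for D, E, C in stable_solutions(F):
        print(f"F={F}: D={D}, E={E}, C={C}, payout={payout(E, C)}")
```

Under these assumptions each F has exactly one self-consistent solution, with C = (F+1) mod 3 and a payout of 100.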

0Wei Dai15y

I'm not sure this game is well defined. What value of F does the predictor D see? (That is, it's predicting your choice after seeing what value of F?)

0Joanna Morningstar15y

The same one that you're currently seeing; for all values of E there is a value of F such that this is consistent, ie that D has actually predicted you in the scenario you currently find yourself in.

2Vladimir_Nesov15y

The logical/physical distinction itself can be seen as ad-hoc: you can consider the whole set-up Q as a program that is known to you (R), because the rules of the game were explained, and also consider yourself (R) as a program known to Q. Then, Q can reason about R in interaction with various situations (that is, run, given R, but R is given as part of Q, so "given R" doesn't say anything about Q), and R can do the same with Q (and with the R within that Q, etc.). Prisoner's dilemma can also be represented in this way, even though nobody is pulling Omega in that case. When R is considering "the past", it in fact considers facts about Q, which is known to R, and so facts about the past can be treated as "logical facts". Similarly, when these facts within Q reach R at present and interact with it, they are no more "physical facts" than anything else in this setting (these interactions with R "directed from the past" can be seen as what R predicts Q-within-R-within-Q-... to do with R-within-Q-within-R-...).

[-]Will_Sawin14y00

Does ADT solve this particular issue?

[-]Morendil15y00

Consider the following version of the transparent-boxes scenario.

I'm trying to get a grip on what this post is about, but I don't know enough of the literature about Newcomb's Problem to be sure what is referred to here by "the transparent-boxes scenario". Can someone who knows briefly recap the baseline scenario of which this is a version?

[-]Tyrrell_McAllister15y00

I have a question that is probably stupid and/or already discussed in the comments. But I don't have time to read all the comments, so, if someone nonetheless would kindly explain why I'm confused, I would be grateful.

The OP writes

So E does indeed "depend on" C, in the particular sense you've specified. Thus, if I happen to have a strong enough preference that E output True, then TDT (as currently formulated) will tell me to two-box for the sake of that goal. But that's the wrong decision, of course. In reality, I have no choice about the spec

... (read more)

[-]wedrifid15y00

Let:

M be 'There is $1M in the big box'

When:

D(M) = true, D(!M) = true, E = true

Omega fails.

D(M) = true, D(!M) = true, E = false

Omega chooses M or !M. I get $1M or 0.

D(M) = true, D(!M) = false, E = true

Omega chooses M=false. I get $0.1.

D(M) = true, D(!M) = false, E = false

Omega chooses M=true. I get $1M.

D(M) = false, D(!M) = false, E = true

Omega chooses either M or !M. I get either $1.1 or $0.1 depending on Omega's whims.

D(M) = false, D(!M) = false, E = false

Omega has no option. I make Omega look like a fool.

So, depending on how 'Omega ... (read more)

1Gary_Drescher15y

No, but it seems that way because I neglected in my OP to supply some key details of the transparent-boxes scenario. See my new edit at the end of the OP.

0wedrifid15y

So, with those details, that resolves to "I get $0". This makes D(M) = !M the unambiguous 'correct' decision function.

[-]SarahNibs15y00

First thought: We can get out of this dilemma by noting that the output of C also causes the predictor to choose a suitable i, so that saying we cause the ith digit of pi to have a certain value is glossing over the fact that we actually caused the i[C]th digit of pi to have a certain value.

0Tyrrell_McAllister15y

How's that? Any i that is sufficiently large is suitable. It doesn't depend on the output of C. It just needs to be beyond C's ability to learn anything beyond the ignorance prior regarding the i-th digit of π.

0SarahNibs15y

I've finally figured out where my intuition on that was coming from (and I don't think it saves TDT). Suppose for a moment you were omniscient except about the relative integrals Vk (1) over measures of the components of the wavefunction which

* had a predictor that chose an i such that pi[i] = k
* would evolve into components with a you (2) where the predictor would present the boxes, question, etc. to you, but would not tell you its choice of i.

Here my ignorance prior on pi[x] for large values of x happens to be approximately equivalent to your ignorance prior over a certain ratio of integrals (relative "sum" of measures of relevant components). When you implement C = one-box, you choose that the relative sum of measures of you that gets $0, $1,000, $1,000,000, and $1,001,000 is (3):

* $0: 0
* $1,000: V0
* $1,000,000: (1-V0)
* $1,001,000: 0

whereas when you implement C = two-box, you get

* $0: 0
* $1,000: (1-V0)
* $1,000,000: 0
* $1,001,000: V0

If your preferences over wavefunctions happen to include a convenient part that tries to maximize the expected integral of dollars you[k] gets times measure of you[k], you probably one-box here, just like me. And now for you it's much more like you're choosing to have the predictor pick a sweet i 9/10 of the time.

(1) By relative integral I mean that instead of Wk, you know Vk = Wk/(W0+W1+...+W9).
(2) Something is a you when it has the same preferences over solutions to the wavefunction as you and implements the same decision theory as you, whatever precisely that means.
(3) This bit only works because the measure we're using, the square of the modulus of the amplitude, is preserved under time-evolution.

Some related questions and possible answers below.
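As a worked example of the two measure distributions above, here is a small Python sketch that computes the measure-weighted dollars for each choice. The function name `expected_dollars` is illustrative, and the value V0 = 1/10 is an assumption (ten equally weighted digits), chosen to match the "9/10 of the time" remark.

```python
from fractions import Fraction


def expected_dollars(choice, v0):
    """Measure-weighted dollars, using the distributions listed above."""
    if choice == "one-box":
        dist = {0: 0, 1_000: v0, 1_000_000: 1 - v0, 1_001_000: 0}
    else:  # "two-box"
        dist = {0: 0, 1_000: 1 - v0, 1_000_000: 0, 1_001_000: v0}
    return sum(dollars * measure for dollars, measure in dist.items())


v0 = Fraction(1, 10)  # assumed: each digit 0..9 carries equal relative measure
print(expected_dollars("one-box", v0))  # 900100
print(expected_dollars("two-box", v0))  # 101000
```

Under that assumption one-boxing comes out far ahead, which is the sense in which a preference for maximizing measure-weighted dollars leads to one-boxing here.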

0SarahNibs15y

I wonder if that sort of transform is in general useful? Changing your logical uncertainty into an equivalent uncertainty about measure.

For the calculator problem you'd say you knew exactly the answer to all multiplication problems, you just didn't know what the calculators had been programmed to calculate. So when you saw the answer 56,088 on your Mars calculator, you'd immediately know that your Venus calculator was flashing 56,088 as well (barring asteroids, etc.). This information does not travel faster than light: if someone typed 123x456 on your Mars calculator while someone else typed 123x456 on your Venus calculator, you would not know that they were both flashing 56,088; you'd have to wait until you learned that they both typed the same input. Or if you told someone to think of an input, then tell someone else who would go to Venus and type it in there, you'd still have to wait for them to get to Venus (which they can do at light speed, why not).

How about whether P=NP, then? No matter what, once you saw 56,088 on Mars you'd know the correct answer to "what's on the Venus calculator?" But before you saw it, your estimate of the probability "56,088 is on the Venus calculator" would depend on how you transformed the problem. Maybe you knew they'd type 123x45?, so your probability was 1/10. Maybe you knew they'd type 123x???, so your probability was 1/1000. Maybe you had no idea so you had a sort of a complete ignorance prior.

I think this transform comes down to choosing appropriate reference classes for your logical uncertainty.
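The 1/10 and 1/1000 figures can be reproduced by counting. This is a minimal Python sketch, assuming a uniform prior over each unknown digit (an assumption; the comment does not fix a prior); the function name `prob_display` and the "?" template notation for unknown digits are illustrative.

```python
from fractions import Fraction
from itertools import product


def prob_display(template, left=123, shown=56088):
    """Probability that left x (template) equals shown, with each '?' a uniform digit."""
    unknown = template.count("?")
    hits = total = 0
    for digits in product("0123456789", repeat=unknown):
        filler = iter(digits)
        # Fill each '?' with the next candidate digit, keep known digits as-is.
        right = int("".join(next(filler) if ch == "?" else ch for ch in template))
        total += 1
        hits += (left * right == shown)
    return Fraction(hits, total)


print(prob_display("45?"))  # 1/10
print(prob_display("???"))  # 1/1000
print(prob_display("456"))  # 1
```

The reference class (which digits are treated as unknown) is the only thing that changes between the three calls.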

0SarahNibs15y

Why would you or I have such a preference that cares about my ancestor's time-evolved descendants rather than just my time-evolved descendants? My guess is that

* a human's preferences are (fairly) stable under time-evolution, and
* the only humans that survive are the ones that care about their descendants, and
* humans that we see around us are the time-evolution of similar humans.

So e.g. I[now] care approximately about what I[5-minutes-ago] cared about, and I[5-minutes-ago] didn't just care about me[now], he also cared about me[now-but-in-a-parallel-branch].

[-]rwallace15y-10

In the setup in question, D goes into an infinite loop (since in the general case it must call a copy of C, but because the box is transparent, C takes as input the output of D).

In Eliezer's similar red/green problem, if the simulation is fully deterministic and the initial conditions are the same, then the simulator must be lying, because he must've told the same thing to the first instance, at a time when there had been no previous copy. (If those conditions do not hold, then the solution is to just flip a coin and take your 50-50 chance.)

Are these still problems when you change them to fix the inconsistencies?

0Gary_Drescher15y

No, because by stipulation here, D only simulates the hypothetical case in which the box contains $1M, which does not necessarily correspond to the output of D (see my earlier reply to JGWeissman: http://lesswrong.com/lw/1qo/a_problem_with_timeless_decision_theory_tdt/1kpk).
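A minimal Python sketch of why the stipulation avoids the regress. The function names and the example agent are hypothetical, and the rule that the box is filled iff (D xor E) is taken from elsewhere in this thread. The predictor D calls the agent only on the fixed hypothetical observation "box contains $1M", so it never needs its own output as input.

```python
FULL, EMPTY = "box contains $1M", "box is empty"


def C(observation):
    # Example agent (an assumption): one-boxes when it sees $1M, else two-boxes.
    return "one-box" if observation == FULL else "two-box"


def D(agent):
    # Simulates only the hypothetical case in which the box contains $1M.
    # It does not call D again, so there is no infinite loop.
    return agent(FULL) == "one-box"


def run_scenario(agent, E):
    d = D(agent)
    box_full = d != E                      # box is filled iff (D xor E)
    observation = FULL if box_full else EMPTY
    return box_full, agent(observation)    # the agent's actual choice


print(run_scenario(C, E=False))  # (True, 'one-box')
print(run_scenario(C, E=True))   # (False, 'two-box')
```

The second call shows the point of the stipulation: the hypothetical that D simulated (a full box) need not match what the agent actually sees.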
