tslarm comments on Causal decision theory is unsatisfactory - LessWrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (158)
I don't think you've misunderstood; in fact I share your position.
Do you also reject compatibilist accounts of free will? I think the basic point at issue here is whether or not a fully determined action can be genuinely 'chosen', any more than the past events that determine it.
The set of assumptions that undermines CDT also ensures that the decision process is nothing more than the deterministic consequence (give or take some irrelevant randomness) of an earlier state of the world + physical law. The 'agent' is a fully determined cog in a causally closed system.
In the same-source-code-PD, at the beginning of the decision process each agent knows that the end result will either be mutual cooperation or mutual defection, and also that the following propositions must either be all true or all false:
The agent wants Proposition 4 -- and therefore all of the other propositions -- to be true.
Since all of the propositions are known to share the same truth value, choosing to make Proposition 3 true is equivalent to choosing to make all four propositions true -- including the two that refer to past events (Propositions 1 and 2). So either the agent can choose the truth value of propositions about the past, or else Proposition 3 is not really under the agent's control.
I'd be interested to know whether those who disagree with me/us see a logical error above, or simply have a concept of choice/agency/free will/control that renders the previous paragraph either false or unproblematic (presumably because it allows you to single out Proposition 3 as uniquely under the agent's control, or it isn't so fussy about temporal order). If the latter, is this ultimately a semantic dispute? (I suspect that some will half-agree with that, but add that the incompatibilist notion of free will is at best empirically false and at worst incoherent. I think the charge of incoherence is false and the charge of empirical falsity is unproven, but I won't go into that now.)
In any case, responses would be appreciated. (And if you think I'm completely mistaken or confused, please bear in mind that I made a genuine attempt to explain my position clearly!)
Decision theories are algorithms. Free will really doesn't have much to do with them. A deterministic agent must still actually use some algorithm or another to map from sensory inputs and learned knowledge to effector outputs.
Thank you - this is exactly the point that I was trying to make, just stated much more articulately. I too would much appreciate responses to this, it would help me resolve some deep confusion about why very smart LessWrongers disagree with me about something important.
Hi, sorry, I think I misunderstood what you were unhappy with. I have not fully internalized what is happening, but my understanding (Nate please correct me if I am wrong):
We can have a "causal graph" (which will include exotic causal rules, like Omega deciding based on your source code), and in this "causal graph" we have places where our source code should go, possibly in multiple places. These places have inputs and outputs, and based on these inputs and outputs we can choose any function we like from a (let's say finite for simplicity) set. We are choosing what goes into those places based on some optimality criterion.
But you might ask, isn't that really weird, your own source code is what is doing the choosing, how can you be choosing it? (Is that what you mean by "incoherence due to recursion?") The proposed way out is this.
What we are actually choosing is from a set of possible source codes that embeds, via quining, the entire problem and varies only in how inputs/outputs are mapped in the right places. Due to the theorem in the link it is always possible to set it up in this way.
Note: the above does not contain the string "free will," I am just trying to output my incomplete understanding of how the setup is mathematically coherent. I don't think there is any "free will" in the g-formula either.
I realize the above is still not very clear (even to myself). So here is another attempt.
The key idea seems to be using Kleene's theorem to reason about own source, while further having the ability to rewrite parts of own source code based on conclusions drawn. In this particular case, "choice" is defined as
thinking about own source code with the "causal graph" (with holes for your source code) embedded.
maximizing expected utility, and realizing it would be optimal if [a particular rule mapping inputs to outputs] should go into all the holes in the "causal graph" where our algorithm should go.
rewriting our own source such that everything stays the same, except in our representation of the "causal graph" we just do the optimal thing rather than redo the analysis again.
(???)
This entire complication goes away if you assume the problem set up just chooses input/output mappings, similarly to how we are choosing optimal treatment regimes in causal inference, without assuming that those mappings represent your own source code.
Thank you - Yes, that is what I meant by recursion, and your second attempt seems to go in the right direction to answer my concerns, but I'll have to think about this for a while to see if I can be convinced.
As for the G-Formula: I don't think there is free will in it either, just that in contrast with UDT/TDT, it is not inconsistent with my concept of free will.
As an interesting anecdote, I am a Teaching Assistant for Jamie (who came up with the G-Formula), so I have heard him lecture on it several times now. The last couple of years he brought up experiments that seemingly provide evidence against free will and promised to discuss the implications for his theory. Unfortunately, both years he ran out of time before he got around to it. I should ask him the next time I meet him.
re: free will, this is one area where Jamie and Judea agree, it seems.
I think one thing I personally (and probably others) find very confusing about UDT is reconciling the picture to be consistent with temporal constraints on causality. Nothing should create physical retro causality. Here is my posited sequence of events.
step 1. You are an agent with source code read/write access. You suspect there will be (in the future) Omegas in your environment, posing tricky problems. At this point (step 1), you realize you should "preprocess" your own source code in such a way as to maximize expected utility in such problems.
That is, for all causal graphs (possibly with Omega causal pathways), you find where nodes for [my source code goes here] are, and you "pick the optimal treatment regime".
step 2. Omega scans your source code, and puts things in boxes based on examining this code, or simulating it.
step 3. Omega gives you boxes, with stuff already in them. You already preprocessed what to do, so you one box immediately and walk away with a million.
Given that Omega can scan your source, and given that you can credibly rewrite own decision making source code, there is nothing exotic in this sequence of steps, in particular there is no retro causality anywhere. It is just that there are some constraints (what people here call "logical counterfactuals") that force the output of Omega's sim at step 2, and your output at step 3 to coincide. This constraint is what lead you to preprocess to one box in step 1, by drawing an "exotic causal graph" with Omega's sim creating an additional causal link that seemingly violates "no retro causality."
The "decision" is in step 1. Had you counterfactually decided to not preprocess there, or preprocess to something else, you would walk away poorer in step 3. There is this element to UDT of "preprocessing decisions in advance." It seems all real choice, that is examining of alternatives and picking one wisely, happens there.
(???)
This is closer to describing the self-modifying CDT approach. One of the motivations for development of TDT and UDT is that you don't necessarily get an opportunity to do such self-modification beforehand, let alone to compute the optimal decisions for all possible scenarios you think might occur.
So the idea of UDT is that the design of the code should already suffice to guarantee that if you end up in a newcomblike situation you behave "as if" you did have the opportunity to do whatever precommitment would have been useful. When prompted for a decision, UDT asks "what is the (fixed) optimal conditional strategy" and outputs the result of applying that strategy to its current state of knowledge.
Basically this, except there's no need to actually do it beforehand.
If you like, you can consider the UDT agent's code itself to be the output of such "preprocessing"... except that there is no real pre-computation required, apart from giving the UDT agent a realistic prior.
Actually, no. To implement things correctly, UDT needs to determine its entire strategy all at once. It cannot decide whether to one-box or two-box in Newcomb just by considering the Newcomb that it is currently dealing with. It must also consider all possible hypothetical scenarios where any other agent's action depends on whether or not UDT one-boxes.
Furthermore, UDT cannot decide what it does in Newcomb independently of what it does in the Counterfactual Mugging, because some hypothetical entity might give it rewards based on some combination of the two behaviors. UDT needs to compute its entire strategy (i.e. it's response to all possible scenarios) all at the same time before it can determine what it should do in any particular situation [OK. Not quite true. It might be able to prove that whatever the optimal strategy is it involves doing X in situation Y without actually determining the optimal strategy. Then again, this seems really hard since doing almost anything directly from Kolmogorov priors is basically impossible].
Conceptually, yes. The point is that you don't need to actually literally explicitly compute your entire strategy at
t=-∞. All you have to do is prove a particular property of the strategy (namely, its action in situation Y) at the time when you are asked for a decision.Obviously, like every computational activity ever, you must still make approximations, because it is usually infeasible to make inferences over the entire tegmark-IV multiverse when you need to make a decision. An example of such approximations would be neglecting the measure of "entities that give it rewards based on some combination of [newcomb's and counterfactual mugging]" in many situations because I expect such things to be rare (significantly rarer than newcomb's and counterfactual mugging themselves).
Consider the set of all possible source codes an agent might have. This set is partitioned in two: those on which Omega rewards you (where you both one box, and your source code is such that Omega can tell you will), and those on which Omega punishes you (all others). Call the former set A, and the latter set B.
Agents are not guaranteed to start with a source code in set A, some start with source code in B. (Maybe they are classically trained causal decision theorists? Maybe they are skeptical about UDT? Maybe their programmers were careless? Who knows!) The point is, there comes a time in an agent's life when it needs to grow up and move its source code to set A. Maybe it does not immediately self-modify to directly do UDTish things on Newcomb-like causal graphs, maybe it self-modifies to self-modify before being asked to one box.
But it is crucial for the agent to move itself from set B to set A at some point before Omega shows up. This is what I mean by step 1.
I think what confuses me is that if we want the logical connections to hold (between my decision and the decision of another agent running the same source code), it is necessary that when he preprocesses his source code he will deterministically make the same choice as me. Which means that my decision about how to preprocess has already been made by some deeper part of my source code
My understanding of the statement of Newcomb is that Omega puts things on boxes only based on what your source code says you will do when faced with input that looks like the Newcomb's problem. Since the agent already preprocessed the source code (possibly using other complicated bits of its own source code) to one box on Newcomb, Omega will populate boxes based on that. Omega does not need to deal with any other part of the agent's source code, including some unspecified complicated part that dealt with preprocessing and rewriting, except to prove to itself that one boxing will happen.
All that matters is that the code currently has the property that IF it sees the Newcomb input, THEN it will one box.
Omega that examines the agent's code before the agent preprocessed will also put a million dollars in, if it can prove the agent will self-modify to one-box before choosing the box.
Phrasing it in terms of source code makes it more obvious that this is equivalent to expecting Omega to be able to solve the halting problem.
This is fighting the hypothetical, Omega can say it will only put a million in if it can find a proof of you one boxing quickly enough.
If Omega only puts the million in if it finds a proof fast enough, it is then possible that you will one-box and not get the million.
(And saying "there isn't any such Omega" may be fighting the hypothetical. Saying there can't in principle be such an Omega is not.)
I think some that favor CDT would claim that you are are phrasing the counterfactual incorrectly. You are phrasing the situation as "you are playing against a copy of yourself" rather than "you are playing against an agent running code X (which just happens to be the same as yours) and thinks you are also running code X". If X=CDT, then TDT and CDT each achieve the result DD. If X=TDT, then TDT achieves CC, but CDT achieves DC.
In other words TDT does beat CDT in the self matchup. But one could argue that self matchup against TDT and self matchup against CDT are different scenarios, and thus should not be compared.